Watson Rock, Paper, Scissors

A simple hands-on activity to let kids train a machine learning classifier to be able to play Rock, Paper, Scissors.


I’ve written and spoken before that I think we should do more to introduce children to the idea of machine learning. And I’ve tried introducing my two kids to it, such as by making a Code Club-style game with them: we built a system to play Guess Who, that they trained both to understand what you say and to recognise the characteristics of faces from photos.

This weekend, we tried out another idea – Rock, Paper, Scissors from a web app, using the web cam to see your moves, and training a system to recognise your hand signs.


In many ways, this was simpler than the Guess Who project, and one I would’ve tried before if we’d thought of it! I’ve made getting the system to choose its next move very simple: it just picks one of the three options at random.
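(For what it’s worth, the move-choosing code is nothing cleverer than a random pick – roughly this, give or take variable names:)

var moves = [ 'rock', 'paper', 'scissors' ];
var watsonsMove = moves[ Math.floor(Math.random() * moves.length) ];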

The machine learning element comes from the fact that I got them to train a custom image classifier to recognise what a ‘rock’, ‘paper’ and ‘scissors’ hand sign looks like.


Last night I hacked together a quick single-page training web app for them to use. Unlike with the Guess Who game, where we worked on it together to come up with the project, this time I made it myself to see how they’d get on with using it. (I was thinking that if it went well, I’d try using it with one of my school groups).

They got off to a good start… although I hadn’t counted on how much time Grace would spend checking out her hair once she saw the web cam video. 🙂


Video of the kids getting started

I’ve got it so that you can take photos from the web cam by clicking on a button, and put the photo into one of three training groups – one each for rock, paper and scissors.


My hope was that the three hand signs – a fist, a flat palm, and two fingers – would be distinct enough that an image classifier could quickly learn to distinguish between them.

I should probably add an overlay to the live video suggesting where to put your hand, how close to hold it to the camera, and so on, as their initial attempts were fairly inconsistent.


Video of their first attempt at training

As before, I’m using one of the Watson developer APIs available in Bluemix, called Visual Recognition.

To train it, you just need to upload zip files, where each zip file contains example photos of something you want it to recognise.

In this case, we want to upload three zip files – one of photos of “rock”, one of photos of “paper”, and one of photos of “scissors”. And you can do all of that in a single HTTP POST, so the code behind the training app is very simple.
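To give an idea of what that looks like, here’s a rough Node.js sketch of that single request using the request module. The endpoint, version date and “_positive_examples” field names are from the beta Visual Recognition API as I remember it, so treat them as assumptions to double-check against the docs rather than gospel:

var fs = require('fs');
var request = require('request');

request.post({
    url : 'https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers',
    qs : {
        api_key : process.env.VR_API_KEY,   // your own Visual Recognition key
        version : '2016-05-20'
    },
    formData : {
        name : 'rockpaperscissors',
        // one zip of example photos for each hand sign
        rock_positive_examples     : fs.createReadStream('rock.zip'),
        paper_positive_examples    : fs.createReadStream('paper.zip'),
        scissors_positive_examples : fs.createReadStream('scissors.zip')
    }
}, function (err, resp, body) {
    // the response describes the new classifier, including its classifier_id
    console.log(err || body);
});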


That said, I should still share the code.

It was hacked together in an evening, so it’s a complete mess. But I’ll try and find some spare time this week to tidy it up, and put the code somewhere in case it’s helpful. Some of the code for driving the webcam, zipping up the photos, and uploading them, might be useful to someone.
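For example, the webcam part in the browser boils down to a video element and a canvas – a minimal sketch (with made-up element ids, not the exact code from the app):

// wire the webcam up to a <video> element on the page
navigator.mediaDevices.getUserMedia({ video : true }).then(function (stream) {
    var video = document.getElementById('webcam');
    video.srcObject = stream;
    video.play();
});

// when the button is clicked, copy the current frame onto a canvas
function takePhoto() {
    var video = document.getElementById('webcam');
    var canvas = document.getElementById('snapshot');
    canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
    return canvas.toDataURL('image/jpeg');   // base64 image data, ready to zip up and upload
}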


The first test

To test it, I made a simple game. I’ve got it keeping score to see how many moves you win against the random choices the game makes. Although Grace seemed more interested in counting the number of times the game correctly identified our moves. It wasn’t perfect!

(The “Watson’s move” images used in the game bit were made by Faith. She took photos of her own hand using an iPad, and did the weird grid background effect using PopAGraph. It wasn’t quite what I expected, but I think it looks kinda neat.)


As with the Guess Who activity, what most interested me was the girls’ reactions to the behaviour of the ML system. I was surprised that even after last time, their initial assumption was still that they could take one photo of each, and that would be enough. (Another reason why it’s worth doing a few of these activities to reinforce it).

But with a little nudging, they quickly saw that the more examples they gave it, the better the system performed.

Other than that, a lot of the lessons they learned were reminders of what we’d talked about before: what it’s like to do supervised learning projects.


The second attempt… another test after more training this time

One thing I was impressed with was when they decided that the training would be more effective if they used an actual rock, actual scissors, and a sheet of paper. That made the accuracy massively better. It makes sense, as the three objects are much more distinct from each other than photos of different hand shapes are, so that was a neat idea!

And that’s pretty much it. My second experiment at getting the kids to play with machine learning seemed to work pretty well.

I’ve put the app I made for them up on Bluemix, so you’re welcome to give it a try if you like. It’s at https://watson-rock-paper-scissors.eu-gb.mybluemix.net.

You’ll have to get your own Visual Recognition API key to use it, but the service has a free trial, so hopefully that won’t put you off!


Third time lucky? Another test


Update 1:

A couple of friends pointed me at another recent IBM rock, paper, scissors project when I mentioned planning to do this.

It looks like a great project and is well worth a look – they’ve gone into using Apache Spark to create a system able to look for patterns in how people play rock-paper-scissors and learn strategies to win the game.

More recently, they’ve even gotten a NAO robot to play the game for them!

I wasn’t aware of this work before, but decided to go ahead with our project anyway. Partly because my focus was different for this, but mostly because I think that the Agonies of Parallel Creation should never stop us from creating and sharing stuff. There’s no new idea under the sun, and if we waited for an idea that no-one has ever got close to, we’d never create anything.

That said, it’s a particularly surreal coincidence in this case, as this is not only an IBM project, but the author of that post is the great David Taieb, who I used to work for when he worked on Watson! Small world.


Update 2:

Yes, I know you can see my API key in the video. But don’t worry. Bluemix makes it easy enough to revoke credentials so I’ve deleted that API key. Doing that was much quicker than learning enough video editing to be able to mask it out. 🙂



An introduction to machine learning with Guess Who

I tried introducing my two kids to machine learning by helping them make a game this week.

In this post, I’ll try and explain why, how we did it, and how it went. And if you make it all the way to the end, I’ve got some videos and a link to a demo to show you what we made.

Why

I think we need to introduce the basic concept of machine learning to children.

I think the current approach of introducing coding using things like Scratch isn’t enough. This isn’t to say Scratch isn’t great (I’ve been running a Code Club every week for the last couple of years, delivered almost entirely using Scratch, so I’d be the last person to say it isn’t a fantastic tool). It lets you snap together blocks representing actions, teaching the programming mindset of getting a computer to do something by breaking the task down into a series of steps.

I think we need to add to this with something that introduces the model of machine learning – getting a computer to do something by training it with examples of doing that task.

I’ve been saying this for a while – I gave a talk about it at an education conference last year, I’ve written about it here before, and it was the theme of a lecture I gave at a science society in London last month.

This week is half-term and I have the week off work, so I thought I’d finally spend a bit of time trying it out by experimenting on my own two daughters (Faith and Grace, who are aged 7 and 11).

In Code Club, I mostly try to introduce programming concepts by helping the kids to create games. Sticking with what seems to work, I’ve helped them to make a game by training an ML system how to play it.

How

We’ve been making a sort of Guess Who? game – a game that I’ve played with the kids for years and that they know inside out.


The idea is that the computer shows a bunch of faces and chooses one of them at random. You have to work out which one it’s chosen by asking Yes/No questions about their appearance.

I introduced this to Grace and Faith mostly using analogies and anthropomorphising.

“We need to teach the computer how to play the game instead of telling it how to”.

“We’ll do it the same way that you teach a small child to do something, not by telling it the series of steps to take but by showing it examples of how it should be done, over and over, until it learns how to do it”.

And so on.

I went with a couple of opportunities for trying out machine learning approaches.

Understanding the question

As a player, you ask questions in natural language – you’ve just got a text box to type in anything you want to ask. In practice though, there’s a reasonably predictable set of questions you might ask – it’s a very constrained domain, so it lends itself well to using a text classifier.

I got the kids to make a list of all the types of questions they ask in a Guess Who game. These were the classes we’d train the system with (e.g. “has brown hair” or “is wearing a hat”). Then for each class, they had to come up with as many ways as they could think of to ask that question.

the kids using Watson NLC

We used this to train IBM Watson Natural Language Classifier.
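To give an idea of the shape of that training data, it ends up as a list of example questions, each labelled with the class it belongs to – the sort of CSV that NLC takes. These examples are made up, but it looked something like:

"Does your person have brown hair?","has brown hair"
"Is their hair brown?","has brown hair"
"Are they wearing a hat?","is wearing a hat"
"Has your person got a hat on?","is wearing a hat"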

I chose it mostly because I’m familiar with it, it’s a hosted readily available service so there was nothing to install or configure, it has a simple web interface for training that the kids could use to enter the classes and texts they thought of, and for this sort of kicking-the-tyres-with-minimal-usage it’s free.

With a little training, they had a trained classifier that would be able to take a question from the player and map it to one of the classes they’d come up with for the questions they ask when they play Guess Who.
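Calling the trained classifier from the game code is then a single REST call. Roughly like this, using the request module – the endpoint and response fields are from memory, so worth checking against the NLC docs, and the classifier id is obviously a placeholder:

var request = require('request');

request.post({
    url : 'https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers/YOUR-CLASSIFIER-ID/classify',
    auth : {
        user : process.env.NLC_USERNAME,   // service credentials from Bluemix
        pass : process.env.NLC_PASSWORD
    },
    json : { text : 'Has your person got brown hair?' }
}, function (err, resp, body) {
    // body.top_class should be the best-matching class, e.g. "has brown hair"
    console.log(err || body.top_class);
});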

Answering the question

The next step was to train the system to be able to answer the questions. We needed it to be able to recognise visual characteristics in a photo. Again, this is a constrained domain, although less so than for classifying questions. But still, this was basically a classifying task, albeit with the extra challenge of using images.

We looked for a collection of face images that we could use, both for training, and for the game itself. We initially started trying out Labeled Wikipedia Faces, but ultimately settled on using Labeled Faces in the Wild – a set of over 13,000 photos of faces.

Grace, preparing the training for Visual Recognition

I got the kids to group examples of the faces based on the characteristics they’d come up with in their list of text classes. We just used the built-in OSX Finder to do this.

We had one folder open on the left with all the unsorted faces in it, and a bunch of folders open to the right – e.g. “short-hair”, “long-hair”, “bald”, “hat”. They scrolled through the unsorted pictures, and trained by dragging each photo across into the relevant folder.

We used an IBM Watson API for this, too: IBM Watson Visual Recognition.

Again, I chose it because I’m familiar with it, it’s a hosted service I didn’t need to install or run, and while it’s in beta it’s free to use.

Putting it all together

While they did that, I hacked together a quick game to drive it. They were in charge of training; I wrote a simple REST API to make the requests to the Watson APIs, and a UI to let the player interact with it.

We had our game!

How it went

I asked them what they’d learned from doing it, and they came up with things like:

“You need a lot of training data”

When I explained how we’d train the visual recognition classifiers to recognise characteristics in photos, Faith said “We could take photos of each other and use those!”. I said that we’d need more than that, so she said “I could take photos of my friends as well!”.

But we needed so many more than that. We started by looking to Wikipedia (always a good starting point for getting data!), before eventually settling on the LFW set.

Getting them to help me find the training data gave them an idea of the sort of scale you need for these sorts of projects, as well as giving them an insight into ideas like gathering existing data rather than trying to manually create it.

“The more training you give it, the better it gets”

We did this project off-and-on throughout half-term week rather than all in one sitting. And we re-ran the training after each go. It was really clear the way that the results improved the more that we did.

I’m a big fan of learning by doing, and I think that seeing that for themselves was more effective than if I’d just told them. Trying it out with virtually no training, and seeing that it’s rubbish. Then trying it with a bit of training, and seeing it start to get better. Then trying it again after a few days and seeing it really improve – there was a definite light-bulb moment.

“Training is fun at first, but can get a bit boring after a while”

This was more an issue with the visual recognition training than the NLC text classifier training. Even after the first afternoon they’d already come up with a classifier that was doing a decent job of most questions. But grouping the photos into the training sets took much longer. And to be honest, it really needs more training than they gave it if you wanted to turn this into a proper demo or game – I let them do enough to get the idea, but stopped before it became a miserable chore!

But this was a positive thing. As a takeaway, that’s a really valid and valuable insight. I’ve seen commercial ML projects fail to recognise this at the outset! Training ML systems is a repetitive, manual and very time-consuming task that does get boring for the people involved.

We talked about ways that people try and get around that, like trying to turn it into a game. Grace thought that if we did this as a school activity, with all her class helping out with the training, then split between 30 kids, it wouldn’t take as long.

All of these points – the amount of time and work needed to train, approaches like getting a large number of people to train, etc. – these are all great lessons about the practicalities of using machine learning.

“It’s a bit like magic”

If you had to describe the steps involved in working out if a photo included a face that has a moustache, that would be pretty complicated. But dividing 100 photos into two groups – faces with a moustache, and faces without a moustache – that’s easy.

There were a bunch of things they talked about here that were really positive – that doing this is easy, and that you don’t need to understand the deep detail behind how it works. With Faith (my youngest) we didn’t go into too much detail about how the training works, but Grace and I did find some videos on YouTube that explained the principles behind deep learning, which were interesting.

More generally, what I wanted them to learn was that computers can spot patterns in data, without you needing to be able to specify rules that they should follow. And I think that’s a more effective lesson as something they saw and did for themselves rather than just reading or hearing about it.

The result


https://youtu.be/oguzXlRT4NQ

I’ve put what we came up with on Bluemix at http://guesswho.eu-gb.mybluemix.net

I’ve also put some more videos below to show us trying it out.

As I explained above, it’s not really “finished” – I think it needs more training, and the UI was hacked together in a hurry. But even as a quick half-term project, I think it already shows a glimmer of what is possible.

More importantly, it shows that you can give kids a hands-on introduction to machine learning.


https://youtu.be/1AkSVBzE5io


https://youtu.be/6ay7MqhW53g


https://youtu.be/Z3UReJzzY4Q


https://youtu.be/GNcz_D62_0w


The skills implications of Cognitive Computing

STEMtech is a conference about the education of science, technology, engineering and maths. The attendees are an interesting mix of people from education and policy makers, as well as people like me from industry.

This year, they invited me to do a talk. My slides are shared but they’ll make no sense by themselves. What follows is roughly what I think I said.

Today I’m going to be talking about how we teach children to use computers. I’m not going to be talking about the current provision for this. I’m not going to be talking about what we necessarily need to be teaching children today, or maybe even tomorrow – partly because that’s already being well covered in the rest of the agenda today.

Instead, I’d like to use this session to think a little further ahead.

Big changes are coming in computing, and I think we should start thinking about how we’ll need to respond to them.

big trak. I loved this thing.

We used to have this at school when I was a kid, and I think I remember those lessons more strongly than any other lesson I did at school.

It had a keypad on the back. We’d program in a series of instructions – drive forwards this much, turn this much, fire the laser!

This was computing to me when I was young. This was robotics. This was cool.

Not all schools used big trak, but a lot of schools used something like this.

Variations of Logo robot turtles have been widely used by many schools for many years, and are essentially the same thing – programming a series of instructions into a robot that follows them by driving around on the floor.

Fast forward to today, and I get to spend my Monday afternoons running a Code Club that I started in my local primary school: an after-school programming group. I get just an hour or two a week to see how kids learn about computers, what they think of them, how they approach them.

We use Scratch in Code Club, just as the school are now doing in lessons. It gives a visual, drag-and-drop way to build a sequence of instructions, which are followed by sprites moving around on the screen. This is how we explain programming and computers to children.

What strikes me doing this is how similar it is to what I was doing as a kid. It’s not exactly the same, but if you squint a bit and stand back a bit, it feels pretty similar. They’re both about showing children how to break an activity down into a series of steps.

And that’s to be expected. Programming itself is very similar today to what it was when I was at school. In some ways programming today is the same as it was five, ten, twenty years ago – it’s still about getting machines to perform tasks by getting people to define the sequence of instructions to follow. It’s only natural that this will be reflected in the way we introduce this to children.

But this is going to change.

We’re starting to see a future of computing that is going to be different, and we’re going to need new metaphors and new ways of introducing it and explaining it.

Computers have changed before.

We group the evolution of computers into eras, and they start in the 1800s in the era of tabulating machines.

I’m talking about systems like punched card machines.

Machines that could count more things, more quickly and more accurately than people possibly could.

Machines that were considered revolutionary because they enabled the US census in 1890 to be analysed before they needed to start the 1900 census.

Machines like the Hollerith tabulating machine were the computers of their day. This was the state of the art in technology.

Teaching computing then would’ve meant teaching about the capabilities of these machines. I don’t just mean the mechanics of feeding the cards into the machine, although that would’ve been part of it. I mean about looking at problems as collections of things that can be counted.

The era of tabulating machines was followed by the era of programmable machines, starting around about the 1940s with machines like Colossus.

Shifting between eras isn’t instant. We didn’t just turn off all the punched card machines one day and start using programmable computers. It was gradual, there was overlap – both in terms of there being a time when we were using both kinds of machine, but also in the way that some early programmable computers included elements of the types of systems that came before. We transitioned over time from one way of thinking about computers to another.

What was different about programmable computers was that you didn’t have to just give the data to a machine to process, you could also give it the instructions that you wanted it to carry out on that data.

When I think of early programmable computers, I think of the machines of the 1950s and 1960s. Huge machines that would fill a room.

But this is still the era we’re in today. Our computers today are faster, and smaller, and more powerful – and better in so many other ways. But fundamentally, architecturally, conceptually – our computers today work to the same principles of these early systems.

We think the third era of computing will be the era of cognitive machines. In the same way that the transition to programmable computers wasn’t instant, neither will this be. So it’s debatable whether we’re already in an era of cognitive machines, whether we’re starting to see early signs of it, or whether it’s something that we’re anticipating. Regardless of the exact date it starts, this is what we think will characterize the next generation of computers.

I should clarify what I mean by cognitive computing, because it’s perhaps not a term that has got mainstream awareness yet.

I’ve got a couple of examples to help me explain it.

What is 2+2?

If I give you the instruction or the question 2+2, what answer would you give me?

I assume that most of you would answer 4. Two plus two equals four.

That’s certainly the answer I’d expect from a programmable system – a system that has been hard-coded with instructions to follow, including instructions for how to handle adding numbers together.

But what if my question was in the context of social sciences, in a discussion about family structures. Then I probably would’ve been talking about the family structure of two adults and two children.

Or in the context of automotive engineering, I probably would’ve been talking about a layout of car seats with two front seats and two back seats.

Or in the context of card games, I might’ve meant a poker hand or a poker strategy.

The response I wanted would have depended on the context that my request was made in. And this is the kind of behaviour that we would expect from a cognitive computer.

A programmable computer needs to be coded with the instructions to follow and the answer to return, and will always return that answer. Programmable computers are deterministic in this way. A calculator will always give me ‘4’, no matter how many times I ask 2+2.

Cognitive computers will be more probabilistic. They’ll likely return a set of possible answers instead of a single answer, each one associated with some level of probability that it’s the right response. And this won’t be fixed, but will take the context of the question into account.

By context, I don’t just mean the situational context. It’s also about a knowledge of the things mentioned.

Consider this example, and what you think it means.

Policeman helps dog bite victim.

You’d probably assume that this means that there is a dog bite victim.

And the policeman helps him.

Policeman helps dog bite victim.

You probably wouldn’t assume that it means that the policeman helped the dog to bite the victim.

But that’s a valid way to parse the sentence.

So parsing alone isn’t all that you do – you also use your knowledge of the things mentioned in the sentence to handle the ambiguity of the English language. You know that the police help people. You know it’d be unusual for a policeman to bite someone.

You use this knowledge to choose between the different possible interpretations of the sentence.

This is also the behaviour we’d expect from a cognitive computer.

Systems that interact with us in our own language. Instead of us having to use machine languages or machine interfaces to work with computers, we’ll work with cognitive computers in our own languages – languages like English.

Systems that are probabilistic rather than deterministic in the way they work and the answers that they return.

Systems that take context into account, and learn how to apply their knowledge to identify the most likely responses.

In 1997, an IBM computer beat the grandmaster Garry Kasparov at a game of chess. We did that as a demonstration of the progress that we’d made in a technical field (in that case, things like massively parallel computing) but also as a way of explaining its potential.

We did something similar to demonstrate and explain the progress that we’re making with cognitive computing, this time by entering a computer into a TV quiz show.

Jeopardy is a TV quiz show in the US – a few contestants, buzzing in when they know the answers to questions, and winning cash prizes. It’s less well-known in the UK, but it’s huge in the US – a show that’s been going since the 1960s.

They ask difficult questions. Complex, sometimes cryptic questions, with a variety of grammatical forms.

In 2011, we entered an IBM computer called Watson as a contestant. This was our first attempt at building a cognitive computer, and we wanted to show how it was different.

Watson went up against Brad Rutter and Ken Jennings – two of the best players to have ever gone on Jeopardy. These guys are household names in the US because of their performance on this show – they were the Garry Kasparovs of the TV quiz show world.

Watson had to compete in the same way any contestant would. This wasn’t a search engine returning hundreds of thousands of possible documents. It needed to understand complex specific questions, and be able to come up with a single specific answer in seconds to be the first to the buzzer.

Some examples of the kinds of questions it got.

This was actually the final question in the show.

It’s talking about “Dracula” – the answer is Bram Stoker.

This is another Jeopardy question, from a round called “Lincoln Blogs”. The answer is “his resignation” – Chase submitted his resignation to President Abraham Lincoln three times.

But you need an understanding of the question, and the contextual knowledge that a resignation is something you submit, in order to get that.

This is a question about Mount Everest – about who was the first person to climb Mount Everest. But it doesn’t say that. It isn’t a single-clause extractive question like “Who was the first person to climb Mount Everest?” which would be much easier to interpret.

Again, answering this question precisely depends on knowing something about George Mallory and what he is known for, and knowing what kind of thing you might be “first” at in this context.

Another Jeopardy question, this time in a round called “Before and After”.

The answer it’s looking for is “A Hard Day’s Night of the Living Dead”

“A Hard Day’s Night” answering the Beatles bit, and “Night of the Living Dead” being the Romero zombie film.

“Before and After” is the clue that they’re looking for something with that overlap between them.

This question is showing its age now, given recent events in Cuba.

But at the time, I think the four countries that the US wasn’t getting along with were Bhutan, Cuba, North Korea and Iran. And the question is asking which of those four countries is furthest north.

Answering this question correctly involves needing to recognise that there is a political element here (which countries does the US not have diplomatic relations with) and then a spatial one (which one is furthest north).

By the end of the shows, Watson had not only won, but with a higher score than the two Jeopardy “grandmasters” combined.

Even within the limited constraints of the TV quiz show format, it gave us an insight into what the future might be like.

It showed what interaction with computers in the future will be like. Watson got the question from the quiz show host, understood it, buzzed in and spoke its answer.

The quiz show is just one way to try and explain this.

A more recent example is Chef Watson: a system that is learning about food and cooking, and using that to design new dishes and new meals.

You can give it some constraints, like that you’d like a chicken dish, or that you want something like a stir-fry, or that you’d like something influenced by a cuisine like Chinese. And you can tell it how adventurous and surprising you’d like it to be.

And it will design a new dish for you.

This isn’t about building a search engine for existing recipes.

Chefs from the ICE Culinary Institute are working with Watson to create new recipes.

They’ve trained it by giving it a massive number of recipes to read. We didn’t manually prescribe types of dishes, types of cuisines and so on – we didn’t prescribe what characteristics we think an Indian dish, for example, would have. Instead, Watson had to learn that there is a type of cuisine like Caribbean, and what that means, as it came across a range of Caribbean recipes in what it read.

It was also given the chemical descriptions of a wide range of ingredients, and trained with a massive range of experiences of people’s reactions to specific flavour combinations.

Watson learned which combinations people liked, and which they didn’t – and identified connections and patterns in the molecular combinations behind that.

It’s about Watson as an assistant in the kitchen – making suggestions and offering ideas.

It’s helping us improve our understanding of creativity – of what is involved in being creative, something that we traditionally associate with being a very human thing – and explore opportunities for cognitive computing to help with this.

We’ve used Chef Watson in the same way that we used the quiz show: to help explain the potential behind cognitive computing.

Putting Watson and the Chefs onto a food truck and taking them to conferences and events is a way of making it real. Letting people try the food that they come up with is one way to try and start a conversation about how cognitive computing is going to be different.

Similarly, Watson has published a cook book of some of its recipes.

It’s kind of fun, and not something that I would’ve expected us to be doing a few years ago.

But we’re starting to find ways to explain cognitive computing through giving people tangible experiences.

Cognitive computing is still a very new and emerging concept, so it’s hard to find a definitive definition of what it means.

But I’ve collected a few examples of how it’s being described in technology, industry and academia.

Forrester have described it as computers that learn, computers that can interact with us, and computers that make evidence-based recommendations.

IBM has talked about the way cognitive systems will transform the way people will interact with computers, and highlighted that these systems will draw on knowledge from massive amounts of data.

Gartner, who describe this as the smart machines era, say that this is going to be the most disruptive change in the history of IT, and talk about it enabling things that we didn’t think computers would be able to do.

MIT have talked about the collaborative nature of working with systems like these – like I tried to describe about Chef Watson, this is about systems working with people to create things that neither might have done separately.

The British Computer Society have talked about cognitive computing as systems that learn through experiences instead of following a prescribed sequence of tasks, and highlighted that these will handle massive amounts of information.

Sticking with this British Computer Society paper for a moment, they also highlight that there is a skills gap here.

For programmable systems, we need people who can understand the overall task a machine needs to achieve, and can identify and describe the specific steps needed to do that.

Working with cognitive systems will be different. We need people who can identify the learning and experiential opportunities that a system will need in order to learn how to achieve the task.

There will be a need for a generation of technologists who can work with systems like this.

Preparing Watson for Jeopardy is a good example of this. We didn’t try to pre-empt what questions might come up on the show and pull together a set of answers to look them up in (not that I think such an approach would be feasible or scalable).

Instead, Watson prepared for Jeopardy by reading and extracting an understanding from a wide range of sources. I don’t mean game-show-specific sources. I don’t mean tabular data, or structured data, or data that has been manually prepared to be machine readable. I’m talking about encyclopaedias and dictionaries, newspapers and magazines, books and much more. Hundreds of sources of text – stuff that has been written for use by people.

And it learned how to use this knowledge by playing Jeopardy matches. Lots and lots of Jeopardy matches, to give it the experiences it needed to learn how to use its knowledge, and when to use each of the many hundreds of strategies it has under the covers. Some questions are best handled in some ways, while other questions are better handled in others.

Watson learned how by playing the game, and got better through sparring matches with other previous champions from the Jeopardy TV show.

Since the TV show in 2011, a big focus for our work with Watson has been healthcare.

Jeopardy was about taking a wide range of general knowledge sources, letting Watson extract a knowledge from that, and then giving it the experiences necessary to learn how to use that knowledge to do the task of playing a gameshow.

After Jeopardy, we started giving Watson medical sources – text books, journals, research papers, treatment guidelines, medical records, and then working with doctors and clinicians to give Watson the experiences necessary to use that knowledge to support doctors and nurses.

We’ve partnered with cancer centres like Memorial Sloan Kettering and MD Anderson to do this. What we need are partners who understand what the system will need to be able to do, and can work backwards from there to identify what knowledge it will need and what experiences the system will need to have in order to use that knowledge to do it.

In many ways, teaching hospitals are ideal partners for us in this because they do this for their medical students. And the metaphor of Watson going to medical school is one that seems to have stuck. It’s not quick – it’s taking a roughly comparable amount of time to what it would take a person to go through medical school.

But it’s working. Watson is being used today, albeit at relatively small scales, by doctors and nurses in the treatment and diagnosis of some of the world’s toughest diseases.

It doesn’t have to be so dramatic though. I think cognitive computing will become a part of all of our lives, not just something exclusive to specialists like doctors.

I went to a conference in Twickenham last year, and one of the talks was about a project for a mobile phone retailer, trialling cognitive computing as a way to answer the questions they get from their customers.

I loved the way that he described what they’re doing – working with Watson, rather than using it. And he described it as being like having a new member of staff to train, and needing to identify what that new member of staff would need to read, and what experiences of customer interactions they could give it to teach it how to support them.

Examples of this are all around us. In the same way that the early programmable systems built on the achievements and techniques that had come before them, cognitive computers are building on years of progress in fields like machine learning and natural language processing.

Google Translate is a great example of this.

You put in some text in one language, and it can translate that into another.

Unlike many of the translation systems that came before it, they didn’t build this just by collecting together linguists and getting them to prescribe the instructions for translating every word.

Instead, they trained a machine learning system to be able to do this, using sources like documents from the United Nations. The UN is a great source for this as they produce a lot of documents, and have to translate them into a wide range of languages for all their member nations.

What you’ve got is a large number of examples showing that this in one language means that in another.

Cognitive computing will need us to approach problems in this way – not trying to come up with all the answers ourselves, but being able to identify how to give a computer the experiences it will need in order to help.

I said that there is going to be a skills gap here, and we’re already starting to see it in the graduates that join us.

We’re starting to tackle that by working with Universities to introduce modules on cognitive computing into their courses, and giving them access to instances of Watson for use in student projects.

But there will come a point where we need to start introducing it earlier, into colleges and schools.

We need to start thinking about how we explain the computers of the future to children.

We need to think about what is the cognitive computing equivalent of Big Trak.

That experience made computers real to me as a kid. It inspired me. It made the concept and the potential come alive. What is going to do that for cognitive computers?

In the same way that systems like Logo and Scratch have given us the way to let children try out and play with the concepts behind programmable computers, we need a way to do this for cognitive systems.

Scratch has its palette of blocks to snap together. What is going to be the metaphor to explain systems that need to be trained?

We talk about computers that can think. It’s obviously a metaphor, and is true in many ways but doesn’t hold in others. I’m not trying to imply that I think these are going to be systems with a consciousness any time soon.

But how far can we take the metaphor? As we need people who can work with systems that think and learn, this needs to take into account the way that computers learn.

At the risk of pushing the metaphor too far, we need an approach built around the psychology of these emerging systems.

I started by saying that big changes are coming in computing. Tomorrow’s children are going to have amazing, exciting, powerful systems to play with, and they’re going to grow up and use them to achieve fantastic things.

But first we’re going to need to figure out how to get them started.


How to use the IBM Watson Relationship Extraction service on Bluemix

Before Christmas, I wrote about how I used the Watson Relationship Extraction service on Bluemix to pick out the things mentioned in news stories, as part of a mobile app we built on a hackday. I’d still like to do something more with that app, but in the meantime I should at least share how I did the Relationship Extraction bit.

From the official doc for the service:

From unstructured text, Relationship Extraction can extract entities (such as people, locations, organizations, events), and the relationships between these entities (such as person employed-by organization, person resides-in location).

This is provided as a hosted service on IBM Bluemix where any developer can sign up and give it a try.

It’s available as a documented REST API, but as part of using it in the hackday, I needed to write a bit of code around that, just to prepare the request and parse the response. I think it’ll save me time to reuse this the next time I want to build something with the API, so I’m sharing it as a standalone package.

In this post, I’ll walk through how you can use it, with a small app that grabs the contents of a BBC News story and picks out the names of people mentioned in the story.

First, a simpler example. Consider this exciting text:

Dale Lane works as a developer for IBM. He started in 2003. Dale lives in the UK, in a town called Eastleigh. Before that, he was a student at the University of Bath.

A few lines of Javascript are enough to run that through the service.

var watson = require('extract-relationships');

// the text we want to analyse
var text = 'Dale Lane works as a developer for IBM. He started in 2003. ' +
           'Dale lives in the UK, in a town called Eastleigh. ' +
           'Before that, he was a student at the University of Bath.';

watson.extract(text, function(err, response) {
    // response has got all the info
});

The full contents of the response are in a gist if you want to see them, but I’ll show just a few examples here to give you the idea.


It has picked out all of the references to me, recognising that they are all describing a person, and that ‘developer’ is my occupation.

The ‘begin’ and ‘end’ numbers tell you where in the text each bit was found.


It’s picked out the reference to IBM, and recognised that this is a name of a commercial organisation.


It’s recognised that ‘2003’ is a reference to a date.

As well as identifying those entities and many others, it’s also picked out the relationships between them.


For example, it’s identified the relationship between me and IBM.


And the relationship between me and my old University.

I’ve written a more detailed breakdown of what is contained in the response, including how to find out what each of the fields means, and what the different possible values for each one are.

That’s the basics with a few input sentences. Next, we start throwing a lot of text at it.

In about thirty lines of Javascript, you can download the text from a news story on the BBC News website, and pick out the names of all of the people mentioned in the story.

If you run that simple example, you get the list of people that are included in the story text.
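If it helps to see the shape of it, a stripped-down version looks something like this. The page-scraping is deliberately crude, the story URL is a placeholder, and the entity field names are my assumptions about the response structure, so check them against the gist and the package README:

var request = require('request');
var watson = require('extract-relationships');

request.get('http://www.bbc.co.uk/news/some-story-id', function (err, resp, html) {
    // crudely strip the HTML tags to leave something like the story text
    var storytext = html.replace(/<[^>]+>/g, ' ');

    watson.extract(storytext, function (err, response) {
        // keep the entities identified as people, and print the first mention of each
        response.entities.filter(function (entity) {
            return entity.type === 'PERSON';
        }).forEach(function (person) {
            console.log(person.mentions[0].text);
        });
    });
});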

Where it starts to get interesting is when you combine this with other sources and APIs.

For example, once you’ve picked out the names of people from the story, try looking up their profiles on Wikipedia, and finding out who they are.

Or, instead of people, pick out the names of places from news stories, and use a geocoding API to plot them on a map. (There are geocoding services available on Bluemix, too, if that helps.)

Hopefully you can see how you could start to use this in your own apps.

Finally, some practical points.

How do you install the package I’ve shared so you can use it?

npm install --save extract-relationships

How do you configure it for the Watson Relationship Extraction Service?

The API we’re using is an authenticated service hosted in IBM Bluemix, so there is a tiny bit of config you need to do first before you can use it.

If you’re running your app in Bluemix, then there isn’t much to do. Add the Relationship Extraction service to your app from the Bluemix dashboard, and the endpoint and credentials will automatically be provided and should just work.

If you’re running your app on your own machine, there are a couple of extra steps instead.

Go to Bluemix. Sign up for an account if you haven’t already got one.


From the dashboard, create an app. You need something as a placeholder to bind the Relationship Extraction service to, even if you don’t use it.


Create a web app and give it a name.


Add a service.


Choose the Relationship Extraction service from the group of Watson services.


Click on ‘Show Credentials’. Everything you need to configure your app is in here. You need the url, username and password.


Copy this into an options object like this:

var options = {
    api : {
        url : 'https://url.of.your.watson.service...',
        user : 'your-watson-service-username',
        pass : 'your-watson-service-password'
    }
};

To reiterate, this isn’t the username and password that you use to sign in to Bluemix. It’s the username and password specifically for this service that Bluemix has generated for your app.

It’s not a good idea to hard-code passwords in your code, so I’d suggest putting them outside of your app and reading them in when needed. Environment variables are an easy way to do this, and are what I’ve done in the few samples I’ve written.
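For example, something like this – the variable names are just whatever you choose to set in your environment:

var options = {
    api : {
        url  : process.env.WATSON_RE_URL,
        user : process.env.WATSON_RE_USERNAME,
        pass : process.env.WATSON_RE_PASSWORD
    }
};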

That’s all there is to it.

I’ve put more info on github with the source for how I’m using the API.


Watson News Companion

We recently ran a hackathon at work: people within IBM were invited to try building a mobile app aimed at consumers using Watson services. It was a fun chance to try out some new ideas, as well as to build something using our APIs – dogfooding is always a good thing.

I worked on a hack with David which we submitted on Wednesday. This is what we came up with, and how we built it.

The idea

A mobile app that will help users to digest the news by explaining references in stories and providing greater context.

Background

It’s difficult to find the time nowadays to properly read and understand what’s going on in the world. We rarely have the time to sit and read through a newspaper. Instead, we might quickly read news stories online from our smartphones and tablets. But that often makes it difficult to understand the broader context that a story is in. There might be references in the story to people, places, organisations or events that are unfamiliar.

Watson could help. It could be an assistant as you read the news, explaining unfamiliar references and the broader context.

Features

Our Watson News Companion demo is a mobile news reader app that:

  • anticipates questions and suggests areas where it can help improve understanding
  • provides answers to questions without the user needing to lose their place in the story
  • allows the user to dig deeper with their own follow-up questions


A video walkthrough of the hack

Implementation

The hack was built as a mobile web app using the MEAN stack: using express as the framework on a Node.js platform, storing some information in a MongoDB and building the UI with AngularJS.

It uses RSS feeds from news websites to fetch content, which are shown in a simple newsreader app built using Ratchet.

The contents of the story are run through the Watson Relationship Extraction API to pick out the people, places, organisations, and other entities mentioned in the story.

The API output includes co-references, to identify the multiple mentions of the same entity. These are combined and reviewed, and together with the type of the entity are used to generate likely questions about the entity.
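The question generation is deliberately simple – essentially a template for each type of entity. This is an illustrative sketch rather than the exact templates or field names we used:

// hypothetical mapping from entity type to a question template
var templates = {
    PERSON       : function (name)  { return 'Who is ' + name + '?'; },
    ORGANIZATION : function (name)  { return 'What is ' + name + '?'; },
    GPE          : function (place) { return 'Where is ' + place + '?'; }
};

// build a list of candidate questions from the extracted entities
function questionsFor(entities) {
    return entities
        .filter(function (entity) { return templates[entity.type]; })
        .map(function (entity) {
            return templates[entity.type](entity.mentions[0].text);
        });
}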

These questions are sent to the Watson Question and Answer API. For questions which are returned with answers with a high confidence, annotations are added inline to the news story. Pressing the annotation brings up a sliding panel at the bottom of the screen with the answer to the question. The links and footer annotations are built using bigfoot.

Every screen in the app also includes an “Ask Watson” button which lets the user enter any free-text question, to let them dig deeper into what they’ve read.

Could you build this?

This was a proof-of-concept built in a hurry, so we’re not calling this a finished app. But everything we used is freely available – both to people inside IBM and the public.

We developed on an instance of Bluemix (our Cloud Foundry-based development platform) available internally within IBM. You can sign up for free to a public instance of Bluemix at bluemix.net.

The technologies used to build the hack are all freely available : Node.js, MongoDB, AngularJS, Ratchet, bigfoot, jQuery.

A beta version of the Watson Relationship Extraction API is freely available for apps hosted on Bluemix.

A beta version of the Watson Question and Answer API is freely available for apps hosted on Bluemix. But this is a demo instance of Watson that has only read a small number of general healthcare documents. That’s not a useful corpus for our hack, so to record our demo we stood up our own instance – using an untrained instance of Watson which we gave a small subset of Wikipedia to read. We used the Question and Answer API on this instance instead of the Bluemix one. For people outside IBM to do this, they need to sign up to join the Watson Ecosystem. This is also free, but there are criteria for who is eligible at this point, and an application process to go through.


Kids should learn to code

Does a five-year-old need to learn how to code?

A couple of weeks ago I was interviewed by the BBC. In a fairly long phone call, I either rambled inanely or provided detailed and nuanced answers in context. That depends on your point of view.

Either way, obviously not a lot of it could make it into their story, as they really only needed a few quotes. So I thought I’d put more of what I said here.

The background for the story was the changes to the UK school curriculum which mean that all kids are being taught to code. And the basic premise for the piece was that, as we’re “entering an era when computers are actually beginning to teach themselves”, this is unnecessary and coding itself is becoming an outdated skill.

This is a summary of what I tried to say…

Learning to “code”

It’s useful to start with some context. When we talk about teaching kids to “code” we don’t just mean teaching them how to write lines of code – it’s broader than that. Some criticisms of this initiative seem to be arguing against five-year-olds needing to learn where to put semi-colons, which is missing the point.

From what I’ve seen, it’s an umbrella term that covers a range of activities such as:

Logical thinking and problem solving

Teaching kids how to understand a description of a problem, identify a solution, and describe that solution by breaking it down into a series of steps.

As kids get older this can be framed as how to write an algorithm. But it’s something that can be started even at Faith’s age (6) and without needing to touch a keyboard. That’s not new – how many developers have had to answer the interview question “describe how to make a cup of tea”?

You don’t need to learn programming language syntax to start getting your head around this, and I would argue it’s a vital skill to develop in life, even if you don’t become a coder.

Technological creativity

We need to do more than teach children how to use the tools that they have today. We need to encourage an ethos from an early age that we don’t have to be passive users of technology.

It’s about teaching kids how to think of and how to approach technology. They don’t have to think of it as a black box that must be used as-is, but as something that they can remix and tweak and modify and change and create. It’s about an attitude of looking at technology as something that they can make do what they want to do, as opposed to use the way someone tells them they should.

This is what I love about running my Code Club. Instead of kids playing a random Flash game they find online, they can make a game themselves, the way they want it to be. If they want it to be faster, slower, bigger, smaller, a different colour, move differently: they are in control. It’s not fixed, they can make it do and behave the way they want it to. And if they realise that they can do that with technology, it’s a real light-bulb moment.

We need kids to have this mindset so they will grow up able to imagine the next wave of innovations. Saying that we don’t need this because we can delegate it to the computers we have today really feels to me to be missing the point. Cognitive computing holds exciting promise and potential but it does not mean “we won’t need to be creative any more, the computers will do that for us, too”.

Coding becoming “outdated”

Leaving aside this bigger picture, is coding itself a useful skill to learn? Is coding going to become outdated?

I don’t think so.

Part of this argument seemed to be “what is the point of teaching kids <insert-name-of-programming-language-here> because by the time they grow up it will be obsolete?”

Programming languages stick around longer than people think – there are people still making a living writing C and maintaining COBOL. (We’re normally after good Prolog people, too!)

But more importantly, a lot of what you learn in one language is transferable. Every time I’ve started working in a new programming language, I’ve built on the basic concepts I already know from others. Maybe we’ll teach children a programming language that isn’t the most widely used language when they’re older. But that doesn’t mean learning the underlying ideas will have been a waste of time.

The argument also seemed to be that not just any particular language, but coding in general will become obsolete. I’m not convinced by this.

What we mean by coding may be different in twenty years to what we mean today. In fact it probably will be. Coding will evolve. It always has, and I’m sure it will continue to.

Even just looking at my personal coding history, you can see that evolution. Writing in assembler (where I was moving data in and out of registers) was different to writing in C. And writing in C (where it wasn’t just about what I wanted it to do functionally, but also doing my own memory management) was different to my coding today in Java.

A big difference is in the level of abstraction. They all involved describing to the computer something that I wanted it to do. But the level of abstraction I’m able to use to describe it has changed.

I’m sure this is a trend that will continue. New programming languages will get higher and higher level. Future programming languages will give us ways to describe what we want with higher levels of abstraction. And maybe that will look closer to natural language than what we have today (well-written Java is already closer to being readable by a lay-person than assembler). Maybe it will be something like a Controlled English language that feels more like describing what you want to another person.

But that won’t mean that coding has become obsolete, just that it will have evolved as it always has.

The need for people who can understand a problem, and describe to a computer how to solve it, will remain – whatever language they use and whether that language looks like “code” as we understand it today.


Talking about IBM Watson (again)

As I mentioned in May, I was lucky to be able to go to Thinking Digital this year and talk about what we’re doing with Watson.

I’ve just noticed that they’ve made a video of my talk available. I haven’t dared watch it (does anyone like watching videos of themselves?), but I figured I should share it anyway!


Thinking Digital 2014

This week I went up to Newcastle for Thinking Digital.

It was the seventh Thinking Digital, but my first.

I’d seen a bunch of references to it being the UK’s answer to TED, the tickets aren’t cheap, videos from previous years look slick and professional, it’s held in The Sage which is a hugely impressive venue, they manage to get a great line-up of speakers, and the logistics in the run-up to the event were more organised than any event I’ve been to before.

So… I was expecting a cool and geeky, if faceless, serious, formal, and intimidating event.

I’d read it completely wrong. It’s absolutely a professionally run event. And there was no shortage of cool geekiness. But, more than that, the organizer, Herb Kim, has created a real sense of community in it. There’s a feeling of almost familial warmth amongst attendees who come year after year after year.

And they do it without being too cliquey. Everyone I spoke to was very friendly and welcoming, which made the few days a lot easier for an introvert like me. A few days being surrounded by and trying to talk to and socialise with several hundred smart brilliant people is the kind of thing I normally find hugely draining and more than a little daunting. But the crowd at TDC make it easier than most.

They value their time there, too. More than one person told me they’d paid for their own ticket and expenses to attend. I’m used to corporate-run conferences where everyone is paid for by their employer, or barcamps where people moan about being asked for a five pound deposit, so this surprised me.

The talks made for a fascinating and thought-provoking couple of days. I can’t do them justice here (when videos of the talks are available I’ll embed/link them here instead) but I want to give an idea of what the programme was like.

Jeni Tennison – Open Data Institute
Talked about the potential impact of open data on society, giving examples of how open data could be used to inform and widen access to debate.

Maik Maurer – Spritz
Demonstrated their speed-reading technology – streaming one word at a time in a fixed place, for fast reading on mobile and wearable devices.

Gerard Grech – Tech City
Talked about the role of Tech City as a feedback loop between Government and the tech community.

Meri Williams – ChromeRose
Talked about the lessons that people managers could learn from artificial intelligence in how to inspire, motivate, and enable geeks to achieve great things.

Aral Balkan – indie Phone
Gave an impassioned and stirring talk entitled “Free is a Lie” about the conflict between advertising-led business models and users’ privacy and other interests.

David Griffiths – foam
Talked about using his background in the video game industry to combine crowd-sourcing and gaming to perform impressive citizen science projects.

Chi Onwurah – MP for Newcastle Central
Talked about the parallels between technology and politics as driving forces for change, and the aims of the current Digital Government Review.

Mariana Mazzucato – University of Sussex
Argued that the image of the private sector as entrepreneurial and public sector as meddling and restrictive is an unhelpful myth and made the case for a bolder, entrepreneurial state.

Erin McKean – Wordnik
Talked about the limitations of search as a model for accessing data and the need for discovery engines to find what you don’t know you want.

Blaise Aguera y Arcas – Google
Described the history of machine intelligence and his predictions about what the future of machine intelligence might look like.

Carl Ledbetter – Microsoft
Outlined the history and evolution of digital entertainment, and described the process that went into the design of the XBox One.

Jennifer Gardy – BC Centre for Disease Control
Described our progress in increasing our understanding of the human genome, and where its complexity lies.

Peter Gregson – Cellist
Gave a representation of the genome work that Jennifer had described. Instead of a data visualisation, it was a sonification. Using a cello.

Sean Carasso – Falling Whistles
Told an inspiring story of how he came to learn about the terrible things happening in Congo, and how he went about trying to bring peace.

Conrad Bodman – The Barbican
Argued for recognition of the impact of digital tech on the arts, and described his projects to exhibit and showcase video games, animation, and digital effects.

Mark Dearnley – HMRC
Described the challenges and need for technology in what HMRC do, and their digital ambition for the future.

Xavier De Kesteller – Foster + Partners
Talked about an amazing project to build a base on the moon, using autonomous robots with 3D printing heads to print a building out of moon dust.

Susan Mulcahy – Imperial College London
Gave an energetic performance to describe the role of the red blood cell, and the science behind understanding brain injury.

Carlos Ulloa – HelloEnjoy
Showed what was possible using WebGL, bringing native 3D gaming to the browser without the need for plugins.

Jonathan O’Halloran – QuantuMD
Described his work to create a mobile genetic-testing device, and the potential that real-time epidemiology from a mobile device could bring.

Blaise Aguera y Arcas – Google
Talked about changes needed in society when more jobs are replaced by technology, and his observations about changes in gender dynamics.

Steve Mould – BBC
Gave an entertaining talk about how he discovered, and tried to understand the science behind, the bead chain fountain.

Tom Scott – Us Vs Th3m
Ended the conference with a fantastic performance showing what the impact of technology might be like in 2030.

Dale Lane – IBM
And I did a Watson talk. I really didn’t want it to seem like a sales pitch, so I tried to put it in a bigger context of being a step forwards in changing how we use computers. I talked about why I work on Watson, what motivates and inspires me about it, and why I think what we’re doing is difficult but hopefully valuable. And I walked through a short demo to explain the value I see in where we are even now. Annoying technical issues (Keynote + clicker + multiple screens = fail) aside, it went okay. It was a lot to try and fit into 20 minutes, so I talked fast. :-)

Overall…

It was a fantastic event, and one I’d wholeheartedly recommend.

If you can get to a future Thinking Digital, you absolutely should.

They were a couple of the most thought-provoking and interesting days I’ve had in a long time.


Full disclosure: As a speaker, I didn’t have to pay for a ticket to attend this event. My travel and accommodation costs were paid for by IBM.


Text analytics in BlueMix using UIMA

In this post, I want to explain how to create a text analytics application in BlueMix using UIMA, and share sample code to show how to get started.

First, some background if you’re unfamiliar with the jargon.

What is UIMA?

UIMA (Unstructured Information Management Architecture) is an Apache framework for building analytics applications for unstructured information and the OASIS standard for content analytics.

I’ve written about it before, having used it on a few projects when I was in ETS, and on other side projects since, such as building a conversational interface to web pages.

It’s perhaps better known for providing the architecture for the question answering system IBM Watson.

What is BlueMix?

BlueMix is IBM’s new Platform-as-a-Service (PaaS) offering, built on top of Cloud Foundry to provide a cloud development platform.

It’s in open beta at the moment, so you can sign up and have a play.

I’ve never used BlueMix before, or Cloud Foundry for that matter, so this was a chance for me to write my first app for it.

A UIMA “Hello World” for BlueMix

I’ve written a small sample to show how UIMA and BlueMix can work together. It provides a REST API that you can submit text to, and get back a JSON response with some attributes found in the text (long words, capitalised words, and strings that look like email addresses).

The “analytics” that the app is doing is trivial at best, but this is just a Hello World. For now my aim isn’t to produce a useful analytics solution, but to walk through the configuration needed to define a UIMA analytics pipeline, wrap it in a REST API using Wink, and deploy it as a BlueMix application.

When I get a chance, I’ll write a follow-up post on making something more useful.

You can try out the sample yourself, as it’s deployed to BlueMix at bluemix.net.

The source is on GitHub at github.com/dalelane/bluemixuima.

In the rest of this post, I’ll walk through some of the implementation details.

Runtimes and services

Creating an application in BlueMix is already well documented so I won’t reiterate those steps, other than to say that as Apache UIMA is a Java SDK and framework, I use the Liberty for Java runtime.

I’m not using any of the services in this simple sample.

Manifest

The app is bundled up in a war file, which is what we deploy. This is specified in manifest.yml.
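
For anyone who hasn’t seen a Cloud Foundry manifest before, it only needs a few lines for an app like this. This is just a sketch – the app name, memory size and war file name are placeholders rather than the values from my actual manifest:

```yaml
# minimal Cloud Foundry manifest.yml for deploying a war file
applications:
- name: uimahelloworld      # placeholder app name
  memory: 512M              # placeholder memory allocation
  path: uimahelloworld.war  # the war file produced by the ant build
```

The Cloud Foundry tooling (the cf command line, or the Eclipse plugin mentioned below) picks this up when you deploy.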

Building

The war file is built by an ant task which has to include the UIMA jar in the classpath, and copy my UIMA descriptor XML files into the war.

I’m developing in Eclipse, so I set up an ant builder to run the build, and configured the project to do it automatically.

I’m deploying from Eclipse, too, using the Cloud Foundry plugins for Eclipse.

XML descriptors

The type system is defined in an XML descriptor file, which specifies the different annotations that the pipeline can create and the attributes that they have.

Running JCasGen in Eclipse on that descriptor generates Java classes representing those types.

The pipeline is also defined in XML descriptors: one overall aggregate descriptor, which imports a primitive descriptor for each of the three annotators in my sample pipeline – one to find email addresses, one to find capitalised words, and one to find long words.

Note that the imports in the aggregate descriptor need to be relative so that they keep working once you deploy to BlueMix.

These XML descriptor files are all added to the war file by being included in the build.xml with a fileset include.

Annotators

Each of the primitive descriptor files specifies the fully qualified class name for the Java implementation of the annotator.

There are three annotators in this sample (the XML files with names starting “primitiveAeDescriptor”).

Each one is implemented by a Java class that extends JCasAnnotator_ImplBase.

Each uses a regular expression to find things to annotate in the text. That isn’t meant as an indication of how things should be done – it just makes for a simple, stateless demonstration without any additional dependencies.

The simplest is the regex used to find capitalised words in WordCaseAnnotator and the most complex is the ridiculously painful one used to find email addresses in EmailAnnotator.

Note that the regexes are compiled in each annotator’s initialize method and reused for every CAS that is processed, to improve performance.
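
To give a feel for the shape of these, here is a rough sketch of an annotator along the lines of my email one. The package name, the simplified regex, and the JCasGen-generated EmailAnnotation type are stand-ins for illustration rather than the exact code from my sample:

```java
package uimasample; // placeholder package name

import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.uima.UimaContext;
import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceInitializationException;

public class EmailAnnotator extends JCasAnnotator_ImplBase {

    // compiled once in initialize() and reused for every CAS
    private Pattern emailPattern;

    @Override
    public void initialize(UimaContext context) throws ResourceInitializationException {
        super.initialize(context);
        // deliberately simplified - the real email regex is far more painful
        emailPattern = Pattern.compile("[\\w.+-]+@[\\w-]+\\.[\\w.]+");
    }

    @Override
    public void process(JCas jcas) {
        Matcher matcher = emailPattern.matcher(jcas.getDocumentText());
        while (matcher.find()) {
            // EmailAnnotation is the class JCasGen generated from the type system descriptor
            EmailAnnotation annotation = new EmailAnnotation(jcas, matcher.start(), matcher.end());
            annotation.addToIndexes();
        }
    }
}
```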

UIMA pipeline

The UIMA pipeline is defined in a single Java class.

It finds the XML descriptor for the pipeline by looking in the location where BlueMix will unpack the war.

It creates a CAS pool to make it easier to handle multiple concurrent requests, and avoid the overhead of creating a CAS for every request.

Once the pipeline is initialised, it is ready to handle incoming analysis requests.

Once the CAS has passed through the pipeline, the annotations are immediately copied out of the CAS into a POJO, so that the CAS can be returned to the pool.
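
As a rough illustration of that flow, the pipeline wrapper looks something like the sketch below. The descriptor path handling, the pool size, and returning the annotations as a simple list of strings are all simplifications for illustration, not the actual class from my repo:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.jcas.tcas.Annotation;
import org.apache.uima.util.CasPool;
import org.apache.uima.util.XMLInputSource;

public class Pipeline {

    private AnalysisEngine engine;
    private CasPool casPool;

    public void init(String aggregateDescriptorPath) throws Exception {
        // parse the aggregate descriptor from wherever the war was unpacked
        XMLInputSource in = new XMLInputSource(aggregateDescriptorPath);
        AnalysisEngineDescription desc =
                UIMAFramework.getXMLParser().parseAnalysisEngineDescription(in);
        engine = UIMAFramework.produceAnalysisEngine(desc);

        // a small pool of CASes so concurrent requests don't each pay the
        // cost of creating a brand new CAS
        casPool = new CasPool(5, engine);
    }

    public List<String> analyse(String text) throws Exception {
        CAS cas = casPool.getCas(0); // wait for a free CAS from the pool
        try {
            cas.setDocumentText(text);
            engine.process(cas);

            // copy what we need out of the CAS before it goes back in the pool
            List<String> annotations = new ArrayList<String>();
            FSIterator<Annotation> it = cas.getJCas().getAnnotationIndex().iterator();
            while (it.hasNext()) {
                annotations.add(it.next().getCoveredText());
            }
            return annotations;
        } finally {
            casPool.releaseCas(cas);
        }
    }
}
```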

REST API

The war file deployed to BlueMix contains a web.xml which specifies the servlet that implements the REST API.

I’m using Wink to implement the API. The servlet definition in the web.xml specifies where to find the list of API endpoints and the URL where the API should be.

The list of API endpoints is a list of classes that Wink uses. There is only one API endpoint, so only one class is listed.

The API implementation is a very thin wrapper around the Pipeline class.

Everything is defined using annotations, and Wink handles turning the response into a JSON payload.
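
To make that concrete, a minimal JAX-RS resource class for Wink looks roughly like this. The path, the class names and the PipelineHolder singleton are placeholders I’ve made up for the sketch, not the exact code from the sample:

```java
import java.util.List;

import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/analyse") // placeholder path
public class AnalysisResource {

    // tiny response POJO - Wink's JSON provider turns this into the JSON payload
    public static class AnalysisResponse {
        public List<String> annotations;
    }

    @POST
    @Consumes(MediaType.TEXT_PLAIN)
    @Produces(MediaType.APPLICATION_JSON)
    public AnalysisResponse analyse(String text) throws Exception {
        // thin wrapper: hand the submitted text to the UIMA pipeline,
        // copy the results into the POJO, and let Wink do the rest
        // (PipelineHolder is assumed: a singleton holding the initialised Pipeline)
        AnalysisResponse response = new AnalysisResponse();
        response.annotations = PipelineHolder.getPipeline().analyse(text);
        return response;
    }
}
```

The resource class then just needs to be listed in the application file that the web.xml points Wink at.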

That’s it

I think that’s pretty much it.

I’ve added a simple front-end webpage, with a script to submit API requests for people who don’t want to do it with something like curl.

It’s live at uimahelloworld.mybluemix.net.

Like I said, it’s very simple. The Java itself isn’t particularly complex. My reason for sharing it was to provide a boilerplate config for defining a UIMA analytics pipeline, wrapping it in a REST API, and deploying it to BlueMix.

Once you’ve got that working, you can do text analytics in BlueMix that is as complex as whatever you can dream up for your annotators.

When I get time, I’ll write a follow-up post sharing what that could look like.


Why am I still at IBM?

Ten years ago.

6 August 2003.

I was a recent University graduate, arriving at IBM’s R&D site in Hursley for the first time. I remember arriving in Reception.


Reception – the view that greeted me when I arrived

Ten years.

It was a Wednesday.

I’m still at the same company. I’m still at the same site. I still do the same drive to work, more or less.

For a *decade*.

How did that happen?

It was never The Plan. The Plan (as cynical as it sounds in hindsight) was that I’d stay for two or three years. I figured that would be long enough to get experience, and then I’d leave to work at a small nimble start-up which was where all the “cool” work was.

The Plan never happened. A few years passed, and then another few… I kept saying that I’d leave “later”, and before I knew it a ten-year milestone had kind of snuck up on me.

I think I’m more surprised than anyone. I’ve never been at any place this long. I was at Uni for five years. The longest I was at any school was four years.

It’s a serious commitment, and one I never realised that I had made. I’ve not even been married for as long as I’ve been with IBM.

So why? Why am I still here?

I live here.

It’s been varied

I’ve spent ten years working for the same company, but I’ve had several jobs in this time.

I’ve been a software developer. I’ve been a test engineer. I’ve been a service engineer, fixing problems with customer systems. I’ve worked as a consultant, advising clients about technology through presentations and running workshops. I’ve done services work building prototypes and first-of-a-kind pilot systems for clients.

I’ve written code to run on tiny in-car embedded systems and apps that ran on mobile phones. I’ve worked as a System z Mainframe developer. I’ve written front-end UI code, and I’ve written heavy-duty server jobs that took hours to run (even when they weren’t supposed to).


IBM Hursley

It’s still challenging

I’ve worked on middleware technology, getting some of the biggest computer systems in the world to communicate with each other, reliably, securely and at scale. I’ve used analytics to get insight from massive amounts of data. I’ve worked on large-scale fingerprint and voiceprint systems. I’ve used natural language processing to build systems that attempt to interpret unstructured text. I’ve used machine learning to create systems that can be trained to perform work.

I’m still learning new stuff, and I still regularly have to figure out how to do things that I had no idea how to do when I started.


Some views of the grounds around the office

I get to do more than just a “day job”

I do random stuff outside the day job. I’ve helped organise week long schools events to teach kids about science and technology. I’ve mentored teams of University students on summer-long residential innovation projects. I’ve prepared and delivered training courses to school kids, school teachers and charity leaders. I’ve written an academic paper and presented it at a peer-reviewed research conference. And lots more.

I’m a developer, but that doesn’t mean I’ve spent ten years churning out code 40 hours a week. There’s always something new and different.


Hursley House – where I normally work when I have customers visiting

I work on stuff that matters

Tim O’Reilly has been talking for years about the importance of working on stuff that matters.

“Work on something that matters to you more than money”

If you’ve not heard any of his talks around this, I’d recommend having a look. There are lots of examples of his slides, talks, blog posts and interviews around.

I can’t do his message justice here, but I just want to say that he describes a big part of how I feel very well. I want to work on stuff that I can be proud of. Not just technically proud of, although that’s important too. But the pride of doing something that will make a difference.

Working for a massive company gives me chances to do that. I’ve worked on projects for governments, and police forces, and Universities. I’ve done work that I can be proud of.

For the last couple of years, I’ve been working on Watson. It’s a very cool collection of technologies, and watching the demo of it competing on a US game show has a geeky thrill that doesn’t get old. But that’s not the most exciting bit. Watson could be a turning point. This could change how we do computing. If you look at what we’re trying to do with Watson in medicine, we’re trying to transform how we deliver healthcare. This stuff matters. It’s exciting to be a part of.


These are the views that surround the site

I like the lifestyle

Hursley is a campus-style site. It’s miles from the nearest town, and surrounded by fields and farms. It’s quiet and has loads of green open space.

My commute is a ten minute drive through a village and fields.

I don’t have to wear a suit, and I don’t stand out coming to work in a hoodie and combat trousers. Flexitime has been the norm for most of my ten years, and I am free to plan a work day that suits me. When I need to be out of the office by 3pm to get the kids from school, I can.

My kids are at a school half-way between home and the office, so I can do the school run on the way to work. As the school is only five minutes from work, I often nip out to see them do something in an assembly, or have lunch with them.

This is a nice aspect of the school – that parents are welcome to join their kids for lunch, and have a school dinner with them and their friends in the school canteen. But still… it’s pretty cool, and if I didn’t work just up the road from them, I wouldn’t be able to do it.

Once a month, I bring them to work in the morning before school starts for a cooked breakfast in the Clubhouse with the rest of my team.

All of this and a lot more tiny aspects like it add up to a lifestyle that I like.

More train tickets
Some of my train tickets from the last few years

I get to see the world

I enjoy travelling. I love seeing new places.

But I’d hate a job where I lived out of a suitcase and never saw the kids.

I’ve managed to find a nice balance. I travel, but usually on short trips and not too often.

In 2006, I worked at IBM’s La Gaude site near Nice. In 2007, Singapore, Malaysia, Philippines and Paris. In 2008, I worked in Copenhagen, Paris and Hamburg. In 2009, I worked in Munich many times, and Rotterdam. In 2010, Stockholm. In 2011, Tel Aviv and Haifa in Israel, Austin in Texas, Paris and Berlin. Last year, I worked in Zurich and Littleton, Massachusetts.

This year, I’ve been in Rio de Janeiro and Littleton again, and it looks like I’ll be in Lisbon in December.

Plus working around the UK. It’s less glamorous, but it’s still interesting to go to new places. I’ve worked in loads of places, like Edinburgh, York, Swansea, Malvern, Warwick, Portsmouth, Cheshire, Northampton, Guildford… I occasionally have to work in London, although I tend to moan about it. And I spent a few months working in Farnborough. I think I moaned about that, too. :-)

Travelling is a great opportunity. I couldn’t afford to have been to all the places that IBM has sent me if I had to pay for it myself.

My office today (left), compared with some of the other desks I’ve had around Hursley

The pay is amazing!

Hahahahaha… no.

See above.


Grace at my desk at a family fun day at work in 2008

Will I be here for another ten years?

I’m trying to explain why I’m happy and enjoying what I do. I’m not saying I couldn’t get exactly the same or better somewhere else. Because I don’t know. Other than a year I spent as an intern at Motorola, I’ve never worked anywhere else. For all I know, the grass might be greener somewhere.

Will I still be here in another ten years? I dunno… I do worry if that’s unambitious. I wonder if I should try somewhere else. I wonder if only ever working for one company is giving me an institutionalised and insular view of the world.

I keep getting emails from LinkedIn about all the people I know who have new jobs. There are a bunch of people I used to work with at IBM who have not only left to work at other companies, but have since left those companies and gone on to something even newer. While I’m still here.

Am I destined to be one of those IBMers who works at Hursley forever? That’s a scary thought.

For now, I’m enjoying what I do, so that’s good enough for me.

Happy 10th anniversary to me.