Watson Rock, Paper, Scissors

A simple hands-on activity to let kids train a machine learning classifier to be able to play Rock, Paper, Scissors.


I’ve written and spoken before about how I think we should do more to introduce children to the idea of machine learning. And I’ve tried introducing my two kids to it, such as by making a Code Club-style game with them: we built a system to play Guess Who, which they trained both to understand what you say and to recognise the characteristics of faces from photos.

This weekend, we tried out another idea – Rock, Paper, Scissors from a web app, using the web cam to see your moves, and training a system to recognise your hand signs.


In many ways, this was simpler than the Guess Who project, and one I would’ve tried before if we’d thought of it! I’ve made getting the system to choose its next move very simple, as it just chooses one of the three options at random.
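To show how little there is to the game side, here is a minimal Python sketch of choosing a move at random and judging a round. It is an illustration rather than the code from the app, and the names `choose_move`, `judge` and `BEATS` are made up for this example.

```python
import random

MOVES = ("rock", "paper", "scissors")

# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def choose_move(rng=random):
    """Pick the computer's move uniformly at random, with no strategy."""
    return rng.choice(MOVES)


def judge(player, computer):
    """Return 'win', 'lose' or 'draw' from the player's point of view."""
    if player == computer:
        return "draw"
    return "win" if BEATS[player] == computer else "lose"


if __name__ == "__main__":
    computer = choose_move()
    print("Computer plays", computer, "->", judge("rock", computer))
```

Passing a seeded `random.Random` as `rng` makes the computer's moves repeatable, which is handy for testing.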

The machine learning element comes from the fact that I got them to train a custom image classifier to recognise what a ‘rock’, ‘paper’ and ‘scissors’ hand sign looks like.


Last night I hacked together a quick single-page training web app for them to use. Unlike with the Guess Who game, where we worked on it together to come up with the project, this time I made it myself to see how they’d get on with using it. (I was thinking that if it went well, I’d try using it with one of my school groups).

They got off to a good start… although I hadn’t counted on how much time Grace would spend checking out her hair once she saw the web cam video. 🙂


Video of the kids getting started

I’ve got it so that you can take photos from the web cam by clicking on a button, and put the photo into one of three training groups – one each for rock, paper and scissors.


My hope was that the three hand signs – a fist, a flat palm, and two fingers – are distinct enough that an image classifier could quickly start to distinguish between them.

I should probably add an overlay to the live video to suggest where to put your hand, how close to the camera, and so on, as their initial attempts were fairly inconsistent.


Video of their first attempt at training

As before, I’m using one of the Watson developer APIs available in Bluemix, called Visual Recognition.

To train it, you just need to upload zip files, where each zip file contains examples of photos of something you want to recognise.

In this case, we want to upload three zip files – one of photos of “rock”, one zip file of “paper” photos, one zip file of photos of “scissors”. And you can do all of that in a single HTTP POST, so the code behind the training app is very simple.
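As a rough idea of what that looks like, here is a Python sketch that packs each group of photos into an in-memory zip file and assembles the parts of a single multipart upload. It is not the code behind the training app. The `<label>_positive_examples` field names and the `name` field are my assumption about the form layout, so check the Visual Recognition documentation before relying on them, and the photo bytes are placeholders.

```python
import io
import zipfile


def zip_photos(photos):
    """Pack {filename: image_bytes} into an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for filename, data in photos.items():
            archive.writestr(filename, data)
    return buffer.getvalue()


def build_training_upload(groups, classifier_name="rock-paper-scissors"):
    """Build the parts of one multipart POST: a name plus one zip per class.

    The "<label>_positive_examples" field names are an assumption about the
    service's form layout, not something confirmed by the post.
    """
    data = {"name": classifier_name}
    files = {}
    for label, photos in groups.items():
        field = label + "_positive_examples"
        files[field] = (label + ".zip", zip_photos(photos), "application/zip")
    return data, files


# Placeholder bytes stand in for real JPEGs captured from the webcam.
groups = {
    "rock": {"rock-1.jpg": b"fake-jpeg-1"},
    "paper": {"paper-1.jpg": b"fake-jpeg-2"},
    "scissors": {"scissors-1.jpg": b"fake-jpeg-3"},
}
data, files = build_training_upload(groups)
print(sorted(files))
```

The `data` and `files` values are shaped so they could be handed to an HTTP client in one call, for example `requests.post(url, data=data, files=files)` if you use the third-party `requests` library. The URL and API key are deliberately left out here.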


That said, I should still share the code.

It was hacked together in an evening, so it’s a complete mess. But I’ll try and find some spare time this week to tidy it up, and put the code somewhere in case it’s helpful. Some of the code for driving the webcam, zipping up the photos, and uploading them, might be useful to someone.


The first test

To test it, I made a simple game. I’ve got it keeping score to see how many moves you win against the random choices the game makes. Although Grace seemed more interested in counting the number of times the game correctly identified our moves. It wasn’t perfect!
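Those are two different tallies, and a short Python sketch makes the difference clear: one counts wins against the random moves, the other counts how often the classifier recognised the sign that was really shown. This is an illustration only. The `tally` function and the sample rounds are invented for this example, not results from our games.

```python
from collections import Counter

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def tally(rounds):
    """Summarise rounds of (actual_sign, recognised_sign, computer_move)."""
    score = Counter()
    for actual, recognised, computer in rounds:
        # Grace's count: did the classifier see the sign that was really shown?
        score["recognised" if actual == recognised else "misread"] += 1
        # The game's count: it can only judge the move it thinks it saw.
        if recognised == computer:
            score["draws"] += 1
        elif BEATS[recognised] == computer:
            score["wins"] += 1
        else:
            score["losses"] += 1
    return score


# Made-up rounds, purely to exercise the function.
rounds = [
    ("rock", "rock", "scissors"),          # recognised correctly, player wins
    ("paper", "rock", "paper"),            # misread as rock, so scored as a loss
    ("scissors", "scissors", "scissors"),  # recognised correctly, draw
]
print(dict(tally(rounds)))
```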

(The “Watson’s move” images used in the game bit were made by Faith. She took photos of her own hand using an iPad, and did the weird grid background effect using PopAGraph. It wasn’t quite what I expected, but I think it looks kinda neat.)


As with the Guess Who activity, what most interested me was the girls’ reactions to the behaviour of the ML system. I was surprised that even after last time, their initial assumption was still that they could take one photo of each, and that would be enough. (Another reason why it’s worth doing a few of these activities to reinforce it).

But with a little nudging, they quickly saw that the more examples they gave, the better the system performed.

Other than that, a lot of the lessons they learned were reminders of what we talked about before about what it’s like doing supervised learning projects.


The second attempt… another test after more training this time

One thing I was impressed with was when they thought that the training would be more effective if they used an actual rock, actual scissors, and a sheet of paper. Doing that made the recognition far more accurate. That makes sense, as those objects are much more distinct from each other than three photos of hands are, so that was a neat idea!

And that’s pretty much it. My second experiment at getting the kids to play with machine learning seemed to work pretty well.

I’ve put the app I made for them up on Bluemix so you’re welcome to give it a try if you like. It’s at https://watson-rock-paper-scissors.eu-gb.mybluemix.net.

You’ll have to get your own Visual Recognition API key to use it, but the service offers a free trial, so hopefully that won’t put you off!


Third time lucky? Another test


Update 1:

A couple of friends pointed me at another recent IBM rock, paper, scissors project when I mentioned planning to do this.

It looks like a great project and is well worth a look – they’ve gone into using Apache Spark to create a system able to look for patterns in how people play rock-paper-scissors and learn strategies to win the game.

More recently, they’ve even gotten a NAO robot to play the game for them!

I wasn’t aware of this work before, but decided to go ahead with our project anyway. Partly because my focus was different for this, but mostly because I think that the Agonies of Parallel Creation should never stop us from creating and sharing stuff. There’s no new idea under the sun, so if we wait for an idea that no-one has ever got close to, we’d stop creating anything.

That said, it’s a particularly surreal coincidence in this case, as this is not only an IBM project, but the author of that post is the great David Taieb who I used to work for when he worked on Watson! Small world.


Update 2:

Yes, I know you can see my API key in the video. But don’t worry. Bluemix makes it easy enough to revoke credentials so I’ve deleted that API key. Doing that was much quicker than learning enough video editing to be able to mask it out. 🙂



An introduction to machine learning with Guess Who

I tried introducing my two kids to machine learning by helping them make a game this week.

In this post, I’ll try and explain why, how we did it, and how it went. And if you make it all the way to the end, I’ve got some videos and a link to a demo to show you what we made.

Why

I think we need to introduce the basic concept of machine learning to children.

I think the current approach to introducing coding using things like Scratch isn’t enough. This isn’t to say Scratch isn’t great (I’ve been running a Code Club every week for the last couple of years, delivered almost entirely using Scratch, so I’d be the last person to say it isn’t a fantastic tool). It lets you snap together blocks representing actions, which teaches the programming mindset of getting a computer to do something by breaking the task down into a series of steps.

I think we need to add to this with something that introduces the model of machine learning – getting a computer to do something by training it with examples of doing that task.

I’ve been saying this for a while – I gave a talk about it at an education conference last year, I’ve written about it here before, and it was the theme of a lecture I gave at a science society in London last month.

This week is half-term and I have the week off work, so I thought I’d finally spend a bit of time trying it out by experimenting on my own two daughters (Faith and Grace, who are aged 7 and 11).

In Code Club, I mostly try to introduce programming concepts by helping the kids to create games. Sticking with what seems to work, I’ve helped them to make a game by training an ML system how to play it.

How

We’ve been making a sort of Guess Who? game – a game that I’ve played with the kids for years and that they know inside out.


The idea is that the computer shows a bunch of faces and chooses one of them at random. You have to work out which one it’s chosen by asking Yes/No questions about their appearance.

I introduced this to Grace and Faith mostly using analogies and anthropomorphising.

“We need to teach the computer how to play the game instead of telling it how to”.

“We’ll do it the same way that you teach a small child to do something, not by telling it the series of steps to take but by showing it examples of how it should be done, over and over, until it learns how to do it”.

And so on.

I went with a couple of opportunities for trying out machine learning approaches.

Understanding the question

As a player, we let you ask questions in natural language. You’ve just got a text box to type in anything you want to ask. In practice though, there’s a reasonably predictable set of questions you might ask – it’s a very constrained domain. So it lends itself well to using a text classifier.

I got the kids to make a list of all the types of questions they ask in a Guess Who game. These were the classes we’d train the system with (e.g. “has brown hair” or “is wearing a hat”). Then for each class, they had to come up with as many ways as they could think of to ask that question.
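To make the shape of that training data concrete, here is a small Python sketch that turns classes and their example phrasings into CSV rows of text followed by class. The kids typed theirs into the web interface, so the CSV step is my assumption about how you might store the same data, and apart from the two class names mentioned above, the example questions are made up.

```python
import csv
import io

# Hypothetical examples; only the two class names come from the kids' list.
training = {
    "has brown hair": [
        "Do they have brown hair?",
        "Is their hair brown?",
        "Is the person brown-haired?",
    ],
    "is wearing a hat": [
        "Are they wearing a hat?",
        "Do they have a hat on?",
        "Is there a hat on their head?",
    ],
}


def to_csv(examples_by_class):
    """Flatten {class: [texts]} into CSV rows of text,class."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for label, texts in examples_by_class.items():
        for text in texts:
            writer.writerow([text, label])
    return out.getvalue()


print(to_csv(training))
```

Each row pairs one way of asking the question with the class it belongs to, so adding a new phrasing is just adding a new row.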

the kids using Watson NLC

We used this to train IBM Watson Natural Language Classifier.

I chose it mostly because I’m familiar with it. It’s a hosted, readily available service, so there was nothing to install or configure; it has a simple web interface for training that the kids could use to enter the classes and texts they thought of; and for this sort of kicking-the-tyres-with-minimal-usage it’s free.

With a little training, they had a trained classifier that would be able to take a question from the player and map it to one of the classes they’d come up with for the questions they ask when they play Guess Who.

Answering the question

The next step was to train the system to be able to answer the questions. We needed it to be able to recognise visual characteristics in a photo. Again, this is a constrained domain, although less so than for classifying questions. But still, this was basically a classifying task, albeit with the extra challenge of using images.

We looked for a collection of face images that we could use, both for training, and for the game itself. We initially started trying out Labeled Wikipedia Faces, but ultimately settled on using Labeled Faces in the Wild – a set of over 13,000 photos of faces.

Grace, preparing the training for Visual Recognition

I got the kids to group examples of the faces based on the characteristics they’d come up with in their list of text classes. We just used the built-in OS X Finder to do this.

One folder open on the left with all the unsorted faces in, and a bunch of folders open to the right – e.g. “short-hair”, “long-hair”, “bald”, “hat”. They scrolled through the unsorted pictures, and trained by dragging the photos across into the relevant folder.
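The nice thing about sorting into folders is that the folder name becomes the label. As a sketch of that idea (not something we actually ran), this Python snippet walks a directory and maps each sub-folder name to the image files inside it. The function name `collect_training_sets` and the demo file names are invented for illustration.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_training_sets(root):
    """Map each sub-folder name (the label) to the sorted image names inside it."""
    sets = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            sets[folder.name] = sorted(
                p.name for p in folder.iterdir()
                if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return sets


# Demo with a throwaway directory; the file names are invented.
with tempfile.TemporaryDirectory() as root:
    for label, name in [("bald", "face-001.jpg"), ("hat", "face-002.jpg"),
                        ("hat", "face-003.jpg")]:
        folder = Path(root) / label
        folder.mkdir(exist_ok=True)
        (folder / name).write_bytes(b"")
    training_sets = collect_training_sets(root)

print(training_sets)
```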

We used an IBM Watson API for this, too: IBM Watson Visual Recognition.

Again, I chose it because I’m familiar with it, and it’s a hosted service I didn’t need to install or run, and while it’s in beta it’s all free to use.

Putting it all together

While they did that, I hacked together a quick game to drive it. They were in charge of training, I wrote a simple REST API to make the requests to the Watson APIs, and a UI to let the player interact with it.
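The flow of the game can be sketched in a few lines of Python: map the player's question to a class, then check whether the secret face has that characteristic. This is a simplified stand-in, not the game's real code. The `classify_question` function here is a crude keyword match in place of the trained text classifier, and the hand-written `FACES` labels stand in for what the image classifier would report.

```python
import random

# Hypothetical data: in the real game these labels would come from the
# trained image classifier rather than being written by hand.
FACES = {
    "face-001": {"has brown hair", "is wearing a hat"},
    "face-002": {"has brown hair"},
    "face-003": {"is wearing a hat"},
}


def classify_question(text):
    """Stand-in for the text classifier: a crude keyword match, not ML."""
    lowered = text.lower()
    if "hat" in lowered:
        return "is wearing a hat"
    if "brown" in lowered:
        return "has brown hair"
    return None


def answer(question, secret_face):
    """Map the question to a class, then check the secret face for it."""
    label = classify_question(question)
    if label is None:
        return "Sorry, I don't understand the question"
    return "Yes" if label in FACES[secret_face] else "No"


# The computer picks its secret face at random, as in the game.
secret = random.choice(sorted(FACES))
print(answer("Do they have a hat on?", secret))
```

The keyword match breaks easily (any question containing “what” would match “hat”), which is exactly the sort of brittleness that training a classifier with lots of example phrasings is meant to avoid.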

We had our game!

How it went

I asked them what they’d learned from doing it, and they came up with things like:

“You need a lot of training data”

When I explained how we’d train the visual recognition classifiers to recognise characteristics in photos, Faith said “We could take photos of each other and use those!”. I said that we’d need more than that, so she said “I could take photos of my friends as well!”.

But we needed so many more than that. We started by looking to Wikipedia (always a good starting point for getting data!), before eventually settling on the LFW set.

Getting them to help me find the training data gave them an idea of the sort of scale you need for these sorts of projects, as well as giving them an insight into ideas like gathering existing data rather than trying to manually create it.

“The more training you give it, the better it gets”

We did this project off-and-on throughout half-term week rather than all in one sitting. And we re-ran the training after each go. It was really clear the way that the results improved the more that we did.

I’m a big fan of learning by doing, and I think that seeing that for themselves was more effective than if I’d just told them. Trying it out with virtually no training, and seeing that it’s rubbish. Then trying it out with a bit of training, and seeing it start to get better. Then trying it again after a few days and seeing it really improve – there was a definite light bulb moment.

“Training is fun at first, but can get a bit boring after a while”

This was more an issue with the visual recognition training than with the NLC text classifier training. Even after the first afternoon they’d already come up with a classifier that was doing a decent job of most questions. But grouping the photos into the training sets took much longer. And to be honest, it really needs more training than they did if you want to turn this into a proper demo or game – I let them do enough to get the idea, but stopped them before it became a miserable chore!

But this was a positive thing. As a takeaway, that’s a really valid and valuable insight. I’ve seen commercial ML projects fail to recognise this at the outset! Training ML systems is a repetitive, manual and very time-consuming task that does get boring for the people involved.

We talked about ways that people try and get around that, like trying to turn it into a game. Grace thought that if we did this as a school activity, with all her class helping out with the training, then split between 30 kids, it wouldn’t take as long.

All of these points – the amount of time and work needed to train, approaches like getting a large number of people to train, etc. – these are all great lessons about the practicalities of using machine learning.

“It’s a bit like magic”

If you had to describe the steps involved in working out if a photo included a face that has a moustache, that would be pretty complicated. But dividing 100 photos into two groups – faces with a moustache, and faces without a moustache – that’s easy.

There were a bunch of things they talked about here that were really positive – that doing this is easy, and that you don’t need to understand the deep detail behind how it works. With Faith (my youngest) we didn’t go into too much detail about how the training works, but Grace and I did find some videos on YouTube that explained the principles behind deep learning which were interesting.

More generally, what I wanted them to learn was that computers can spot patterns in data, without you needing to be able to specify rules that they should follow. And I think that’s a more effective lesson as something they saw and did for themselves rather than just reading or hearing about it.

The result


https://youtu.be/oguzXlRT4NQ

I’ve put what we came up with on Bluemix at http://guesswho.eu-gb.mybluemix.net

I’ve also put some more videos below to show us trying it out.

As I explained above, it’s not really “finished” – I think it needs more training, and the UI was hacked together in a hurry. But even as a quick half-term project, I think it already shows a glimmer of what is possible.

More importantly, it shows that you can give kids a hands-on introduction to machine learning.


https://youtu.be/1AkSVBzE5io


https://youtu.be/6ay7MqhW53g


https://youtu.be/Z3UReJzzY4Q


https://youtu.be/GNcz_D62_0w