Cracking the AreYouAHuman Captcha

I’ve previously written about the pros and cons of the AreYouAHuman captcha system. For those who don’t know, its plan is to rid the world of those terribly hard-to-read text-based captchas and instead have users play a short game to prove they’re not a bot.

I was able to crack the AreYouAHuman captcha games (“PlayThru”) using Python and SimpleCV (an open source Python framework that wraps several powerful open source computer vision libraries).

In the video below, you’ll see the progress of the software so far. At the moment it only knows how to play one type of game, although modifying it to play the others will require only minor work.

As you’ll see in the video, the cracker bot isn’t perfect yet: it will sometimes try the same object several times, even though it’s incorrect. This is something that will be improved. Overall, it performs pretty well and can crack most of the games within just a few seconds. The game actually runs pretty slowly, and I’ve had to add a few delays so that it can catch up and then realise it’s been beaten. Apparently they have put code in place that tries to tell bots from humans, which might also contribute to the delays that are required.
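The repeated-attempt problem described above comes down to remembering which objects have already been rejected. Here is a minimal sketch of that bookkeeping, not the bot’s actual code. The `try_object` callback is a hypothetical stand-in for “drag this object onto the target and report whether the game accepted it”, and the object names are whatever labels the vision step produces.

```python
def next_candidate(candidates, excluded):
    """Return the first candidate that has not been tried yet, or None."""
    for name in candidates:
        if name not in excluded:
            return name
    return None


def play(candidates, try_object):
    """Try every candidate at most once and return the accepted ones.

    try_object is a hypothetical callback: it performs the drag and
    returns True if the game accepted the object, False otherwise.
    """
    failed = set()   # objects the game rejected
    accepted = []    # objects the game accepted, in order
    while True:
        name = next_candidate(candidates, failed | set(accepted))
        if name is None:
            return accepted
        if try_object(name):
            accepted.append(name)
        else:
            failed.add(name)
```

With this in place, a rejected object such as the shoe is tried once and then skipped for the rest of the game.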

 

Update: Waiter, no shoes with my pancakes!

In my first demo video, the bot was quite insistent on adding the shoe to the pancakes, which was obviously a problem. Another thing, which was pointed out by the CEO of AreYouAHuman: simply beating the game doesn’t mean you’re going to pass the test. Upon beating the game you need to click another button, which then tells you whether you were classified as a human or a bot.

I was ready to move on from this project but wanted to at least test whether or not my software was being flagged as a human or bot. I tested my original code and sure enough I was flagged as a bot. Doh! I couldn’t walk away knowing that.

Rather than playing a few games reasonably well, I decided to focus on playing the pancake game really well. After all, you can choose which game you’re going to play by simply refreshing the captcha, so cracking just one game is enough to get through. My bot now plays the pancake game without any mistakes and passes as a human.

 

 

Why did the bot fail the first two tests?

I had preloaded the pancake game into a few tabs, about 30 minutes before recording the video. It seems that there is a time limit in which you must complete the game. When fresh games were loaded, the bot cracked them without issue.

But you’ve only cracked one game, they have lots of games..!

In its current state, the bot can only crack one game. It does this perfectly, time and time again. With minor modifications it can be taught to play the other games. However, there’s no strict need to teach the bot how to play every game, because there’s a refresh button within the captcha iframe which lets you select a different game. The bot simply needs to keep hitting that button until the game it knows how to play comes up.
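The “keep refreshing” idea is just a bounded loop. The sketch below is an illustration under stated assumptions rather than the bot’s real code: `identify_game` and `refresh` are hypothetical callbacks (one recognises the game currently on screen, the other presses the refresh button), and `max_refreshes` is an arbitrary safety cap I have added so the loop cannot spin forever.

```python
def wait_for_known_game(identify_game, refresh, known_game, max_refreshes=50):
    """Refresh until the known game appears.

    Returns the number of refreshes that were needed, or None if the
    known game did not come up within max_refreshes attempts.
    Both callbacks are hypothetical placeholders.
    """
    refreshes = 0
    while identify_game() != known_game:
        if refreshes >= max_refreshes:
            return None
        refresh()
        refreshes += 1
    return refreshes
```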

Why is the mouse movement so odd?

Winning the game is not enough to pass the AreYouAHuman captcha test. It also takes into account timing and how natural your mouse movement seems. I added some delays and mouse movements that try to mimic a human player.
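To give a rough idea of what “mimicking a human” can mean, here is a small sketch that turns a straight drag into a slightly curved, slightly shaky path with uneven timing. This is an assumption about one reasonable approach, not a description of what my bot or the captcha actually does. The curve is a quadratic Bézier, and every parameter (`steps`, `wobble`, `jitter`, the delay range) is a made-up illustrative value.

```python
import math
import random


def human_like_path(start, end, steps=40, wobble=0.2, jitter=1.5, rng=None):
    """Return a list of (x, y, delay_seconds) points from start to end.

    All parameter defaults are illustrative guesses, not measured values.
    """
    rng = rng or random.Random()
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0

    # Control point: the midpoint pushed sideways, perpendicular to the
    # straight line, so the path bows to one side.
    offset = rng.uniform(-wobble, wobble)
    cx = (x0 + x1) / 2 - dy * offset
    cy = (y0 + y1) / 2 + dx * offset

    path = []
    for i in range(steps + 1):
        u = i / steps
        # Ease in and out: slow at both ends, faster in the middle.
        t = (1 - math.cos(math.pi * u)) / 2
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        if 0 < i < steps:
            # Small random shake on the intermediate points only, so the
            # path still starts and ends exactly where it should.
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        delay = rng.uniform(0.005, 0.02)  # pause before the next point
        path.append((x, y, delay))
    return path
```

Passing a seeded `random.Random` as `rng` makes the path reproducible, which is handy when debugging.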

What about sniffing the packets?

Ordinarily, the first thing I’d do when trying to crack a captcha is sniff the packets to see what’s going back and forth to the server. Sometimes it’s just a matter of playing with those packets to get passed as a human.

In this case, I didn’t look at the packets because I wanted to beat the captcha the same way a human would, by playing and passing the game.

However, for those that are curious, here’s a quick breakdown of what’s going on in the background:

  • When the game starts, the client (our browser) tells the server (the AreYouAHuman web service) to log our start time, our IP, session details and game ID.
  • When a correct item is dragged to the correct place, the client tells the server that an item has been placed correctly and passes details of the mouse movement.
  • When the game has been solved, the client tells the server to log the end time, our IP, session details and game ID.
  • A final POST is then made to the server with our game session details, at which point the server looks at the start time, end time, system details (operating system, browser, etc.) and observations like object placements and mouse movements. It then gives a result of whether it thinks you’re a human or bot.
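The four steps above can be pictured as a small session log that is built up during play and summarised at the end. The sketch below only models that sequence in memory. Every field and method name is hypothetical; I haven’t reproduced the real message format, and nothing here talks to a server.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GameSession:
    """In-memory model of the events listed above (all names hypothetical)."""
    game_id: str
    ip: str
    start_time: Optional[float] = None
    end_time: Optional[float] = None
    placements: List[dict] = field(default_factory=list)

    def start(self, now: float) -> None:
        # Step 1: the game starts and the start time is logged.
        self.start_time = now

    def place(self, item: str, mouse_path: List[Tuple[float, float]]) -> None:
        # Step 2: a correct placement is reported with its mouse movement.
        self.placements.append({"item": item, "mouse_path": list(mouse_path)})

    def finish(self, now: float) -> None:
        # Step 3: the game is solved and the end time is logged.
        self.end_time = now

    def summary(self) -> dict:
        # Step 4: the details a final request would carry for scoring.
        duration = None
        if self.start_time is not None and self.end_time is not None:
            duration = self.end_time - self.start_time
        return {
            "game_id": self.game_id,
            "ip": self.ip,
            "duration": duration,
            "placements": self.placements,
        }
```

Laid out like this, it is easy to see what the server has to work with: how long the game took, which items were placed, and how the mouse moved for each one.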

As a side note, it seems that far too much information is being stored in a cookie that should really be held server side with the session data.



12 Responses to “Cracking the AreYouAHuman Captcha”

  1. Justin Case says:

    It’s ok to make it for fun, as long as you don’t release it publicly or sell it.
    If you did I would have no respect for you for helping spammers.
    Just sayin’.

    • admin says:

      Justin, I don’t really have any intention of releasing the software. It’s just a proof of concept kinda thing. I’ve been wanting to play with computer vision for a while but couldn’t think of an interesting project. I spotted AreYouAHuman on a news website and figured it would be a great first project!

  2. Tyler says:

    Thanks for taking an interest in our CAPTCHA Alternative. I’m one of the founders of Are You a Human. You’re testing with our demo game, not the real thing, and the end screen you’re showing isn’t an indicator of success or failure. You’d have to submit to our webservice and request a score, and most of these would have failed.

    We are always looking to improve our security. If you’d like to collaborate, reach out and let me know.

    We’re trying to make CAPTCHA usable again, it’s gotten out of hand.

    • admin says:

      Hi Tyler, thanks for your input and for pointing out my mistake. I hadn’t realised that there was another step to the process. I tested and indeed saw that simply completing the game is not enough. I’ve since improved the bot and it now passes as a human on the final check.

  3. RoadieRich says:

    It’ll be interesting to see how fast you can get it: I don’t really think you can count a captcha as broken until your code is faster than the Mk 1 eyeball.

    Even if it can only slow bots down, it’ll still have some effect on the amount of spam out there.

  4. Arthur Dent says:

    The obvious problem with these captchas is that if you get a part wrong, they don’t make you start over. If you had six things, and you had to put three of them on the pancake, but putting the shoe on in between reset the captcha to a different one, this would take much much longer to crack. Honestly though, these captchas seem silly.

  5. laton says:

    Can this kind of UI manipulation with python be done in linux distributions?

  6. bob says:

    This product was outsourced on elance—what did you expect?? Totally insecure and a liability for publishers

    • admin says:

      That’s interesting, I wasn’t aware that they’d outsourced the work, although it’s not clear if they outsourced the whole thing or just parts of it. Their Elance account shows that they’ve spent just under $4000. I’d imagine it would have cost significantly more than that to get the system where it is today.

  7. Xamox says:

    Hey, I’m one of the core SimpleCV developers. Very cool project. I had never thought about using SimpleCV for this type of purpose, but high five for thinking outside of the box with it.

    • admin says:

      Thanks Xamox! I’ll be using it a lot more when you guys get the auto installer working for the latest versions of Mac OS X. I tried following the manual install instructions but gave up after a few hours of getting nowhere!
