Archive for Projects

31 Dec 2011

Breaking Aljazeera’s CAPTCHA

No Comments Projects, Web

I was on Aljazeera Arabic’s website the other day and, as I was voting on a poll, was presented with the following screen:

The CAPTCHA in the screen above immediately caught my attention. The distortions in it seemed very simple: the text was not warped in any way, and there was no overlap between characters.

The following is a URL for one of the CAPTCHAs:

http://www.aljazeera.net/Portal/KServices/Controles/SecureCAPTCHA/GenerateImage.aspx?Code=EANmyyXghpajFhOX6rCRKQ==&Length=4

Opening the URL above and refreshing the page a few times gives the following CAPTCHAs:







The dashed grey lines are randomized on every request, while the letters in the CAPTCHAs above stay the same. The letters are encoded in the Code parameter in the URL. Notice that each character has two forms: a straight one and a slightly rotated one.
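To make the role of the URL parameters concrete, here is a minimal Python sketch that pulls Code and Length out of the URL above. It assumes the Code value is Base64 (it looks like it, but the site does not say so), and it makes no attempt to recover the letters from the decoded bytes, since the post does not describe how they are encoded. The helper name `captcha_params` is made up for this illustration.

```python
import base64
from urllib.parse import urlsplit, parse_qs

CAPTCHA_URL = (
    "http://www.aljazeera.net/Portal/KServices/Controles/SecureCAPTCHA/"
    "GenerateImage.aspx?Code=EANmyyXghpajFhOX6rCRKQ==&Length=4"
)

def captcha_params(url):
    """Return the Code string and the Length value from a CAPTCHA image URL."""
    query = parse_qs(urlsplit(url).query)
    return query["Code"][0], int(query["Length"][0])

code, length = captcha_params(CAPTCHA_URL)

# Assumption: the Code value is Base64. Decoding only reveals its size in bytes,
# not the letters themselves.
raw = base64.b64decode(code)
print(code, length, len(raw))
```

Because the same Code always yields the same letters, refreshing this one URL gives several renderings of the same text with different grey lines.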

Aljazeera’s CAPTCHA can easily be broken by doing the following:

  1. Removing the dashed grey lines
  2. Finding the characters in the image
  3. Separating the characters in the image
  4. Classifying each character

I’ll be using Octave/Matlab for the four tasks listed above and will explain my algorithm using the following CAPTCHA as an example.

Read more

13 Aug 2011

El-Tetris in HTML5. See it in action!

No Comments Projects, Web

Following up on my previous post on the El-Tetris algorithm, a Tetris player that clears 16 million rows per game on average, I thought I would provide an implementation rather than just a description of the algorithm.

The algorithm is implemented entirely in JavaScript, and the rendering is done on an HTML5 canvas. The rendering is purely cosmetic (so you can actually see how the game is progressing). If you’re only interested in the final score, you can speed up the game by enabling “Hardcore Mode”. In that mode, board rendering is disabled and the algorithm runs continuously in the background. You can also change the size of the board; the smaller the board, the shorter the game.

Full source code can be found here.

Note: For faster execution, use Google Chrome.

04 Aug 2011

inFormed – A LinkedIn Hackday Project

No Comments Projects

Last Friday I participated in the LinkedIn Intern Hackday event that was hosted at LinkedIn’s headquarters in Mountain View. I joined my classmates from Waterloo: Michael Truong, Kenneth Ho and Sumit Pasupalak.

We started a project dubbed “inFormed”. Its aim is to raise awareness of global issues. Currently, it’s a Firefox plugin. As you browse the web, it analyzes the content of the page you are viewing and, based on that content, shows a fact or statistic that is both relevant to the page and related to a global issue. Along with that, it provides a link to a charity where you can donate and/or get involved.

For example, if you are buying a book online or browsing an educational site, you would see, at the bottom right-hand corner, something like this:

Have a look at the screenshots below for some more examples. Take a close look at the fact displayed at the bottom right-hand corner and notice how it’s related to the content of the page.

To summarize, the goals behind inFormed are the following:

  • Help you stay informed about global issues.
  • Make it easy to get involved by providing links to related charities.
  • Provide a seamless and unintrusive user experience.

Behind the scenes, inFormed sends the URL of your current page to the server, where we fetch the content of that page, extract the text, and run it through a Naive Bayes classifier to select the fact or statistic most likely to be relevant to that page. The result is then fed back to the browser.
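For readers unfamiliar with Naive Bayes, here is a minimal Python sketch of the idea: count how often each word appears in example texts for each category, then score a new page by multiplying the (smoothed) word probabilities, which is done below by adding logarithms. This is not the inFormed server code. The class name, the two categories and the training snippets are invented for illustration, and in a setup like ours each category would map to a pool of facts to display.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training texts
        self.vocabulary = set()

    def train(self, category, text):
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best

# Hypothetical training snippets, standing in for labelled page text.
classifier = NaiveBayes()
classifier.train("education", "school books students reading literacy teachers")
classifier.train("education", "university textbook classroom learning books")
classifier.train("health", "hospital vaccine doctors disease medicine")
classifier.train("health", "clean water malaria clinic medicine nurses")

page_text = "buy cheap books and textbook for students"
print(classifier.classify(page_text))
```

A bookstore page like the one in the example lands in the "education" category because words such as "books" and "students" appear only in that category’s training text.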

This event was the first hackathon we had ever participated in, and we are proud to have made it to the final round! We didn’t win the event, but we were extremely impressed by the quality of the projects that people presented.

We had some votes on Twitter as well:

 

inFormed will need a little more work to be ready to publish. Should we invest the time in doing so? Would you use it? Let us know!