IBM Watson vs. Google Search

Today marked another step towards Judgement Day as a computer, built and programmed by IBM, competed against the two best human Jeopardy players of all time, Ken Jennings and Brad Rutter.

After watching the episode, I wondered whether Google Search would be competitive in this forum. So I tried typing a few of the answers, verbatim, into Google Search to see what it returned as the first result. (Watson receives the answers as a text file that it must parse, so this seemed equivalent.) The first one I tried was “bang bang” his “silver hammer came down upon her head”. The first result returned (in 0.32 seconds) was:

The Beatles Lyrics – Maxwell’s Silver Hammer
… on the door Bang, bang Maxwell’s silver hammer came down upon her head bang, … as the words are leaving his lips A noise comes from behind Bang, Bang …

Granted, to compete on Jeopardy, the contestant must phrase his/her response as a question, which Google Search doesn’t do, but clearly it would be trivial to provide “What is Maxwell’s Silver Hammer” just as Watson did. The wondernet is abuzz with comments about how the answer should really be “Who is Maxwell”, but Jeopardy did not ding Watson.

The next question I tried was: his victims include charity burbage, mad eye moody & severus snape; he’d be easier to catch if you’d just name him! Watson did not answer this question, as none of its candidate responses crossed the “buzz threshold” (Harry Potter 37%, Voldemort 20%, Albus Dumbledore 8%). Judging by search results 2-10, Google Search would not have been able to answer this question either. Amusingly, the first result in Google Search was someone’s blog post about how Watson could not answer this question and Brad responded correctly with “Who is Voldemort?”
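The buzz decision described above can be sketched in a few lines of Python. This is only an illustration: the 50% cutoff and the name `should_buzz` are my assumptions, since the confidence numbers are known here but not the exact threshold or how it may move during a game.

```python
from typing import Optional

# Hypothetical cutoff; the real threshold is not stated in this post.
BUZZ_THRESHOLD = 0.50


def should_buzz(candidates: dict[str, float],
                threshold: float = BUZZ_THRESHOLD) -> Optional[str]:
    """Return the top candidate if its confidence reaches the threshold, else None."""
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else None


# Confidence values reported for the Voldemort clue.
voldemort_clue = {"Harry Potter": 0.37, "Voldemort": 0.20, "Albus Dumbledore": 0.08}
print(should_buzz(voldemort_clue))  # None: no candidate reaches the assumed cutoff
```

Note that the top candidate here was wrong anyway, so staying silent was the right call under any cutoff above 37%.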

Next up was: a piece of wood from a tree, or to puncture with something pointed. Watson correctly answered “what is a stick?”. The Google Search results are a bit odd, as results 2-8 are all definitions of the word stick, but result 1 is:

stuck (pierce, penetrate, puncture) – Memidex dictionary/thesaurus
Jan 20, 2011 … Definition: to pierce, penetrate, or puncture with something pointed … A piece of wood, such as a tree branch, that is used for fuel, . …

Clearly, any blending algorithm would likely have overridden stuck with stick, and even if it didn’t, I imagine the past tense of stick would have passed as an answer.
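One naive way to picture such blending is a majority vote over the headwords of the top results. The sketch below uses the ranking described above (result 1 is "stuck", results 2-8 are "stick"). The function name `blend_by_vote` and the idea of voting on headwords are my own illustration, not how Google Search or Watson actually combine evidence.

```python
from collections import Counter


def blend_by_vote(headwords: list[str]) -> str:
    """Return the most common headword among the ranked results.

    Ties go to the headword that appears first (i.e. ranks higher),
    because Counter.most_common keeps first-seen order for equal counts.
    """
    if not headwords:
        raise ValueError("need at least one result to blend")
    return Counter(headwords).most_common(1)[0][0]


# Result 1 is "stuck"; results 2-8 are all definitions of "stick".
top_results = ["stuck"] + ["stick"] * 7
print(blend_by_vote(top_results))  # stick
```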

Both Watson and Google Search had trouble with the next question: from the latin for “end”, this is where trains can also originate. Watson replied, “what is finis?”. Google Search replies with…buttocks 🙂 It’s worth noting that Watson has additional information, such as the category (which I have not been providing to Google Search) and the dollar value of the question, which is some indication of difficulty (though in this case prepending the category to the Google Search query doesn’t help).

Back to great success, as both Watson and Google Search easily come up with the question for: “so i sing a song of love” this woman, also the name of john’s mother. The first Google Result:

Here’s an example of a clear Watson win and Google Search fail: a 1976 entrant in the “modern” this was kicked out for wiring his epee to score points without touching his foe. The correct response is, “what is the pentathlon?” The Google Search results are miserable, and no shortening of the query brought up any relevant results on page one. Of course, if I “translated” the answer to “1976 cheating epee” or even “1976 cheating olympics” (the category was olympic oddities), Google Search finds the correct answer in the first result.

To be fair, here is an example of a clear Google Search win and Watson fail: klaus barbie is sentenced to life in prison & dna is first used to convict a criminal. Watson was completely stumped (2002 11%, 1987 7%, Lyon 3%) whereas Google Search’s first two results are:

What are some famous things about the 1980’s? – Yahoo! Answers
May 31, 2009 … DNA First Used to Convict Criminals Klaus Barbie, the Nazi Butcher of Lyons, Sentenced to Life in Prison New York Stock Exchange Suffers …

1980s Timeline – History Timeline of the 1980s
DNA First Used to Convict Criminals; Klaus Barbie, the Nazi Butcher of Lyons, Sentenced to Life in Prison; New York Stock Exchange Suffers Huge Drop on …

What’s impressive (or lucky) is that I didn’t tell Google Search the category was “Name the Decade”, which is information Watson had, yet it still couldn’t come up with anything close. It’s also worth noting that Ken Jennings began altering his strategy at this point. He had been doing poorly until then and realized he needed to buzz in before actually knowing the answer, since he would have several more seconds to come up with it while the question was being read and another second or two once called upon. He did so successfully in this case and on a few other questions before the round was over.

Full disclosure: I currently work for Google…in search infrastructure/quality, so the next statement may seem biased, but in all honesty, what is so impressive about what IBM has done? They’ve basically programmed a Jeopardy response mode on top of Google Search. Note it’s only the response portion, since Google Search can already perform all the parsing and “understanding” of the answer provided. If you argue Watson doesn’t have access to as much information as is indexed by Google Search, you’re correct, but it still has hundreds of terabytes, which is more than enough to contain all the non-long-tail information out there. After all, Jeopardy isn’t going to provide answers to which the question is about your pet hamster.

If you argue Watson doesn’t have as much computing power as Google Search, again you’d be correct, but Google Search is responding to thousands of times more queries every second, in many more languages. So after day 1 I’m unimpressed: hasn’t Google been doing for years, on a larger scale, what Watson is just now doing on TV? In case you missed it, here are some YouTube videos of the match:

