{"id":596,"date":"2012-01-09T17:13:42","date_gmt":"2012-01-10T01:13:42","guid":{"rendered":"http:\/\/www.madpickles.org\/rokjoo\/?p=596"},"modified":"2012-01-10T17:23:44","modified_gmt":"2012-01-11T01:23:44","slug":"ai-class","status":"publish","type":"post","link":"https:\/\/www.madpickles.org\/rokjoo\/2012\/01\/09\/ai-class\/","title":{"rendered":"AI Class"},"content":{"rendered":"<p>Towards the end of last year I signed up for the &#8220;basic&#8221; track of the Stanford Engineering <a href=\"http:\/\/www.ai-class.com\/\" target=\"_blank\">Introduction To Artificial Intelligence<\/a> class, mainly out of curiosity, but also because the instructors (<a href=\"http:\/\/www.stanford.edu\/~thrun\/\" target=\"_blank\">Sebastian Thrun<\/a> and <a href=\"http:\/\/www.norvig.com\/\" target=\"_blank\">Peter Norvig<\/a>) and I share the same employer. I intended to follow along at a leisurely pace, unsure if I would be able to commit the time necessary to watch all the lectures and go through the homework problems. However, during the first week I found the homework assignments unavailable to &#8220;basic&#8221; track participants until after they were due (this was later changed) so I switched to the &#8220;advanced&#8221; track and made a concerned effort to adhere to the class schedule.<\/p>\n<p>It&#8217;s similar in style to the <a href=\"http:\/\/www.khanacademy.org\/\" target=\"_blank\">Khan Academy<\/a>, and it definitely took my understanding of <a href=\"http:\/\/en.wikipedia.org\/wiki\/Bayes'_rule\" target=\"_blank\" rel=\"nofollow\">Bayes&#8217; Rule<\/a>, <a href=\"http:\/\/en.wikipedia.org\/wiki\/Hidden_Markov_model\" target=\"_blank\" rel=\"nofollow\">Hidden Markov Models<\/a>, etc. to another level. Sure there are tons of books and tutorials out there, but I feel there&#8217;s a limit to what one can learn without practice. One form of practice is to create a toy programming project, the main downside of which is never knowing if you&#8217;re doing it &#8220;right&#8221;. Contests ala <a href=\"http:\/\/www.netflixprize.com\/\" target=\"_blank\" rel=\"none\">Netflix Prize<\/a> are another which at least provides better\/worse feedback. The main benefit of online courses though is one can practice solving problems and know whether or not an answer is correct.<\/p>\n<p>I didn&#8217;t participate in any of the discussions, though I did find the group website useful for clarification on a few of the problem statements. Even so, there were still a couple of questions throughout the homework and exams which I misinterpreted, leading to an incorrect answer; however, the largest cause of incorrect answers seemed to be my inability to punch numbers into a calculator correctly&#8230;I guess I haven&#8217;t gotten any better at that since I was last in school :S<\/p>\n<p>An unexpected realization was how much I value flexibility in scheduling. There were definitely a few weeks I had to scramble in order to make it through the videos and homework before the deadline. Funny how the addition of one more deadline per week caused me so much grief. I was interested in the material, but some of it took quite a bit of time to get through (especially once I slowed down after rushing through a couple of homework assignments and doing poorly).<\/p>\n<p>While one could certainly write programs to solve some of the quiz\/homework\/exam problems, all of them were solvable with pen and paper. The lack of problems requiring writing code was one of the top complaints. They held an AI competition; I didn&#8217;t participate as I feared that would take up too much of my time. However, towards the end of the class, the instructors provided an optional quiz which required writing some code: <a href=\"https:\/\/www.ai-class.com\/course\/video\/quizquestion\/296\" target=\"_blank\" rel=\"none\">Optional Natural Language Processing Problems<\/a>. This seemed like a much less time consuming set of questions and I would also get instant feedback.<\/p>\n<p>The first problem was easily solved with brute force. My basic idea was to try every possible rotation and see which one yielded the most words which appear in an English dictionary.<\/p>\n<pre>\r\n#!\/usr\/bin\/python\r\n\r\nimport sys\r\nimport string\r\n\r\n# Problem text, all lowercase, no punctuation.\r\nSENTENCE = 'esp qtcde nzyqpcpynp zy esp ezatn zq lcetqtntlw tyepwwtrpynp hld spwo le olcexzfes nzwwprp ty estd jplc'\r\n\r\n# Parse the system's English dictionary into a set.\r\ndef WordsToSet():\r\n  word_set = set()\r\n  for w in open('\/usr\/share\/dict\/words'):\r\n    word_set.add(w.strip().lower())\r\n  return word_set\r\n\r\ndef p1(word_set):\r\n  # Setup character position lookup table.\r\n  chars = string.ascii_lowercase\r\n  num_chars = len(chars)\r\n  letter_positions = dict([(c, i) for i, c in enumerate(chars)])\r\n\r\n  # Track the rotation that yields the most real words.\r\n  max_word_count = 0 \r\n  best_i = 1 \r\n  sentences = {}\r\n\r\n  # Try each rotation.\r\n  for i in xrange(1, 25):\r\n    temp_word_count = 0 \r\n    decoded = []\r\n\r\n    # Build up a decoded sentence.\r\n    for char in SENTENCE:\r\n      if char == ' ':\r\n        decoded.append(char)\r\n      else:\r\n        decoded.append(chars[(letter_positions[char.lower()] - i) % num_chars])\r\n    sentences[i] = ''.join(decoded)\r\n\r\n    # See how many words in the sentence exist in the word set.\r\n    for word in sentences[i].split(' '):\r\n      if word in word_set:\r\n        temp_word_count += 1\r\n    if temp_word_count > max_word_count:\r\n      best_i = i \r\n      max_word_count = temp_word_count\r\n\r\n  # Output our best guess.\r\n  print best_i, sentences[best_i]\r\n\r\ndef main(argv):\r\n  word_set = WordsToSet()\r\n  p1(word_set)\r\n\r\nif __name__ == '__main__':\r\n  main(sys.argv)\r\n<\/pre>\n<p>The second problem proved to be a bit more difficult. It couldn&#8217;t be brute forced since there were 18 &#8220;strips&#8221; of characters and thus 18! possible combinations (6,402,373,705,728,000 so if one could generate 1 every millisecond, it&#8217;d take 200,000+ years and that doesn&#8217;t even factor in the evaluation of whether the combination is gibberish). To make life easier, I first got rid of all the punctuation, lowercased all the characters, and converted the &#8220;strips&#8221; of characters into a Python list of lists. My idea was to pick the first strip, try the remaining 17 to see which one &#8220;fit&#8221; best before or after strip 0, then try the remaining 16 on the joined strips, etc. For best &#8220;fit&#8221; I would try to maximize the number of legitimate words which could be composed by the left or right most series of characters (whitespace delimited).<\/p>\n<p>Initially it seemed like counting a partial word match with fewer than 3 characters wouldn&#8217;t be useful e.g. &#8220;er&#8221; appears in tens of thousands of words. That wasn&#8217;t good enough to keep my program on track though, as it would get a few strips aligned correcting and then veer off track creating sequences of characters that were found in many words, but ended up being gibberish in this example. So I had to augment the solution to detect when a sequence of characters should appear at the beginning of a word (whitespace before the sequence, but not after) or when a sequence of characters should be matched exactly (whitespace before and after the sequence). In this case, I didn&#8217;t have to detect whether a sequence of characters should appear at the end of a word (whitespace after the sequence, but not before), but it would be trivial to add.<\/p>\n<p>Another necessary feature of my model was the ability to short-circuit if more than 1 row of characters would result in zero possible words. At first I had the threshold set at more than 0, but &#8220;shannon&#8221;, a last name as it turns out, is not in the dictionary. The code below takes about 90 seconds to solve the problem and could easily be optimized to reduce the number of loops or lookups in the dictionary, but I think those changes obscure the heart of the algorithm. Below is the code as well as a copy of the output generated by the program as it solved the problem.<\/p>\n<pre>\r\n#!\/usr\/bin\/python\r\n\r\nimport sys\r\nimport string\r\n\r\nSTRIPS = [\r\n  ['de',' ',' f','cl','nf','ed','au',' i','ti',' ','ma','ha','or','nn','ou',' s','on','nd','on'],\r\n  ['ry',' ','is','th','is',' b','eo','as',' ',' ','f ','wh',' o','ic',' t',' ',' ','he','h '],\r\n  ['ab',' ','la','pr','od','ge','ob',' m','an',' ','s ','is','el','ti','ng','il','d ','ua','c '],\r\n  ['he',' ','ea','of','ho',' m',' t','et','ha',' ',' t','od','ds','e ','ki',' c','t ','ng','br'],\r\n  ['wo','m ','to','yo','hi','ve','u ',' t','ob',' ','pr','d ','s ','us',' s','ul','le','ol','e '],\r\n  [' t','ca',' t','wi',' m','d ','th',' a','ma','l ','he',' p','at','ap','it','he','ti','le','er'],\r\n  ['ry','d ','un','th',' ','io','eo','n ','is',' ','bl','f ','pu','co','ic',' o','he','at','mm'],\r\n  ['hi',' ',' ','in',' ',' ',' t',' ',' ',' ',' ','ye',' ','ar',' ','s ',' ',' ',' ']\r\n]\r\n\r\ndef WordsToSet():\r\n  word_set = set()\r\n  for w in open('\/usr\/share\/dict\/words'):\r\n    word_set.add(w.strip().lower())\r\n  return word_set\r\n\r\n# Determine whether word is, starts with or contins prospective_word.\r\ndef WordMatch(word, prospective_word, exact, startswith):\r\n  if exact and prospective_word == word:\r\n    return 1\r\n  elif startswith and word.startswith(prospective_word):\r\n    return 1\r\n  elif not exact and not startswith and prospective_word in word:\r\n    return 1\r\n  return 0\r\n\r\ndef ProspectiveWord(i, j, row, prefix):\r\n  # Construct a new result based on the current result row, the ith row and,\r\n  # jth column of STRIPS and whether the new set of characters should be\r\n  # positioned before or after the current result row.\r\n  if prefix:\r\n    prospective_result = STRIPS[i][j] + row\r\n  else:\r\n    prospective_result = row + STRIPS[i][j]\r\n\r\n  words = prospective_result.split(' ')\r\n  prospective_word = ''\r\n  exact = False\r\n  startswith = False\r\n  if prefix:\r\n    if words[0] == '':  # There was a leading space.\r\n      prospective_word = words[1]\r\n      if len(words) > 2:  # And a space after prospective_word.\r\n        exact = True\r\n      else:\r\n        startswith = True\r\n    else:\r\n      prospective_word = words[0]\r\n  else:\r\n    if words[-1] == '':  # There was a trailing space.\r\n      prospective_word = words[-2]\r\n      if len(words) > 2:  # And a space before prospective_word.\r\n        exact = True\r\n    else:\r\n      prospective_word = words[-1]\r\n      if len(words) > 1:\r\n        startswith = True\r\n  return prospective_word, exact, startswith\r\n\r\ndef p2(word_set):\r\n  # Set the first column of STRIPS as the current result.\r\n  result = [row[0] for row in STRIPS]\r\n  # The order of j's, not used to solve the problem, but useful to print at the\r\n  # end. i.e. the order of the STRIPS.\r\n  strip_order = [0]\r\n  # Keep track of the unused j's i.e. columns of STRIPS\r\n  remaining_strips = set([j for j in xrange(1, len(STRIPS[0]))])\r\n\r\n  while len(remaining_strips) > 0:\r\n    # Make one pass placing column j of STRIPS before the current result and\r\n    # then another pass placing it after.\r\n    for prefix in [True, False]:\r\n      # Keep track of the best j and highest possible probablistic word count.\r\n      best_j = 0\r\n      best_word_count = 0\r\n      for j in remaining_strips:  # Columns\r\n        word_count = 0\r\n        # Keep track of the number of rows without probable words. Rows which are\r\n        # not checked i.e. without a prospective word of length > 2 or where\r\n        # STRIPS[i][j] is a space (doesn't add any new information) do not count.\r\n        rows_without_words = 0\r\n        for i, row in enumerate(result):  # Rows\r\n          # More than one row without probable words means this column is likely\r\n          # in the wrong position.\r\n          if rows_without_words > 1:\r\n            break\r\n          if STRIPS[i][j] == ' ':  # No new information, continue to next row.\r\n            continue\r\n          prospective_word, exact, startswith = ProspectiveWord(i, j, row, prefix)\r\n          # Too short a word\/word fragment, continue to next row.\r\n          if len(prospective_word) < 3:\r\n            continue\r\n\r\n          # Total up all the prospective word matches and update the state.\r\n          temp_word_count = 0\r\n          for word in word_set:\r\n            temp_word_count += WordMatch(word, prospective_word, exact, startswith)\r\n\r\n          if temp_word_count == 0:\r\n            rows_without_words += 1\r\n          else:\r\n            word_count += temp_word_count\r\n\r\n        # More than one row without probable words means this column is likely\r\n        # in the wrong position.\r\n        if rows_without_words > 1:\r\n          continue\r\n\r\n        if word_count > best_word_count:\r\n          best_word_count = word_count\r\n          best_j = j\r\n\r\n      if best_j != 0:\r\n        if prefix:\r\n          strip_order.insert(0, best_j)\r\n        else:\r\n          strip_order.append(best_j)\r\n        for i in xrange(len(result)):\r\n          if prefix:\r\n            result[i] = STRIPS[i][best_j] + result[i]\r\n          else:\r\n            result[i] = result[i] + STRIPS[i][best_j]\r\n        remaining_strips.remove(best_j)\r\n        print 'strip_order: ', strip_order\r\n        for row in result:\r\n          print row\r\n          print ' '\r\n\r\ndef main(argv):\r\n  word_set = WordsToSet()\r\n  p2(word_set)\r\n\r\nif __name__ == '__main__':\r\n  main(sys.argv)\r\n<\/pre>\n<p>Here&#8217;s what the output looks like so you can see it building up the solution:<\/p>\n<p>strip_order:  [6, 0]<br \/>\naude<br \/>\neory<br \/>\nobab<br \/>\n the<br \/>\nu wo<br \/>\nth t<br \/>\neory<br \/>\n thi<\/p>\n<p>strip_order:  [6, 0, 15]<br \/>\naude s<br \/>\neory<br \/>\nobabil<br \/>\n the c<br \/>\nu woul<br \/>\nth the<br \/>\neory o<br \/>\n this <\/p>\n<p>strip_order:  [3, 6, 0, 15]<br \/>\nclaude s<br \/>\ntheory<br \/>\nprobabil<br \/>\nof the c<br \/>\nyou woul<br \/>\nwith the<br \/>\ntheory o<br \/>\nin this <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11]<br \/>\nclaude sha<br \/>\ntheory wh<br \/>\nprobabilis<br \/>\nof the cod<br \/>\nyou would<br \/>\nwith the p<br \/>\ntheory of<br \/>\nin this ye<\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13]<br \/>\nclaude shann<br \/>\ntheory whic<br \/>\nprobabilisti<br \/>\nof the code<br \/>\nyou would us<br \/>\nwith the pap<br \/>\ntheory of co<br \/>\nin this year<\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18]<br \/>\nclaude shannon<br \/>\ntheory which<br \/>\nprobabilistic<br \/>\nof the code br<br \/>\nyou would use<br \/>\nwith the paper<br \/>\ntheory of comm<br \/>\nin this year <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2]<br \/>\nclaude shannon f<br \/>\ntheory which is<br \/>\nprobabilistic la<br \/>\nof the code brea<br \/>\nyou would use to<br \/>\nwith the paper t<br \/>\ntheory of commun<br \/>\nin this year  <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14]<br \/>\nclaude shannon fou<br \/>\ntheory which is t<br \/>\nprobabilistic lang<br \/>\nof the code breaki<br \/>\nyou would use to s<br \/>\nwith the paper tit<br \/>\ntheory of communic<br \/>\nin this year   <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17]<br \/>\nclaude shannon found<br \/>\ntheory which is the<br \/>\nprobabilistic langua<br \/>\nof the code breaking<br \/>\nyou would use to sol<br \/>\nwith the paper title<br \/>\ntheory of communicat<br \/>\nin this year    <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5]<br \/>\nclaude shannon founded<br \/>\ntheory which is the b<br \/>\nprobabilistic language<br \/>\nof the code breaking m<br \/>\nyou would use to solve<br \/>\nwith the paper titled<br \/>\ntheory of communicatio<br \/>\nin this year     <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7]<br \/>\nclaude shannon founded i<br \/>\ntheory which is the bas<br \/>\nprobabilistic language m<br \/>\nof the code breaking met<br \/>\nyou would use to solve t<br \/>\nwith the paper titled  a<br \/>\ntheory of communication<br \/>\nin this year      <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4]<br \/>\nclaude shannon founded inf<br \/>\ntheory which is the basis<br \/>\nprobabilistic language mod<br \/>\nof the code breaking metho<br \/>\nyou would use to solve thi<br \/>\nwith the paper titled  a m<br \/>\ntheory of communication<br \/>\nin this year       <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12]<br \/>\nclaude shannon founded infor<br \/>\ntheory which is the basis o<br \/>\nprobabilistic language model<br \/>\nof the code breaking methods<br \/>\nyou would use to solve this<br \/>\nwith the paper titled  a mat<br \/>\ntheory of communication  pu<br \/>\nin this year        <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12, 10]<br \/>\nclaude shannon founded informa<br \/>\ntheory which is the basis of<br \/>\nprobabilistic language models<br \/>\nof the code breaking methods t<br \/>\nyou would use to solve this pr<br \/>\nwith the paper titled  a mathe<br \/>\ntheory of communication  publ<br \/>\nin this year         <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12, 10, 8]<br \/>\nclaude shannon founded informati<br \/>\ntheory which is the basis of<br \/>\nprobabilistic language models an<br \/>\nof the code breaking methods tha<br \/>\nyou would use to solve this prob<br \/>\nwith the paper titled  a mathema<br \/>\ntheory of communication  publis<br \/>\nin this year          <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12, 10, 8, 16]<\/p>\n<p>theory which is the basis of<br \/>\nprobabilistic language models and<br \/>\nof the code breaking methods that<br \/>\nyou would use to solve this proble<br \/>\nwith the paper titled  a mathemati<br \/>\ntheory of communication  publishe<br \/>\nin this year           <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12, 10, 8, 16, 1]<br \/>\nclaude shannon founded information<br \/>\ntheory which is the basis of<br \/>\nprobabilistic language models and<br \/>\nof the code breaking methods that<br \/>\nyou would use to solve this problem<br \/>\nwith the paper titled  a mathematica<br \/>\ntheory of communication  published<br \/>\nin this year            <\/p>\n<p>strip_order:  [3, 6, 0, 15, 11, 13, 18, 2, 14, 17, 5, 7, 4, 12, 10, 8, 16, 1, 9]<br \/>\nclaude shannon founded information<br \/>\ntheory which is the basis of<br \/>\nprobabilistic language models and<br \/>\nof the code breaking methods that<br \/>\nyou would use to solve this problem<br \/>\nwith the paper titled  a mathematical<br \/>\ntheory of communication  published<br \/>\nin this year             <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Towards the end of last year I signed up for the &#8220;basic&#8221; track of the Stanford Engineering Introduction To Artificial Intelligence class, mainly out of curiosity, but also because the instructors (Sebastian Thrun and Peter Norvig) and I share the same employer. I intended to follow along at a leisurely pace, unsure if I would [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[23,11,17],"_links":{"self":[{"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/posts\/596"}],"collection":[{"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/comments?post=596"}],"version-history":[{"count":12,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/posts\/596\/revisions"}],"predecessor-version":[{"id":609,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/posts\/596\/revisions\/609"}],"wp:attachment":[{"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/media?parent=596"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/categories?post=596"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.madpickles.org\/rokjoo\/wp-json\/wp\/v2\/tags?post=596"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}