{"id":494,"date":"2014-01-02T10:45:14","date_gmt":"2014-01-02T10:45:14","guid":{"rendered":"http:\/\/www.ai.rug.nl\/SocialCognition\/?p=494"},"modified":"2015-04-24T06:48:47","modified_gmt":"2015-04-24T06:48:47","slug":"rock-paper-scissors","status":"publish","type":"post","link":"https:\/\/www.ai.rug.nl\/SocialCognition\/2014\/01\/02\/rock-paper-scissors\/","title":{"rendered":"Rock-paper-scissors"},"content":{"rendered":"<p>The Java applet on this page shows the implementation of simulated agents playing the game of <a href=\"#game\">rock-paper-scissors<\/a>. These agents differ in their ability to make use of <a href=\"#tom\">theory of mind<\/a>, the human ability that allows us to reason about what other people know and believe. The controls for this applet are explained at the <a href=\"#controls\">bottom of this page<\/a>. You can also download the <a href=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/RockPaperScissors.jar\">offline version<\/a> of the applet.\n<\/p>\n<p align=center>\n<iframe width=750 height=650 src=\"http:\/\/harmendeweerd.nl\/scripts\/rps_script.html\"><\/iframe><br \/>\n<!--applet code=\"GameViewer\/RPSFrame.class\" archive=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/RockPaperScissors.jar\" width=480 height=460>\n<\/applet-->\n<\/p>\n<p><a name=\"game\"><\/a><\/p>\n<h3>Game outline<\/h3>\n<table border=0 align=center>\n<tr>\n<td align=center>\n<img decoding=\"async\" src=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/scissors_beats_paper.png\" height=150>\n<\/td>\n<\/tr>\n<tr>\n<td><i>Figure 1: In rock-paper-scissors, scissors beats paper<\/i><\/td>\n<\/tr>\n<\/table>\n<p>Rock-paper-scissors is a two-player game in which players simultaneously choose to play either <i>rock<\/i>, <i>paper<\/i> or <i>scissors<\/i>. If the two players made the same choice, the game ends in a tie. 
However, if one player chose <i>scissors<\/i>, while the other chose <i>paper<\/i>, the player who chose scissors wins (see Figure 1). In the same way, <i>rock<\/i> beats <i>scissors<\/i>, and <i>paper<\/i> beats <i>rock<\/i>.\n<\/p>\n<p>\nAccording to game theory, the only stable strategy when playing rock-paper-scissors is to choose randomly among the three moves, each with equal probability. After all, if you play according to some pattern, the other player might learn that pattern over many repeated games, and exploit that knowledge. Playing randomly makes sure that the opponent cannot learn any patterns in the way you play the game. Although this strategy works well, people are not very good at playing randomly. For example, people usually avoid playing <i>rock<\/i> when they have just played <i>rock<\/i> two times in a row, even though this should not matter in truly random play. Also, if some people are not playing randomly, smart players may be able to exploit this and get a higher score than a random player.\n<\/p>\n<p><br clear=\"all\"><br \/>\n<a name=\"tom\"><\/a><\/p>\n<h3>Theory of mind<\/h3>\n<p>\nIn game settings, people often consider what their opponents know and believe by making use of what is known as <i>theory of mind<\/i>. The computer agents in the applet on this page also make use of theory of mind to predict what their opponent is going to do. The applet allows the user to restrict agents in their ability to make use of theory of mind. 
This way, we can determine whether higher orders of theory of mind allow agents to win more often in rock-paper-scissors.\n<\/p>\n<p align=center>\n<a name=\"fig2\"><\/a><\/p>\n<table border=0 align=center>\n<tr>\n<td align=center>\n<a href=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_0_v4.png\"><img decoding=\"async\" src=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_0_v4.png\" width=100%><\/a>\n<\/td>\n<\/tr>\n<tr>\n<td align=center><i>Figure 2: The blue zero-order theory of mind agent tries to learn patterns in the behaviour of his opponent.<\/i><\/table>\n<\/p>\n<p>\nThe lowest possible order of theory of mind is <i>zero-order<\/i> theory of mind. Zero-order theory of mind agents try to model their opponent through patterns of behaviour. For example, if the opponent has always played paper before, the zero-order theory of mind agent believes that she is going to play paper again (see <a href=\"#fig2\">Figure 2<\/a>). In the rock-paper-scissors applet, the <font color=\"red\">red<\/font> bars indicate the agent&#8217;s zero-order beliefs, which show how likely the agent believes it to be that his opponent is going to play (R)ock, (P)aper or (S)cissors. This way, when a zero-order theory of mind agent sees that his opponent is playing paper more than average, he will try to take advantage of that by playing scissors more than average.\n<\/p>\n<p align=center>\n<a name=\"fig3\"><\/a><\/p>\n<table border=0 align=center>\n<tr>\n<td align=center>\n<a href=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_1_v4.png\"><img decoding=\"async\" src=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_1_v4.png\" width=100%><\/a>\n<\/td>\n<\/tr>\n<tr>\n<td align=center><i>Figure 3: If the blue agent has first-order beliefs, he believes that his red opponent may be trying to learn and exploit patterns in his behaviour. 
By looking at the patterns in his own behaviour, the blue agent predicts how the red opponent will try to exploit these patterns.<\/i><\/table>\n<\/p>\n<p>\nA zero-order theory of mind agent tries to learn patterns in the behaviour of his opponent, but does not realize that his opponent could be doing the same thing. A <i>first-order<\/i> theory of mind agent realizes that his opponent may be a zero-order theory of mind agent. He tries to predict what his opponent is going to do by putting himself in her position. He looks at the game from the point of view of his opponent to determine what he would do if the situation were reversed, and uses that as a prediction of his opponent&#8217;s action. For example, when the agent realizes he has been playing <i>paper<\/i> more than average, he believes his opponent may try to take advantage of this by playing <i>scissors<\/i> more often. If that is true, the agent can take advantage of that by playing <i>rock<\/i> (see <a href=\"#fig3\">Figure 3<\/a>). In the applet, the agent&#8217;s first-order beliefs are shown as <font color=\"green\">green<\/font> bars. They show how likely the agent believes his opponent thinks it is that he himself will play each of the three possible moves.\n<\/p>\n<p align=center>\n<a name=\"fig4\"><\/a><\/p>\n<table border=0 align=center>\n<tr>\n<td align=center>\n<a href=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_2_v5.png\"><img decoding=\"async\" src=\"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-content\/uploads\/ex_tom_2_v5.png\" width=100%><\/a>\n<\/td>\n<\/tr>\n<tr>\n<td align=center><i>Figure 4: If the blue agent has second-order beliefs, he believes that his red opponent believes that he himself is trying to learn and exploit patterns in her behaviour. 
This allows him to anticipate how the red opponent will try to exploit his behaviour.<\/i><\/table>\n<\/p>\n<p>\nA <i>second-order<\/i> theory of mind agent takes his reasoning one step further, and realizes that his opponent may be a first-order theory of mind agent. He puts himself into the position of his opponent, but also believes that she might be putting herself into his position. For example, if the agent realizes his opponent is playing <i>paper<\/i> more than average, he realizes he could take advantage of that by playing <i>scissors<\/i> more often. A second-order theory of mind agent thinks that his opponent may be expecting this, and therefore that she will play <i>rock<\/i> to take advantage of the way he behaves. If that is true, the agent should start playing <i>paper<\/i> more often himself (see <a href=\"#fig4\">Figure 4<\/a>). In the applet, the <font color=\"blue\">blue<\/font> bars indicate an agent&#8217;s second-order beliefs.\n<\/p>\n<p>\nAlthough the agents in the applet have theory of mind, they do not remember the choices of their opponent. Instead, when they see the outcome of a game, they form beliefs about what the opponent is going to do next time. After this, they forget what they saw. This means that the agents in our applet can only look at very simple patterns of behaviour. As an alternative type of agent, a <i>high memory<\/i> agent is a zero-order theory of mind agent that also remembers what was played in the previous round. That is, the high memory agent forms beliefs about what his opponent is going to do in reaction to the outcome of the last game. For example, a high memory agent may believe that his opponent is going to play <i>rock<\/i> if he just played <i>scissors<\/i>, but not if he just played <i>paper<\/i>.\n<\/p>\n<p><a name=\"controls\"><\/a><\/p>\n<h3>Controls<\/h3>\n<p>\nWith the script, you can see how agents perform better when their theory of mind level increases. 
In addition, you can test your ability against computer agents, and see what agents believe you are doing when playing rock-paper-scissors.<\/p>\n<ul>\n<li> <i>Player 1\/2 theory of mind<\/i>: The radio buttons determine the order of theory of mind of the two players. Players can be any order of theory of mind up to fourth-order. Additionally, the second player can be controlled by a human user.<\/li>\n<li> <i>Learning speed<\/i>: Determines how quickly an agent changes his beliefs based on new information. A learning speed of 0.0 means that an agent does not learn at all, and will always do the same thing. An agent with learning speed 1.0, on the other hand, believes that the previous game gives him all the information he needs to predict his opponent&#8217;s behaviour. Agents do not try to model the learning speed of their opponent; if the two agents have different learning speeds, they will not be able to correctly model the beliefs of their opponent.<\/li>\n<li> <i>Reset game<\/i>: Resets the game to the start situation. The score and accuracy information is reset to zero as well.<\/li>\n<li> <i>Play round<\/i>: Play one game of rock-paper-scissors. This can only be done when player two is not user-controlled.<\/li>\n<li> <i>Rock<\/i>, <i>paper<\/i> and <i>scissors<\/i>: When player two is user-controlled, selecting one of the three possible moves plays one game, with the selected move as player two&#8217;s choice.<\/li>\n<li> <i>Show mental content<\/i>: A human player can use the graphs to determine what the agent will do next, or what a computer agent would do next if he were the one to play next. For a human player, the game is more challenging if the graphs are not visible. Uncheck the box to hide mental content information from the graphs.<\/li>\n<\/ul>\n<p>\nThe applet shows that agents generally perform better as their theory of mind level increases. 
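As a rough illustration of the learning-speed mechanism described in the controls above, the sketch below shows how a zero-order agent could mix its old beliefs with the last observed move and then pick a best response. This is an illustrative reconstruction in Python, not the applet's actual Java implementation; the function names and the uniform starting beliefs are assumptions.

```python
# Sketch of a zero-order agent's belief update (not the applet's source).
# Beliefs are probabilities that the opponent plays rock, paper or scissors.

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}  # key beats value

def update_beliefs(beliefs, observed_move, learning_speed):
    """Mix old beliefs with the last observation.

    learning_speed 0.0 keeps the old beliefs unchanged; learning_speed 1.0
    trusts only the previous game, as described in the controls above.
    """
    return {
        move: (1.0 - learning_speed) * p
              + (learning_speed if move == observed_move else 0.0)
        for move, p in beliefs.items()
    }

def best_response(beliefs):
    """Play the move that beats the opponent's most likely move."""
    likely = max(beliefs, key=beliefs.get)
    return next(m for m, v in BEATS.items() if v == likely)

beliefs = {m: 1.0 / 3.0 for m in MOVES}   # start with uniform beliefs
beliefs = update_beliefs(beliefs, "paper", 0.5)  # opponent just played paper
print(best_response(beliefs))             # scissors beats the likely paper
```

With learning speed 1.0 the new beliefs collapse onto the last observed move, while with 0.0 the agent never changes its mind, matching the two extremes described above.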
But for a simple game like rock-paper-scissors, theory of mind agents do not outperform high memory agents (see also <a href=\" https:\/\/www.ai.rug.nl\/SocialCognition\/2011\/06\/16\/limited-bidding\/\">Limited Bidding<\/a>).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Java applet on this page shows the implementation of simulated agents playing the game of rock-paper-scissors. These agents differ in their ability to make use of theory of mind, the human ability that allows us to reason about what &hellip; <a href=\"https:\/\/www.ai.rug.nl\/SocialCognition\/2014\/01\/02\/rock-paper-scissors\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-494","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/posts\/494","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/comments?post=494"}],"version-history":[{"count":12,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/posts\/494\/revisions"}],"predecessor-version":[{"id":565,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/posts\/494\/revisions\/565"}],"wp:attachment":[{"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/media?parent=494"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/w
p-json\/wp\/v2\/categories?post=494"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ai.rug.nl\/SocialCognition\/wp-json\/wp\/v2\/tags?post=494"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}