Multi-Agent Systems project
Symmetric Hangman
Jefta Saija & Ruud Henken
April 2011

1. Introduction

The original game of Hangman is a two-player game. One player (player 1) thinks of a word and the other player (player 2) has to guess the word. The word is represented by underscores. Player 2 guesses the word by asking single letters. If the letter occurs in the word, player 2 gets to know the positions of its occurrences in the word. If the letter does not occur in the word, player 1 draws an element of the hangman diagram (see the figure below). The game ends if:

- player 2 completes the word (player 2 wins), or
- the hangman diagram is completed (player 1 wins).
(Figure: the hangman diagram)

2. Problem description

We want to model a symmetric version of Hangman. That is, both players need to complete a word and prevent their own hangman diagram from being completed. The symmetric version is more complicated than the non-symmetric version: if one player asks a letter, both players reveal the occurrences of that letter in their word. This means that both players have to be even more careful in asking letters. You do not want to help your opponent win the game by asking a specific letter unless it is strictly necessary. This requires knowledge of the other player's word. As in the non-symmetric version of the game, both players know the word the opponent has to guess. So, both players only need to reason about the letter occurrences in their own word, since they simply know the letter occurrences in the other player's word.

Knowledge and possible worlds

Both players use the same lexical database; that is, they have common knowledge of all possible words. Based on this knowledge (the shared lexical database, the opponent's word, and the letters guessed so far together with their revealed positions), both players take turns selecting a letter, as sketched below.
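A minimal sketch, in Java, of how this knowledge constrains a player's possible worlds. The class and method names are illustrative, not taken from the actual implementation:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Sketch: a player's possible worlds are all lexicon words consistent
 *  with his revealed pattern and the letters guessed so far. */
final class PossibleWorlds {
    /**
     * @param lexicon the shared lexical database
     * @param pattern the player's own word with unknown letters as '_', e.g. "k____"
     * @param guessed all letters guessed so far by either player
     */
    static List<String> filter(List<String> lexicon, String pattern, Set<Character> guessed) {
        return lexicon.stream()
                .filter(w -> w.length() == pattern.length())
                .filter(w -> consistent(w, pattern, guessed))
                .collect(Collectors.toList());
    }

    private static boolean consistent(String word, String pattern, Set<Character> guessed) {
        for (int i = 0; i < word.length(); i++) {
            char revealed = pattern.charAt(i);
            char candidate = word.charAt(i);
            // A revealed letter must match the candidate word exactly.
            if (revealed != '_' && revealed != candidate) return false;
            // An already guessed letter cannot hide behind a blank, because
            // its occurrences would have been revealed.
            if (revealed == '_' && guessed.contains(candidate)) return false;
        }
        return true;
    }
}
```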
A great deal of effort goes into choosing the right word for your opponent. By choosing a difficult word for your opponent, the chance of winning the game increases. But what is a 'difficult' word? Neither player knows which strategy the other uses in selecting a word. Therefore they have to take the whole dataset into account when reasoning about possible worlds.

Stopping criteria

The symmetric version of the game ends if:

- one of the players completes his word (that player wins),
- one of the hangman diagrams is completed (that player loses), or
- both players complete their word on the same guess (the game ends in a draw).
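A small sketch of how these stopping criteria could be checked after each guess. Modelling the diagram as an error counter with a maximum is an assumption about the implementation:

```java
/** Possible results of the stopping test after one guess. */
enum Outcome { CONTINUE, P1_WINS, P2_WINS, DRAW }

final class Referee {
    /** Sketch: assumes each hangman diagram is an error counter that is
     *  complete once it reaches maxErrors (names are illustrative). */
    static Outcome checkEnd(boolean p1WordComplete, boolean p2WordComplete,
                            int p1Errors, int p2Errors, int maxErrors) {
        // Both players finished on the same guess: a draw.
        if (p1WordComplete && p2WordComplete) return Outcome.DRAW;
        if (p1WordComplete || p2Errors >= maxErrors) return Outcome.P1_WINS;
        if (p2WordComplete || p1Errors >= maxErrors) return Outcome.P2_WINS;
        return Outcome.CONTINUE;
    }
}
```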
3. Implementation

For the winning strategy we will assume that both players are perfect logicians.

Choosing a word for your opponent

The game is started by both players choosing a word that the opposing player has to guess. Selecting a 'difficult' word is not trivial: how good a word is depends strongly on the opponent. For example, while playing against another perfect logician, your opponent might reason that you selected the word from a certain subset of the whole lexical database. This would greatly reduce the opponent's possible worlds. Would it then be more effective not to choose a word from this subset? To overcome this problem we defined four different selection criteria, each of which carves out a certain subset of the lexical database. Both players use a combination of these four criteria to generate a subset of the whole lexical database, and a word is then randomly selected from this subset. In this way a player can never know which strategy the opponent used to select the word that the player has to guess. Both players therefore need to consider the whole lexical database as possible worlds.
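A minimal sketch of this selection mechanism, assuming the combination of criteria is chosen at random each game. The four predicates below are placeholders only; they are not the report's actual criteria:

```java
import java.util.*;
import java.util.function.Predicate;

/** Sketch: pick a word for the opponent via a random combination of criteria. */
final class WordSelector {
    private final Random rng = new Random();

    // Placeholder criteria; the actual four criteria are not reproduced here.
    private final List<Predicate<String>> criteria = List.of(
            w -> w.chars().distinct().count() == w.length(), // hypothetical: no repeated letters
            w -> !w.contains("e"),                           // hypothetical: avoid a frequent letter
            w -> w.length() >= 5,                            // hypothetical: prefer longer words
            w -> w.matches("[a-z]+")                         // hypothetical: plain lowercase words
    );

    String selectWord(List<String> lexicon) {
        // Randomly decide which criteria to apply this game, so the opponent
        // can never know which subset the word was drawn from.
        List<String> subset = new ArrayList<>(lexicon);
        for (Predicate<String> c : criteria) {
            if (rng.nextBoolean()) subset.removeIf(c.negate());
        }
        if (subset.isEmpty()) subset = lexicon; // assumption: fall back to the whole database
        return subset.get(rng.nextInt(subset.size()));
    }
}
```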
Once both players have selected a word for each other, they can start guessing their own word. Based on all the available knowledge they need to select a letter. Since a player does not know the word he is looking for, we need a method that selects a letter which helps win the game. At this point it is good to realise that letter selection can be represented as a search tree. A path in the tree represents the guesses of a player (NOT the word itself). At each point in the tree a letter must be guessed based on the available knowledge. We want to optimise this guess. The possible worlds are represented by the possible words, and the optimal guess is the letter that optimally reduces the possible worlds. We use two tools for this: information gain (entropy) and letter frequency. Entropy is a measure of disorder; the letter frequency is the frequency of occurrence of a specific letter in the list of possible words.

Entropy

The entropy is calculated as follows [1]:

Entropy = - P(+) * Log2(P(+)) - P(-) * Log2(P(-))

with P(+) the fraction of positive examples (words containing the letter) and P(-) the fraction of negative examples (words not containing the letter). Entropy is a measure of impurity and is always between 0 and 1: 0 means all possible words agree on the letter (a pure set), while 1 means maximum disorder. Entropy is a very suitable tool for separating two classes. Ideally, a property (in our case: a word contains a specific letter or it does not) splits the set into 50% on the one side and 50% on the other side; the entropy value in that case is 1. Logically you want to select the letter with the highest entropy, since the letter with the highest entropy has the highest information gain. In our implementation entropy is used as the main decision criterion (see 'Selection strategy' below for the full strategy).

Frequency list

The frequency list is generated by counting each occurrence of a letter over all remaining possible worlds. The frequency list is sorted so that the letter with the highest occurrence is at the top of the list, and the letter(s) with the lowest (or zero) occurrence are at the bottom of the list.

Selection strategy

The player starts his turn by computing the entropy of each letter for both himself and his opponent. Given the entropy of each letter and the frequency list, the player makes a decision. The player always tries to select a letter with an information gain as high as possible, while also taking the information gain of the other player into account. If the information gain of the current player's best letter is higher than the information gain of that letter for the opponent, the current player asks this letter. If the letter happens to be of more value to the opponent, the current player skips to the next letter, until it either finds a letter with higher entropy than the opponent's or runs out of possible letters. It might happen that for each letter the opponent has a higher or equal information gain (the player runs out of possible letters); that is, the player knows that whatever letter it asks will help the opponent at least as much. In that case the letter is selected solely based on the player's own entropy and frequency list: the player selects the letter with the highest entropy, and if several letters share the highest information gain, the letter with the highest letter frequency is selected.
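A sketch of both measures, assuming the frequency list counts every occurrence of a letter over the remaining possible words; the helper names are illustrative:

```java
import java.util.List;

/** Sketch: entropy and letter frequency over the remaining possible worlds. */
final class LetterStats {
    /** Entropy = -P(+)*Log2(P(+)) - P(-)*Log2(P(-)), with P(+) the fraction
     *  of possible words that contain the letter. */
    static double entropy(char letter, List<String> worlds) {
        long positive = worlds.stream().filter(w -> w.indexOf(letter) >= 0).count();
        double pPlus = (double) positive / worlds.size();
        return term(pPlus) + term(1.0 - pPlus);
    }

    private static double term(double p) {
        return p == 0.0 ? 0.0 : -p * Math.log(p) / Math.log(2); // convention: 0 * log(0) = 0
    }

    /** Frequency: total number of occurrences of the letter in all possible words. */
    static long frequency(char letter, List<String> worlds) {
        return worlds.stream()
                .mapToLong(w -> w.chars().filter(c -> c == letter).count())
                .sum();
    }
}
```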
There is one exception to this strategy, namely when the current player has only one world left (the player knows the word it is guessing). The entropy is then 0 for all remaining letters and therefore of no use. The current player simply asks all remaining letters of his word, starting with the letters that do not occur in the opponent's word. If every remaining letter is contained in the opponent's word, it does not matter which letter is chosen first. Below is the winning strategy in pseudo-code:
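A minimal Java-style sketch of one turn, following the strategy described above and reusing the LetterStats helpers from the previous sketch. All names are illustrative and this is not the actual Hangman.java code:

```java
import java.util.*;
import java.util.stream.Collectors;

final class Strategy {
    static char selectLetter(Set<Character> unguessed,
                             List<String> myWorlds,
                             List<String> oppWorlds,
                             String oppWord) {
        // Exception: one possible world left, so the player knows his word.
        // Entropy is 0 for every letter; ask the remaining letters of the
        // word, starting with those NOT in the opponent's word.
        if (myWorlds.size() == 1) {
            String myWord = myWorlds.get(0);
            for (char c : unguessed)
                if (myWord.indexOf(c) >= 0 && oppWord.indexOf(c) < 0) return c;
            for (char c : unguessed)
                if (myWord.indexOf(c) >= 0) return c;
        }

        // Main case: consider letters in order of own entropy, highest first.
        List<Character> byOwnEntropy = unguessed.stream()
                .sorted(Comparator.comparingDouble(
                        (Character c) -> LetterStats.entropy(c, myWorlds)).reversed())
                .collect(Collectors.toList());

        // Ask the best letter that is strictly more informative for this
        // player than for the opponent.
        for (char c : byOwnEntropy)
            if (LetterStats.entropy(c, myWorlds) > LetterStats.entropy(c, oppWorlds))
                return c;

        // Fallback: every letter helps the opponent at least as much, so choose
        // on own entropy alone, breaking ties by letter frequency.
        return byOwnEntropy.stream()
                .max(Comparator
                        .comparingDouble((Character c) -> LetterStats.entropy(c, myWorlds))
                        .thenComparingLong(c -> LetterStats.frequency(c, myWorlds)))
                .orElseThrow(IllegalStateException::new);
    }
}
```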
4. Output and results

Assume the following configuration:

- word length: 5
- lexical database: the six words 'borst', 'dorst', 'horst', 'korst', 'morst' and 'vorst'
- the word player 1 has to guess (chosen by player 2): 'korst'
- the word player 2 has to guess (chosen by player 1): 'vorst'
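The entropy values below can be checked against the formula from section 3. At the start, each of the letters 'b', 'd', 'h', 'k', 'm' and 'v' occurs in exactly one of the six possible words, so P(+) = 1/6 and

Entropy = -(1/6) * Log2(1/6) - (5/6) * Log2(5/6) ≈ 0.650

while the letters 'o', 'r', 's' and 't' occur in every word (P(+) = 1) and thus have entropy 0.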
Player 1 starts the game

The only constraint at the start of the game is the length of the word (no letters have been guessed yet). Based on this information and the lexical database, player 1 calculates the entropy for each letter and generates a frequency list. The letters 'b', 'd', 'h', 'k', 'm' and 'v' each have an entropy of 0.650; all other letters have an entropy of 0. Player 1 knows that this holds for player 2 too. Player 1 therefore needs to choose between 'b', 'd', 'h', 'k', 'm' and 'v', and guesses 'b'. The letter 'b' is in neither player's word.

p1: _ _ _ _ _
p2: _ _ _ _ _
Guessed: b

Turn player 2

The only extra information player 2 has is that the letter 'b' does not occur in either player's word. There are no further constraints. Based on this information player 2 calculates the entropy of all remaining letters and generates a frequency list. The letters 'd', 'h', 'k', 'm' and 'v' each have an entropy of 0.722; all other letters have an entropy of 0. Player 2 knows this holds for player 1 too. Player 2 therefore needs to choose between 'd', 'h', 'k', 'm' and 'v', and guesses 'd'. The letter 'd' is in neither player's word.

p1: _ _ _ _ _
p2: _ _ _ _ _
Guessed: b, d

Turn player 1

Again the guessed letter did not occur in either word. Player 1 recalculates the entropy and frequency list based on this information. The letters 'h', 'k', 'm' and 'v' each have an entropy of 0.811; all other letters have an entropy of 0. Player 1 knows that for every remaining letter the information gain for player 2 is at least as high as his own, so he falls back to his own entropy and frequency list. Player 1 therefore needs to choose between 'h', 'k', 'm' and 'v', and guesses 'h'. The letter 'h' is in neither player's word.

p1: _ _ _ _ _
p2: _ _ _ _ _
Guessed: b, d, h

Turn player 2

Again the guessed letter did not occur in either word. Player 2 now calculates the entropy of the remaining letters and generates a frequency list based on this information. The letters 'k', 'm' and 'v' each have an entropy of 0.918; all other letters have an entropy of 0. Player 2 knows that for every remaining letter the information gain for player 1 is at least as high as his own, so he too falls back to his own entropy and frequency list. Player 2 therefore needs to choose between 'k', 'm' and 'v', and guesses 'k'. The letter 'k' is in player 1's word.

p1: k _ _ _ _
p2: _ _ _ _ _
Guessed: b, d, h, k

Turn player 1

It turns out that player 1 is rather lucky: after player 2's guess, player 1 has only one possible world left! Player 1 knows the word to guess is 'korst'. Player 1 also knows that all the remaining letters are contained in the word of player 2, so it does not matter which of the remaining letters he chooses first. The remaining letters are 'o', 'r', 's' and 't'. Player 1 guesses 'o'. The letter 'o' is in both players' words.

p1: k o _ _ _
p2: _ o _ _ _
Guessed: b, d, h, k, o

Turn player 2

Player 2 still does not know his own word, but after player 1's turn he now knows the position of the letter 'o' in it. Based on this new information, player 2 calculates the entropy of the remaining letters and generates a frequency list. The letter 'm' has an entropy of 1. Player 2 knows the information gain of asking 'm' is higher for himself than for player 1, and therefore guesses 'm'. The letter 'm' is in neither player's word.
p1: k o _ _ _
p2: _ o _ _ _
Guessed: b, d, h, k, m, o

Turn player 1

Player 1 still knows his word is 'korst'. He also knows all remaining letters are in the word of player 2, so it does not matter which remaining letter he chooses. The remaining letters for player 1 are 'r', 's' and 't'. Player 1 guesses 'r'. The letter 'r' is in both players' words.

p1: k o r _ _
p2: _ o r _ _
Guessed: b, d, h, k, m, o, r

Turn player 2

Since there are no other words left, player 2 knows his word is 'vorst'. The remaining letters for player 2 are 'v', 's' and 't'. Player 2 knows his word contains the letter 'v' and the word of player 1 does not, and therefore guesses 'v'. The letter 'v' is in player 2's word.

p1: k o r _ _
p2: v o r _ _
Guessed: b, d, h, k, m, o, r, v

Turn player 1

Player 1 still knows his word is 'korst'. He also knows all remaining letters are in the word of player 2, so it does not matter which remaining letter he chooses. The remaining letters for player 1 are 's' and 't'. Player 1 guesses 's'. The letter 's' is in both players' words.

p1: k o r s _
p2: v o r s _
Guessed: b, d, h, k, m, o, r, s, v

Turn player 2

Player 2 knows his word is 'vorst'. His only remaining letter is 't', which he also knows is contained in the word of player 1. The only thing player 2 can do is guess 't'. The letter 't' is in both players' words.

p1: k o r s t
p2: v o r s t
Guessed: b, d, h, k, m, o, r, s, t, v

Now both players have completed their word. The game ends in a draw!

5. Documents

Hangman.zip

The file above contains the source file and data files needed to compile and run the implementation. It contains a folder called src, which needs to be extracted. The folder src contains the source file Hangman.java. Compile it by typing:

```
javac Hangman.java
```

Before running the code, check that the folders data and img are also extracted from Hangman.zip and are in the same folder as Hangman.java. Then run the code by typing:

```
java Hangman
```

Lexical database

To change the lexical database, change the variable dataFileName at line 23 of Hangman.java. The wordlist files are currently stored in the folder data.

About the GUI

On the far left of the GUI the frequency list of player 1 is shown; on the right, the frequency list of player 2. Both lists are sorted: the top letter has the highest probability of being in the word and the bottom letter the lowest. A green letter indicates the letter the player wants to select on his turn; black is used for all other letters. At the start of the game an option screen is shown. Please select the game mode and word length before clicking 'ok'. The game has three game modes:
6. References

Entropy
[1] Alpaydin, E. (2010). Introduction to Machine Learning (p. 189). Cambridge, MA: MIT Press. ISBN 978-0-262-01243-0.

Wordlists