Game Theory 101: The Complete TextbookBy William Spaniel Copyright William Spaniel, 2011-2013.All rights reserved. AcknowledgementsI thank Varsha Nair for her revisions as I compiled this book. I am alsoindebted to Kenny Oyama and Matt Whitten for their further suggestions. Ioriginally learned game theory from Branislav Slantchev, John Duggan, andMark Fey.Please report possible errors to williamspaniel@gmail.com. I am grateful tothose who have already given me feedback through this medium, and Iencourage you to comment. About the AuthorWilliam Spaniel is a PhD candidate in political science at the University ofRochester, creator of the popular YouTube series Game Theory 101, and founderof gametheory101.com. You can email him at williamspaniel@gmail.com orfollow him on Twitter @gametheory101. Table of ContentsLESSON 1.1: THE PRISONER’S DILEMMA AND STRICT DOMINANCELESSON 1.2: ITERATED ELIMINATION OF STRICTLY DOMINATED STRATEGIESLESSON 1.3: THE STAG HUNT, PURE STRATEGY NASH EQUILIBRIUM, AND BESTRESPONSESLESSON 1.4: DOMINANCE AND NASH EQUILIBRIUMLESSON 1.5: MATCHING PENNIES AND MIXED STRATEGY NASH EQUILIBRIUMLESSON 1.6: CALCULATING PAYOFFSLESSON 1.7: STRICT DOMINANCE IN MIXED STRATEGIESLESSON 1.8: THE ODD RULE AND INFINITELY MANY EQUILIBRIALESSON 2.1: GAME TREES AND SUBGAME PERFECT EQUILIBRIUMLESSON 2.2: BACKWARD INDUCTIONLESSON 2.3: MULTIPLE SUBGAME PERFECT EQUILIBRIALESSON 2.4: MAKING THREATS CREDIBLELESSON 2.5: COMMITMENT PROBLEMSLESSON 2.6: BACKWARD INDUCTION WITHOUT A GAME TREELESSON 2.7: PROBLEMS WITH BACKWARD INDUCTIONLESSON 2.8: FORWARD INDUCTIONLESSON 3.1: PROBABILITY DISTRIBUTIONSLESSON 3.2: MIXED STRATEGY NASH EQUILIBRIA IN GENERALIZED GAMESLESSON 3.3: KNIFE-EDGE EQUILIBRIALESSON 3.4: COMPARATIVE STATICSLESSON 3.5: GENERALIZING MIXED STRATEGY NASH EQUILIBRIUMLESSON 3.6: ROCK-PAPER-SCISSORSLESSON 4.1: INFINITE STRATEGY SPACES, SECOND PRICE AUCTIONS, DUELS, AND THEMEDIAN VOTER THEOREMMORE FROM WILLIAM SPANIEL Lesson 1.1: The Prisoner’s Dilemma and StrictDominanceAt its core, game theory is the study of strategic interdependence—that is,situations where my actions affect both my welfare and your welfare and viceversa. Strategic interdependence is tricky, as actors need to anticipate, act, andreact. Blissful ignorance will not cut it.The prisoner’s dilemma is the oldest and most studied model in game theory,and its solution concept is also the simplest. As such, we will start with it. Twothieves plan to rob an electronics store. As they approach the backdoor, thepolice arrest them for trespassing. The cops suspect that the pair planned tobreak in but lack the evidence to support such an accusation. They thereforerequire a confession to charge the suspects with the greater crime.Having studied game theory in college, the interrogator throws them into theprisoner’s dilemma. He individually sequesters both robbers and tells each ofthem the following:We are currently charging you with trespassing, which implies a one monthjail sentence. I know you were planning on robbing the store, but right now Icannot prove it—I need your testimony. In exchange for your cooperation, I willdismiss your trespassing charge, and your partner will be charged to the fullestextent of the law: a twelve month jail sentence.I am offering your partner the same deal. If both of you confess, yourindividual testimony is no longer as valuable, and your jail sentence will beeight months each.If both criminals are self-interested and only care about minimizing their jailtime, should they take the interrogator’s deal? 1.1.1: Solving the Prisoner’s DilemmaThe story contains a lot of information. Luckily, we can condense everythingwe need to know into a simple matrix:We will use this type of game matrix regularly, so it is important tounderstand how to interpret it. There are two players in this game. The firstplayer’s strategies (keep “quiet” and “confess”) are in the rows, and the secondplayer’s strategies are in the columns. The first player’s payoffs are listed firstfor each outcome, and the second player’s are listed second. For example, if thefirst player keeps quiet and the second player confesses, then the game ends inthe top right set of payoffs; the first player receives twelve months of jail timeand the second player receives zero. Finally, as a matter of convention, we referto the first player as a man and the second player as a woman; this will allow usto utilize pronouns like “he” and “she” instead of endlessly repeating “player 1”and “player 2.”Which strategy should each player choose? To see the answer, we must lookat each move in isolation. Consider the game from player 1’s perspective.Suppose he knew player 2 will keep quiet. How should he respond?Let’s focus on the important information in that context. Since player 1 onlycares about his time in jail, we can block out player 2’s payoffs with questionmarks:Player 1 should confess. If he keeps quiet, he will spend one month in jail.But if he confesses, he walks away. Since he prefers less jail time to more jailtime, confession produces his best outcome.Note that player 2’s payoffs are completely irrelevant to player 1’s decisionin this context—if he knows that she will keep quiet, then he only needs to lookat his own payoffs to decide which strategy to pick. Thus, the question markscould be any number at all, and player 1’s optimal decision given player 2’smove will remain the same.On the other hand, suppose player 1 knew that player 2 will confess. Whatshould he do? Again, the answer is easier to see if we only look at the relevantinformation: Confession wins a second time: confessing leads to eight months of jail time,whereas silence buys twelve. So player 1 would want to confess if player 2confesses.Putting these two pieces of information together, we reach an importantconclusion—player 1 is better off confessing regardless of player 2’s strategy!Thus, player 1 can effectively ignore whatever he thinks player 2 will do, sinceconfessing gives him less jail time in either scenario.Let’s switch over to player 2’s perspective. Suppose she knew that player 1will keep quiet, even though we realize he should not. Here is her situation:As before, player 2 should confess, as she will shave a month off her jailsentence if she does so.Finally, suppose she knew player 1 will confess. How should she respond?Unsurprisingly, she should confess and spend four fewer months in jail.Once more, player 2 prefers confessing regardless of what player 1 does.Thus, we have reached a solution: both players confess, and both players spendeight months in jail. The justice system has triumphed, thanks to theinterrogator’s savviness.This outcome perplexes a lot of people new to the field of game theory.Compare the <quiet, quiet> outcome to the <confess, confess> outcome:Looking at the game matrix, people see that the <quiet, quiet> outcomeleaves both players better off than the <confess, confess> outcome. They thenwonder why the players cannot coordinate on keeping quiet. But as we just saw,promises to remain silent are unsustainable. Player 1 wants player 2 to keepquiet so when he confesses he walks away free. The same goes for player 2. As aresult, the <quiet, quiet> outcome is inherently unstable. Ultimately, the players finish in the inferior (but sustainable) <confess, confess> outcome. 1.1.2: The Meaning of the Numbers and the Role of Game TheoryAlthough a large branch of game theory is devoted to the study of expectedutility, we generally consider each player’s payoffs as a ranking of his mostpreferred outcome to his least preferred outcome. In the prisoner’s dilemma, weassumed that players only wanted to minimize their jail time. Game theory doesnot force players to have these preferences, as critics frequently claim. Instead,game theory analyzes what should happen given what players desire. So ifplayers only want to minimize jail time, we could use the negative number ofmonths spent in jail as their payoffs. This preserves their individual orderingsover outcomes, as the most preferred outcome is worth 0, the least preferredoutcome is -12, and everything else logically follows in between.Interestingly, the cardinal values of the numbers are irrelevant to theoutcome of the prisoner’s dilemma. For example, suppose we changed thepayoff matrix to this:Here, we have replaced the months of jail time with an ordering of most toleast preferred outcomes, with 4 representing a player’s most preferred outcomeand 1 representing a player’s least preferred outcome. In other words, player 1would most like to reach the <confess, quiet> outcome, then the <quiet, quiet>outcome, then the <confess, confess> outcome, then the <quiet, confess>outcome.Even with these changes, confess is still always better than keep quiet. Tosee this, suppose player 2 kept quiet:Player 1 should confess, since 4 beats 3.Likewise, suppose player 2 confessed:Then player 1 should still confess, as 2 beats 1.The same is true for player 2. First, suppose player 1 kept quiet: Player 2 ought to confess, since 4 beats 3.Alternatively, if player 1 confessed:Player 2 should confess as well, as 2 is greater than 1. Thus, regardless ofwhat the other player does, each player’s best strategy is to confess.To be clear, this preference ordering exclusively over time spent in jail isjust one way the players may interpret the situation. Suppose you and a friendwere actually arrested and the interrogator offered you a similar deal. The resultshere do not generally tell you what to do in that situation, unless you and yourfriend only cared about jail time. Perhaps your friendship is strong, and both ofyou value it more than avoiding jail time. Since confessing might destroy thefriendship, you could prefer to keep quiet if your partner kept quiet, whichchanges the ranking of your outcomes. Your preferences here are perfectlyrational. However, we do not yet have the tools to solve the corresponding game.We will reconsider these alternative sets of preferences in Lesson 1.3.Indeed, the possibility of alternative preferences highlights game theory’srole in making predictions about the world. In general, we take a three stepapproach: 1) Make assumptions.2) Do some math.3) Draw conclusions.We do steps 1 and 3 everyday. However, absent rigorous logic, someconclusions we draw may not actually follow from our assumptions. Gametheory—the math from step 2 that this book covers—provides a rigorous way ofensuring that that our conclusions follow directly from the assumptions. Thus,correct assumptions imply correct conclusions. But incorrect assumptions couldlead to ridiculous claims. As such, we must be careful (and precise!) about theassumptions we make, and we should not be surprised if our conclusions changebased on the assumptions we make.Nevertheless, for the given payoffs in the prisoner’s dilemma, we have seenan example of strict dominance. We say that a strategy x strictly dominatesstrategy y for a player if strategy x provides a greater payoff for that player thanstrategy y regardless of what the other players do. In this example, confessingstrictly dominated keeping quiet for both players. Unsurprisingly, players neveroptimally select strictly dominated strategies—by definition, a better optionalways exists regardless of what the other players do. 1.1.3: Applications of the Prisoner’s DilemmaThe prisoner’s dilemma has a number of applications. Let’s use the game toexplore optimal strategies in a number of different contexts.First, consider two states considering whether to go to war. The militarytechnology available to these countries gives the side that strikes first a largeadvantage in the fighting. In fact, the first-strike benefit is so great that eachcountry would prefer attacking the other state even if its rival plays a peacefulstrategy. However, because war destroys property and kills people, both preferremaining at peace to simultaneously declaring war.Using these preferences, we can draw up the following matrix:From this, we can see that the states most prefer attacking while the otherone plays defensively. (This is due to the first-strike advantage.) Their next bestoutcome is to maintain the peace through mutual defensive strategies. After that,they prefer declaring war simultaneously. Each state’s worst outcome is tochoose defense while the other side acts as the aggressor.We do not need to solve this game—we already have! This is the same gamefrom the previous section, except we have exchanged the labels “quiet” with“defend” and “confess” with “attack.” Thus, we know that both states attack inthis situation even though they both prefer the <defend, defend> outcome. Thefirst-strike advantages trap the states in a prisoner’s dilemma that leads to war.A similar problem exists with arms races. Imagine states mustsimultaneously choose whether to develop a new military technology.Constructing weapons is expensive but provides greater security against rivalstates. We can draw up another matrix for this scenario:Here, the states most prefer building while the other state passes. Followingthat, they prefer the <pass, pass> outcome to the <build, build> outcome; thestates maintain the same relative military strength in both these outcomes, butthey do not waste money on weaponry if they both pass. The worst possibleoutcome is for the other side to build while the original side passes. Again, wealready know the solution to this game. Both sides engage in the arms race andbuild.