Week 3: Rationality, Knowledge, and Evolution in Games
3-2 Digression: How You Played the Card Game and Addressing the Concerns About Game Theory
3-3 “Payoffs” in a Game: What Exactly Are Those Numbers?
3-4 What Does it Mean That a Player is Rational?
3-5 Domination: Strategies That Are Obviously Good or Bad
3-6 Common Knowledge of Rationality
3-7 Low rationality: What Happens if Players Are Not Very Smart?
3-8 Game Theory Under Zero-Intelligence: Biological Evolution
3-9 Fig Wasps Play a Nash Equilibrium
3-1 Digression: The Card Game Revisited
Okay, so I’m going to consider the whole spectrum of intellectual capacity of the players.
What happens if players have absolutely no intelligence? Game theory can still predict individuals’ behavior.
So before giving the series of lectures about rationality and people’s behavior, let’s try to examine the outcome of the games I asked you to play in the first week.
So in the first week, I asked you to play a simple card game with your friends.
Each player carefully chooses one card and shows it simultaneously to the opponent.
The red player wins if both choose K, or if the players choose cards with different numbers, like 1 and 3.
The black player wins if only one player chooses the king, or if the players choose cards with the same number, such as 1 and 1.
Since the rules are not symmetric, one of the players may have an advantage.
First question: who has an advantage, red or black? Second question: in particular, what is the winning rate of each player? What’s the probability of winning for red and for black?
Third question: can you say anything about the distribution of cards of each player? Those three questions are hard questions.
For the first question, which player is stronger, you can probably get some idea by inspecting the nature of the game.
The second and third questions, calculating the winning rate of each player and calculating the distribution of cards of each player, are very hard questions.
So let’s find a mixed strategy equilibrium where each player mixes those four cards with certain probabilities.
Okay, so the rules say the red player wins if both players choose K, or if different numbers are chosen.
So if both players choose K, the red player wins, and his payoff is 1, and the black player’s payoff is 0.
The red player also wins if they choose different numbers, such as 1 and 2, or 1 and 3.
In that case, too, the red player obtains payoff 1, and the black player loses, so his payoff is 0.
So your task is to determine the probability distribution over black player’s strategies, K, 1, 2, and 3.
So let’s suppose black player chooses 1 with probability p, 2 with probability q, 3 with probability r. With the remaining probability, 1 minus p minus q minus r, the black player chooses K. Your task is to determine those three numbers.
Okay, but since those three number cards, 1, 2, and 3, have a very similar role, let’s guess that the black player chooses each number card with an equal probability p.
What if you choose 1 as the red player? You win if your opponent chooses a different number card, 2 or 3.
In the equilibrium, the red player should be mixing all those cards, K, 1, 2, and 3.
Since the red player is mixing all those cards in the equilibrium, those winning probabilities should be identical, okay? They are equally good.
Choosing K wins with probability 1 minus 3p, and choosing a number card wins with probability 2p, so equating the two gives p equal to 0.2.
So the winning rate of the red player is always 0.4.
Okay? So we have found those numbers, 0.2 for each number card of the black player.
Okay, so with the remaining probability, 0.4, the black player chooses the king.
This is the equilibrium distribution of cards by a black player.
We have calculated the winning rate of red player, and that was 0.4.
Okay, so you can perform a similar exercise to determine red player’s probability distribution of K, 1, 2, and 3.
Winning rate of red player is 0.4 and, of course, the winning rate of black is the remaining probability, 0.6.
What about the distribution of cards? Well, king is played most often with probability 0.4.
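The indifference argument in this lecture can be checked mechanically. The sketch below is only an illustration: the function and variable names are made up here, and it assumes that the red player's equilibrium mix is the same as the black player's (0.4 on K and 0.2 on each number card), which is what the "similar exercise" gives under the rules as stated. It verifies that every card gives red a winning rate of 2/5 and every card gives black a winning rate of 3/5, so neither player can gain by changing cards.

```python
from fractions import Fraction

CARDS = ["K", "1", "2", "3"]


def red_wins(red_card, black_card):
    """Red wins if both play K, or both play number cards with different numbers."""
    if red_card == "K" or black_card == "K":
        return red_card == black_card  # true only when both cards are K
    return red_card != black_card


# Candidate equilibrium mix (assumed to be used by both players):
# K with probability 2/5, each number card with probability 1/5.
mix = {"K": Fraction(2, 5), "1": Fraction(1, 5), "2": Fraction(1, 5), "3": Fraction(1, 5)}

# Red's winning probability for each of red's cards against black's mix.
red_rate = {r: sum(mix[b] for b in CARDS if red_wins(r, b)) for r in CARDS}

# Black's winning probability for each of black's cards against red's mix.
black_rate = {b: sum(mix[r] for r in CARDS if not red_wins(r, b)) for b in CARDS}

print(red_rate)    # every card gives red the same rate, 2/5
print(black_rate)  # every card gives black the same rate, 3/5
```

Exact fractions are used instead of floating point numbers so that "equally good" can be tested with plain equality.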
3-2 Digression: How You Played the Card Game and Addressing the Concerns About Game Theory
In the last lecture we calculated the mixed strategy Nash equilibrium in the card game we played in the first week.
Okay, now I’m ready to explain the results of the card game you played in the first week.
Okay, so I’m going to show you a dataset. The players in this dataset were you and your friends, and actually a large number of people: 670 pairs had participated in the card game as of the 11th of February, 2015.
Actually, in this game, the black player is stronger than the red player. More precisely, the winning rate of the black player is 0.6, and the winning rate of the red player is 0.4, okay? So now I’d like to ask you to think about the accuracy of this game theoretic prediction. Those are the theoretical numbers. On the other hand, I have the real winning rates calculated from the play of the game in the first week, and I’m going to give you three possibilities about the difference between the actual data and the equilibrium prediction: number one, number two, and number three.
You should think about which one fits your expectation about the accuracy of this game theoretic prediction. So let’s talk about the red player.
The winning rate of the red player should be 0.4 according to the game theoretic prediction. Possibility number one says the game theoretic prediction works so-so.
Possibility number two says the game theoretic prediction works reasonably well, so the actual winning rate of the red player is in the range of 0.35 to 0.45.
The last possibility, number three, says that the game theoretic prediction works amazingly well, so the actual winning rate of the red player is in the range of 0.39 to 0.4.
Okay, number one, number two, number three. Please think about which one fits your expectation about the accuracy of the game theoretic prediction.
So the game theoretic prediction worked amazingly well in terms of the winning rates.
Okay, again, the game theoretic prediction worked amazingly well, but I noticed that in this particular dataset card number one was played with a larger probability than the equilibrium prediction: 0.24, compared to the equilibrium prediction of 0.20.
In the video with the game instructions, card number one was phrased as ace, and usually ace is a pretty strong card in many card games, so maybe you were inclined to play this strong card, ace. But otherwise the prediction worked pretty well.
I know the result of Barry O’Neill’s original experiment in 1987, and in the past ten years I have been conducting this card game with my undergraduate students in my game theory class at the University of Tokyo. I also had a few occasions on which I let high school students play this game. So this is my dataset, and let me show you what I have found about the winning rate.
Okay, so the first card game was conducted in 2004 in my class, and again, the outcome was amazingly close to the Nash equilibrium prediction. I ran similar experiments again and again, and every time the outcome was amazingly close to the Nash equilibrium prediction in terms of winning rates.
Okay, so mathematical formulas have proven useful for predicting natural phenomena like a falling ball, but now game theory is trying to apply mathematical formulas to predict people’s behavior, and there are a few common and valid concerns about predicting people’s behavior by a mathematical model, okay? I explained those concerns in the first week, and there were three.
Concern number one about using mathematics to predict people’s behavior says that we have free will, and free will defeats any attempt to predict human behavior by a mathematical model or a mathematical formula. The second concern: the subject of game theory is humans, okay? Humans behave in a certain way because they have some intentions, so ultimately you can always ask, why did you do that, and then you can find out the reason.
We can just collect facts, and we can just conduct interviews to find out what was happening, and indeed, this was the way we conducted social science research before the invention of game theory.
We don’t need any mathematical model in the social sciences. And the third concern says, I’ve never heard that game theory works. So I’m going to address all those valid concerns about game theory by means of this card game.
Well, I guess lots of you have a hard time articulating why you played the way you did. You used your instinct or intuition to play this game, and the outcome was amazingly close to the Nash equilibrium, okay?
So game theory can uncover the mechanism operating behind your instinctive behavior. Sometimes you take some action out of intuition or for some reason, but oftentimes you have a hard time articulating why you did it, and some mechanism may be operating behind your behavior.
Game theory is a very important tool to find out the mechanism operating behind people’s behavior, okay? Just asking people why they do this doesn’t really work as a research program in this particular card game. Maybe you can find out that the distribution of the cards is this way, but just by conducting interviews with each individual you can never find out why people produce this distribution.
3-3 “Payoffs” in a Game: What Exactly Are Those Numbers?
I’m ready to examine the relationship between players’ intelligence and their behavior.
In games, players are trying to maximize their payoffs, and the payoffs are often given by a payoff table, where you see those numbers, three, two, and zero.
I’m going to explain exactly what those numbers mean.
That’s the starting point in examining the relationship between the intelligence of players and the outcome.
So what are the payoffs in a game, those numbers three, two, and zero?
What those numbers represent is not so clear. So let me give you a precise explanation of what those numbers are, or what those numbers should be.
So when there is no uncertainty, it’s very easy to specify player’s payoff.
You can just assign larger numbers for better outcomes.
Suppose there are three outcomes: the best outcome, the middle outcome, and the worst outcome for you.
Okay? And your task is to assign a payoff to the best outcome, a payoff to the middle outcome, and a payoff to the worst outcome.
Well, you can assign larger numbers to better outcomes, so three, two, and zero is one possibility.
Any assignment of numbers is fine, as long as you assign larger numbers to better outcomes.
Why? It doesn’t matter which representation you choose. Let’s suppose 20, 19, and 0. That is another way to assign larger numbers to better outcomes.
As long as the first number is larger than the second, and as long as the second number is larger than the third, payoff maximization means choosing the better outcome.
So to determine a player’s payoffs in a situation without any uncertainty, you can just assign larger numbers to better outcomes for the player.
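The claim that any order-preserving assignment of numbers leads to the same choices can be illustrated with a few lines of code. This is a minimal sketch with made-up names: it takes the two assignments mentioned in the lecture (3, 2, 0 and 20, 19, 0) and checks that a payoff maximizer picks the same outcome from every possible menu of available outcomes.

```python
from itertools import combinations

outcomes = ["best", "middle", "worst"]

# Two different assignments that both give larger numbers to better outcomes.
payoff_a = {"best": 3, "middle": 2, "worst": 0}
payoff_b = {"best": 20, "middle": 19, "worst": 0}


def choose(available, payoff):
    """Pick the available outcome with the largest payoff."""
    return max(available, key=lambda outcome: payoff[outcome])


# Every non-empty menu of outcomes the player might face.
menus = [list(c) for n in (1, 2, 3) for c in combinations(outcomes, n)]

same_choices = all(choose(m, payoff_a) == choose(m, payoff_b) for m in menus)
print(same_choices)  # True: both assignments describe the same behavior
```

This holds only when there is no uncertainty; once lotteries enter, the sizes of the numbers start to matter, not just their order.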
A player may like or dislike risk, and you have to somehow represent those two different situations.
So let’s suppose you have random events in a game, and suppose a player is facing a choice between a lottery and some other outcome.
If the player chooses the lottery, with probability one-half he gets $10, and with the other probability one-half he gets zero.
The outcome of the lottery is either $10 or $0, and the other choice is $5 for sure.
So risk-averse players choose the second choice.
Okay? First, you assume that players get some utility from each outcome.
So if you choose this lottery, the outcome will be either $10 or $0. You get some utility out of $10, okay? And if the outcome is $0, you get some utility out of $0.
Okay? And then, after assigning those numbers, the utilities, you should assume that players are maximizing expected utility.
If the player chooses this lottery, half of the time he gets the utility from $10, u of 10, and half of the time he gets the utility from $0, u of 0.
With the other choice, he gets the satisfaction or utility from $5 for sure, okay? Now suppose this number is smaller than the average of the utilities from $10 and $0.
So if $5 for sure has a small utility, then you choose the lottery, because the lottery has a larger expected utility.
Okay? So when uncertainty is present in a game situation, a player’s payoff is equal to the utility coming from the outcome, and you assume that players are maximizing expected utility.
By specifying those numbers, the utilities coming from outcomes, you can describe whether a player likes or dislikes risk.
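To make the lottery comparison concrete, here is a small sketch. The two utility functions are assumptions chosen purely for illustration (a square root for a player who dislikes risk, a square for a player who likes risk); the lecture does not specify any particular utility function. The code compares the expected utility of the lottery with the utility of $5 for sure.

```python
import math

# The lottery from the lecture: $10 with probability 1/2, $0 with probability 1/2.
lottery = [(0.5, 10.0), (0.5, 0.0)]
sure_amount = 5.0


def expected_utility(lottery, u):
    """Average of the utilities of the outcomes, weighted by their probabilities."""
    return sum(p * u(x) for p, x in lottery)


def prefers_lottery(u):
    """True if the lottery has a larger expected utility than the sure amount."""
    return expected_utility(lottery, u) > u(sure_amount)


def risk_averse(x):
    return math.sqrt(x)  # concave utility (illustrative assumption)


def risk_loving(x):
    return x ** 2        # convex utility (illustrative assumption)


print(prefers_lottery(risk_averse))  # False: takes $5 for sure
print(prefers_lottery(risk_loving))  # True: takes the lottery
```

The lottery and the sure amount have the same expected money, $5, so the difference in behavior comes entirely from the shape of the utility function.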
3-4 What Does it Mean That a Player is Rational?
What is the relationship between players’ rationality and the outcome of a game? Okay, to answer this question, you have to make it clear what it means that a player is rational.
The same term, rationality, can mean different things, okay? So I’m going to present a clear-cut definition of rationality that is widely adopted by game theorists and economists.
What does it mean that Player A is rational when he is playing a roulette game? What does it mean that he is rational in a horse race? And what does it mean that he is rational in a poker game, okay? All of those games look fairly similar, roulette and horse race and poker.
So what does it mean that Mr. A is rational in a roulette game? Well, the behavior of the roulette wheel is objectively given.
That’s what I mean by rational behavior in a roulette game.
So the roulette game is very simple in terms of rational behavior.
In a horse race, if he thinks horse one is very likely to win or horse two is very unlikely to win, he can summarize his assessment by means of subjective probabilities.
Okay? So the first step of rational decision-making in a horse race is to assign subjective probabilities to the various outcomes, and the rest is similar.
Okay, so the formal definition of rationality. There could be many different definitions, but this is the one that is widely adopted by game theorists.
The definition says the following: a player or an agent is rational if he or she is aware of all possible events.
So let’s try to apply this definition to human behavior, a game of poker between human and human.
Okay, what does it mean that Mr. A is rational? Say the players are Andy and Becky. For Andy’s rationality, well, Becky’s behavior is not given by objective probabilities.
He should assign subjective probabilities to Becky’s various possible behaviors, okay? And then, based on his beliefs, the subjective probabilities, he maximizes his expected utility.
Okay, so since Becky is also rational, she is assigning some subjective probabilities over Andy’s possible behavior.
So to analyze rational behavior in human interaction, you need to think about Andy’s belief about Becky’s belief about Andy’s belief about Becky’s belief, and so on and so forth.
Okay, applying the definition of rationality to human interaction leads to the infinite regress problem.
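The two steps described in this lecture (assign subjective probabilities to the opponent's behavior, then maximize expected utility) can be sketched as follows. Everything concrete here is hypothetical: the action names, Andy's beliefs about Becky, and the utility numbers are invented for illustration and do not come from the lecture.

```python
from fractions import Fraction

# Andy's subjective probabilities over Becky's behavior (hypothetical numbers).
belief = {"fold": Fraction(7, 10), "call": Fraction(3, 10)}

# Andy's utility for each of his actions against each action of Becky
# (hypothetical numbers).
utility = {
    "bluff": {"fold": 10, "call": -20},
    "check": {"fold": 0, "call": 0},
}


def expected_utility(action):
    """Andy's expected utility of an action under his subjective beliefs."""
    return sum(belief[b] * utility[action][b] for b in belief)


# A rational Andy picks the action with the largest expected utility.
best_action = max(utility, key=expected_utility)
print(best_action, expected_utility(best_action))
```

The sketch takes Andy's beliefs as given. The infinite regress arises when you ask where those beliefs come from, because Becky's behavior depends on her own beliefs about Andy.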
3-5 Domination: Strategies That Are Obviously Good or Bad
To examine the relationship between players’ rationality and their behavior in a game, it’s very important to define obviously good strategies and obviously bad strategies, okay? And such things can be formulated by a concept called domination.
Suppose your opponent is going to cooperate, okay? Which is better, cooperation or defection, for you, as player one? Okay, if you switch from cooperation to defection, supposing that the opponent, player two, is cooperating, your payoff increases.
Okay, so what happens if your opponent is going to defect? Again, if you switch from cooperation to defection, your payoff increases.
The payoff increases from minus 15 to minus 10, okay? So defection is always the best strategy for you in the Prisoner’s Dilemma game, independent of what your opponent is going to do, okay? If this is true, we say that defection dominates cooperation. So defection is obviously a good strategy.
Okay? So let me just give you the definition of domination by means of a simple picture.
Okay, the blue line is above the green line, so that means no matter what your opponents are going to do, a is strictly better than b. If this is true, we say that strategy a, the blue strategy, strictly dominates the green strategy, strategy b. Is it clear? Okay, sometimes you may get this picture.
Okay, one implication, which is going to be used repeatedly in the following lectures, is the following.
So b is strictly dominated by a, and the implication of rationality is that a rational player never chooses a strictly dominated strategy, okay? Maybe you have lots of strategies, and maybe you have even better strategies than a. But obviously b is a bad choice: rather than choosing b, you could always choose a and get a better payoff no matter what the other players are going to do.
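The definition of strict domination translates directly into a check over all of the opponent's strategies. In the sketch below, the payoffs minus 15 and minus 10 are the ones quoted in the lecture for the case where the opponent defects; the payoffs minus 1 and 0 for the case where the opponent cooperates are assumptions filled in for illustration, since the transcript does not state them.

```python
# payoff[my_strategy][opponent_strategy] for player one in a Prisoner's Dilemma.
# "C" = cooperate, "D" = defect.
payoff = {
    "C": {"C": -1, "D": -15},  # -1 is an assumed value; -15 is from the lecture
    "D": {"C": 0, "D": -10},   # 0 is an assumed value; -10 is from the lecture
}


def strictly_dominates(a, b, payoff):
    """True if strategy a gives a strictly larger payoff than b
    against every strategy of the opponent."""
    return all(payoff[a][opp] > payoff[b][opp] for opp in payoff[a])


print(strictly_dominates("D", "C", payoff))  # True: defection dominates cooperation
print(strictly_dominates("C", "D", payoff))  # False
```

The same function works for any finite payoff table written in this nested form, so it can be reused to test which strategies a rational player would never choose.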
Okay, so let me just give you an example of a strictly dominated strategy.
So you have two ice cream stands, A and B, and they are simultaneously choosing their locations on a street, okay? But to make my point, I’m going to change the game a little bit and assume that there are finitely many slots for the possible locations of ice cream vendors A and B.
Okay, I’m going to assume that A and B can occupy exactly the same slot, because the street has two sides.
Player B can have its ice cream stand on the other side, okay? So A and B can occupy the same slot; that’s a possibility.
Again, customers are uniformly distributed over this street, and each customer goes to the closest ice cream stand. If two ice cream stands are located at exactly the same slot, they split the customers equally, okay? And the payoff to each ice cream vendor is the number of customers, okay? This is a location game with finitely many possible locations.
Okay, so I’m going to show that location one strictly dominates location zero.
Okay, so the relative desirability of zero and one depends on where your opponent B is located.
Suppose B is at the end point zero. If you choose zero as the location, it’s a tied situation, and half of the customers come to you, vendor A. Okay, but if you switch from zero to one, almost all customers come to you.
Okay, so given that B is at the end point zero, one is better than zero for player A. Okay, another possibility.
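The case-by-case comparison of locations zero and one can be carried out for every possible location of B at once. The sketch below makes some modeling assumptions that the transcript leaves implicit: the slots are the integers 0 to 100, customers are spread uniformly over the interval from 0 to 100, and the number of customers is measured as the length of street served. The function name is made up for this example.

```python
def customers(a, b, length=100):
    """Customers served by the vendor at slot a when the rival is at slot b.

    Assumes customers are uniform on [0, length] and walk to the closest stand;
    a tie in location splits the market equally.
    """
    if a == b:
        return length / 2
    midpoint = (a + b) / 2
    # The vendor on the left serves everyone up to the midpoint,
    # the vendor on the right serves everyone beyond it.
    return midpoint if a < b else length - midpoint


locations = range(101)  # slots 0, 1, ..., 100 (assumed grid)

# Location 1 strictly dominates location 0 if it is strictly better
# against every possible location of the opponent.
one_dominates_zero = all(customers(1, b) > customers(0, b) for b in locations)

print(customers(0, 0), customers(1, 0))  # 50.0 99.5
print(one_dominates_zero)                # True
```

The first printed line matches the case discussed in the lecture: against B at zero, choosing zero gives half of the customers, while choosing one gives almost all of them.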
3-6 Common Knowledge of Rationality
Now, I’m going to examine what happens if players mutually know that they are rational, okay? And a surprising outcome comes out in the location game.
Situation 1: players are rational, but they may not know that other players are rational.
Situation 2: players are rational, and moreover, players know that they are rational.
So I know that you are rational, and you know that I am rational.
Let’s consider the third situation, where players are not only rational and know that they are rational.
So what is the difference between 2 and 3? Intuitively, you know, 2 and 3 don’t make any difference.
Players know that they know that they know that they are rational, okay? Now, the question I would like to pose is the following.
Just by using rational calculation or rational reasoning, I’m going to show you that they must play the Nash equilibrium.
So now let’s suppose the other player, Becky, is also rational.
Let’s suppose she knows that Andy is rational.
Okay, so if Andy is rational, he avoids the end points 0 and 100.
Okay? So now suppose that both players are rational, and suppose that Becky knows that Andy is rational.
Since Becky knows Andy is rational, she knows that Andy never chooses 0 or 100.
Okay? So after excluding the possibility of 0 and 100 by Andy, because Andy is rational, 2 dominates 1, okay? So the conclusion is that the next end points, 1 and 99, are obviously bad choices for Becky, given that Andy never chooses the end points 0 and 100.
If player B knows that A is rational, she can exclude the second end points, 1 and 99.
Okay, so next let’s examine what happens if Andy knows that Becky knows that Andy is rational.
Both players know that the other player is rational, so they won’t choose the second end points, 1 or 99.
Now Andy further knows that Becky knows that Andy is rational.
So the conclusion is, well, since Becky knows that I am rational, she won’t choose those two end points here and those two end points here.
Okay, so what’s the conclusion? Well, suppose players know that they know that they know that they know that they are rational.
If the word know appears many times here, then you can exclude those bad choices one by one.
Okay, so let’s examine how many times the word know should appear to reach the conclusion that only the middle point is chosen.
Okay, so if player Andy is rational, he never chooses location 0.
If he knows that Becky is rational, so if this word know appears once, then you can exclude point 1.
Okay, and since you have 49 points to be excluded, say, from 1 to 49, this word know should appear 49 times.
So if players know that they know that they know that they know, and this word know appears 49 times, the conclusion is that they must play the Nash equilibrium.
Okay, if this statement is true, if players know that they know that they know that they are rational, and this word know appears many times, then they should play the Nash equilibrium.
Players play the Nash equilibrium in the location game.
Common knowledge of rationality means a situation where players know that they know that they know that they are rational.
This word, know, can appear any number of times.
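The step-by-step exclusion of end points can be reproduced by a short program that repeatedly removes strictly dominated locations. This is a sketch under assumptions not spelled out in the transcript: slots are the integers 0 to 100, customers are uniform on the interval from 0 to 100, and only domination by another pure location is checked. Since the game is symmetric, one list of surviving locations is kept for both players.

```python
def customers(a, b, length=100):
    """Customers served at slot a against a rival at slot b (uniform on [0, length])."""
    if a == b:
        return length / 2
    midpoint = (a + b) / 2
    return midpoint if a < b else length - midpoint


def is_strictly_dominated(a, alive):
    """True if some other surviving location beats a against every surviving rival location."""
    return any(
        all(customers(other, b) > customers(a, b) for b in alive)
        for other in alive
        if other != a
    )


alive = list(range(101))  # both players start with all slots 0..100
rounds = 0
while True:
    dominated = [a for a in alive if is_strictly_dominated(a, alive)]
    if not dominated:
        break
    alive = [a for a in alive if a not in dominated]
    rounds += 1

print(alive, rounds)  # [50] 50
```

Under these assumptions each round removes exactly the two current end points. The first round uses rationality alone (0 and 100 go), and each of the remaining 49 rounds corresponds to one more appearance of the word know, which matches the count of 49 in the lecture.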
3-7 Low rationality: What Happens if Players Are Not Very Smart?
So we have been examining the relationship between players’ intelligence and the outcome of the game.
The previous lecture showed that if players are hyper-rational, okay, if they have an unlimited ability to conduct extremely sophisticated reasoning, and if players’ rationality is completely understood, then sometimes they can play the Nash equilibrium out of rational calculation.
Okay? People may not be so rational, and people may not have the ability to conduct sophisticated reasoning, but people are not completely stupid.
What happens if players are not very smart? That’s the question we are going to examine.
Okay, well, what happens if players are not very smart? Well, the outcome may be chaotic.
Okay, so even if at the beginning people are confused and their behavior is chaotic and maybe they commit similar mistakes, over time players may find better ways to play a game.
Okay? They may do so by accumulating experience in the same or a similar game, or by observing how others play in the same game or a similar game.
Okay? So what you can do is to try various actions and to see which one is better for you.
In the last lecture, I showed that, through extremely sophisticated reasoning, hyper-rational players in the location game with finitely many locations eventually reach the conclusion that they should play the Nash equilibrium.
Okay, so I showed that the endpoints 0 and 100 are obviously bad choices.
You know, Andy, player A, may not be rational enough to find out that 0 and 100 are obviously bad locations.
Okay, but eventually he finds out that the endpoints are bad locations.
Okay, so again, she, player B, is going to try locations here and there.
Okay? So eventually she starts avoiding those four endpoints.
Eventually players learn to play a Nash equilibrium.
Okay, so by means of this trial and error adjustment process, A and B are likely to eventually choose the Nash equilibrium point, the middle point.
In some games, like location games, repeated elimination of obviously bad strategies, that is, repeated elimination of strictly dominated strategies, leads to a Nash equilibrium.
Okay? So if repeated elimination of strictly dominated strategies leads to a Nash equilibrium, the Nash equilibrium is expected to emerge under a wide range of intellectual capacities of players.
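One very crude way to picture an adjustment process of this kind is to let the two vendors take turns moving to the location that serves the most customers against the rival's current location. This is not the trial and error process of the lecture, where players experiment and gradually learn to avoid bad locations; it is a simpler best-reply dynamic swapped in for illustration, with the same assumed grid of slots 0 to 100 and uniform customers, and an arbitrary starting point at the two ends of the street.

```python
def customers(a, b, length=100):
    """Customers served at slot a against a rival at slot b (uniform on [0, length])."""
    if a == b:
        return length / 2
    midpoint = (a + b) / 2
    return midpoint if a < b else length - midpoint


def best_reply(rival, locations):
    """The location that serves the most customers against the rival's current slot."""
    return max(locations, key=lambda a: customers(a, rival))


locations = range(101)
a, b = 0, 100            # arbitrary starting locations (assumption)
history = [(a, b)]

for _ in range(200):
    a = best_reply(b, locations)   # A adjusts to B's current location
    b = best_reply(a, locations)   # then B adjusts to A's new location
    history.append((a, b))
    if history[-1] == history[-2]:
        break                      # nobody wants to move any more

print(history[:3], history[-1])
```

Each vendor keeps stepping just inside the rival, so the pair moves toward the center and stops at the middle point, where neither can gain by relocating.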
3-8 Game Theory Under Zero-Intelligence: Biological Evolution
Still, you can apply the concepts of game theory to get fruitful predictions.
Okay? So we have been examining the whole spectrum of possible intellectual capacities of players.
I’m going to show that game theory is useful to predict their behavior.
In the last lecture I showed that even if humans are not so rational, over time humans may find better ways to play the game.
Okay? Human players can switch from one strategy to a better strategy.
Okay? So if you have a successful strategy and a not so successful strategy, the successful strategy eventually survives.
Okay, so you can apply the idea of game theory to biological evolution.
The study of evolution by means of game theory is called evolutionary game theory.
Evolutionary game theory formalizes the behavior of animals, plants, and insects in terms of evolution.
Then you have to specify the game that is played by insects or animals and plants.
Your task is to specify the strategies in the game of biological evolution.
So what are the strategies in evolutionary game theory? Well, maybe a strategy represents a physical characteristic of an individual.
Or maybe a strategy is a behavior, like aggression or defense.
Some of those characteristics could be determined by genes, and evolutionary game theory concerns physical characteristics or behavior directly programmed by genes.
Okay, so strategies in evolutionary game theory are genetically programmed characteristics of an individual, such as body size or behavior.
A strategy means a genetically programmed characteristic, and it’s determined by a gene.
A strategy can be identified with a gene.
A player is equal to a gene in evolutionary game theory.
So what is the payoff of a player in an evolutionary game? Okay.
Also, there is a one-to-one correspondence between genes and strategies.
So what is the payoff to a player, or the payoff to a strategy? Evolutionary game theory says that it’s the number of offspring.
Okay? So in evolutionary game theory, payoff is equal to the number of offspring, and sometimes it’s called evolutionary fitness.
So let’s examine the situation where there are three strategies initially:
strategies A, B, and C, or genes A, B, and C, okay? Three genes are present originally.
What is the payoff to, say, strategy A or gene A? Well, the payoff to A depends on what others do, okay? So the number of offspring of A may depend on the distribution of the strategies in the society.
Okay, so A’s payoff, the number of offspring, is determined by the distribution of the strategies in the society.
So for the sake of argument, let’s assume that strategy A has the largest number of offspring
as long as strategies A, B, and C are present in the society.
Originally, you had three strategies, A, B, and C. But maybe there could be lots of other strategies.
Other strategies may be introduced in this society by means of mutations.
So let’s suppose mutation happens in this society and a new strategy, D, comes in.
So no strategy can invade the society by means of mutation.
In other words, no profitable deviation to other strategies is present.
The outcome of biological evolution is Nash Equilibrium of a game played by genes.
3-9 Fig Wasps Play a Nash Equilibrium
The other type has wings, okay? So, male fig wasps have two possible strategies determined by genes.
One is the staying strategy, because males without wings stay in the fig.
A certain fraction of females fly out very quickly, okay? So consider the relative frequency of those females flying out quickly.
There may be lots of females flying out, or lots of females staying.
What kind of evolutionary forces operate on the selection of the better strategy for the males?
The staying gene produces staying males, and the flying gene, you know, provides males with wings.
Let’s suppose that the flying males are rare in the original situation.
A certain fraction of females go out of the fig quickly, and a certain fraction of females stay.
If there are lots of staying males, there is fierce competition to win the females’ attention here, okay?
So staying males have a very small chance of mating with a female.
A fig is overcrowded if there are lots of staying males.
On the other hand, for a flying male, well, this guy is surrounded by lots of beautiful ladies.
So when the fraction of flying males is small, flying males have a better chance of getting lots of offspring.
So on the horizontal axis, I’m going to measure the fraction of flying males.
It’s in between zero and one, and I just explained what happens if the fraction of flying males is small.
Okay? The vertical axis measures the fitness of each strategy, flying and staying, that is, the number of offspring.
So as I explained, when the fraction of flying males is small, flying males have a better chance of getting many offspring.
Okay? When flying males are rare, flying males have a higher fitness than staying males.
The fraction of flying males is going to increase.
Okay, so what happens if there are lots of flying males?
If there are lots of flying males, there’s fierce competition here, and those flying males have a very low chance of getting a female’s attention.
For this staying guy here, it’s like heaven, you know? And this staying male has a much higher chance of getting offspring.
When the fraction of flying males is large, flying males have fewer offspring.
When the fraction is high, the fitness of flying males is small and the fitness of staying males is larger.
The fraction of flying males is going to decrease.
Okay? So by means of evolutionary selection, the fraction of flying males decreases here.
The evolutionary force leads to a point where both kinds of males, flying males and staying males, have an equal chance of getting offspring.
So at this stationary point, no male can increase its payoff by switching to the other strategy.
It’s a kind of Nash equilibrium, and it should be the outcome of biological evolution for the fig wasp males.
Okay, so this story explains why you have two kinds of males.
There could be other possible explanations for why there are two different kinds of males.
Well, if you examine this situation carefully, the Nash equilibrium provides you with a precise prediction about the fraction of flying males.
Okay, so we first considered the situation where the fraction of flying males is small.
In this situation, a flying male is surrounded by lots of females.
Okay? So now flying males are going to have lots of offspring, and the fractions of flying males and staying males are going to change.
A certain fraction of females are flying out, say three out of five.
In equilibrium, the fraction of flying males is also three out of five; they are equal.
The fraction of females quickly flying out of the fig should be equal to the fraction of flying males.
Here we have the fraction of flying males, and here we measure the fraction of females quickly flying out of the fig.
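The prediction that the fraction of flying males equals the fraction of females flying out can be illustrated with a toy selection dynamic. The functional forms below are assumptions made for this sketch, not taken from the lecture: flying males are assumed to share equally the females that fly out, staying males share equally the females that stay, and the fraction of flying males is nudged up or down in proportion to the fitness difference. The value three out of five for the females is the example number from the lecture; the starting fraction and the adjustment rate are arbitrary.

```python
def fitness(x, f):
    """Expected matings per male, as (flying, staying).

    x: fraction of males that fly, f: fraction of females that fly out quickly.
    Assumes each group of males shares 'its' females equally.
    """
    flying = f / x
    staying = (1 - f) / (1 - x)
    return flying, staying


f = 3 / 5    # fraction of females flying out quickly (example from the lecture)
x = 0.05     # flying males are rare at the start (arbitrary starting value)
rate = 0.2   # speed of adjustment per generation (arbitrary)

for _ in range(200):
    w_fly, w_stay = fitness(x, f)
    # The strategy with the higher fitness spreads; the factor x * (1 - x)
    # keeps the fraction strictly between 0 and 1.
    x += rate * x * (1 - x) * (w_fly - w_stay)

w_fly, w_stay = fitness(x, f)
print(round(x, 6), round(w_fly, 6), round(w_stay, 6))
```

Starting from rare flying males, the fraction rises until both kinds of males have the same fitness, which in this toy model happens exactly when the fraction of flying males equals the fraction of females flying out.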