This is intended as a three-part sequence. Part two will go over my strategy. Part three will reveal the results and discuss some implications.

In the same class in which we later played The Darwin Game, we played a less complex game called Simplified Poker. As in The Darwin Game, we were given the rules and asked to submit instructions for a computer program that would play the game, and the professor would then code our programs for us.

The rules of Simplified Poker are as follows:

Game is played with a 3-card deck, with the cards labeled 1, 2 and 3.

Each hand, the players alternate who goes first, each player antes one chip and is dealt one card.

The first player can bet one chip, or check.

If the first player bets, the second player can either call the one chip bet, or fold.

If the first player checks, the second player can either also check, or can bet. If the second player bets, the first player can either call the one chip bet, or fold.

There is at most one bet per hand, as neither player is allowed to raise.

If either player folds, the other wins the pot of 2 chips and takes back their 1 chip bet. Neither card is shown. If neither player folds – either both players check, or there is a bet and a call – then both cards are revealed and the player with the higher card takes all chips.

In the class, all programs would play a round robin with all other programs, with 50 hands per match. You know the results of previous hands during the match, but not the results of hands from other matches. Your goal is to maximize the average number of chips won over all rounds – note that how many opponents you beat does not matter, only the number of chips won.

The game is simple. A lot, but far from all, of your decisions are forced. There’s no weird trick, but optimal play still isn’t obvious. I’ll pause here to allow and encourage thinking about what strategy you’d submit.

(Edited to explicitly note that you have knowledge of earlier hands in the same match)

Next: Simplified Poker Strategy


If you’re first, there’s one binary decision to make, plus if you check and the other person bets, you have another binary decision to make. If you’re second, there are two possible binary decisions you have to make. There are three possible cards. For each card, there are 4 possible strategies for being second, and 3 possible strategies for being first (bet, check+bet, check+fold) for a total of 12 possibilities.

So, the total number of strategies is 12^3=1728. Should be easy and quick enough to code them all up and run a round robin among all of them for 50 rounds (150 million games played).
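As a quick check of that count, here is a minimal sketch that enumerates the per-card options (three plans when acting first, four when acting second) and multiplies them out. The option labels are illustrative names only, not anything from the actual program.

```python
from itertools import product

# Per card: 3 plans when acting first, 4 plans when acting second.
FIRST = ["bet", "check+call", "check+fold"]
SECOND = [(vs_bet, vs_check)
          for vs_bet in ("call", "fold")
          for vs_check in ("bet", "check")]

per_card = list(product(FIRST, SECOND))         # 12 plans for a single card
strategies = list(product(per_card, repeat=3))  # one plan for each of the 3 cards

n = len(strategies)
hands = n * n * 50  # every ordered pair of strategies plays 50 hands
print(len(per_card), n, hands)
```

The last number comes out just under 150 million, which matches the rough figure quoted above.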

Can probably separate first-player strategy from second-player and cut down number of games needing simulation but easy enough to do all of them.

unless we’re able to take previous behavior into account? What exactly is the input into the program?

Yes, you know the outcome of previous rounds, if that was unclear.

Also, being good against random decisions is pretty different from good against the field you expect…

Going to simulate what happens when taking out strictly dominated strategies, although it surprised me that several of those managed to win in a round-robin of all strategies.

Process took 7 minutes to run.

Top strategy, with a total of 1342 chips at the end (or about 0.015 chip gain per game played), is:

If card is 1: check if you’re first, bet if you’re second and the other guy checked, fold if they bet

If card is 2: bet if first, fold if second and other guy bets, bet if other guy checks.

If card is 3: check if first, call if they bet, fold if second and other guy bets (!!!), bet if other guy checks.

Five out of the top ten strategies did the “fold if other guy bets and you have 3” thing, which seems like it should be dominated by not doing that. My conclusion is that over 150 million games (or 86400 per strategy as each player), luck is a far more significant factor than skill.

Going to run several times and see if the same ones win.

Here’s my actual game code. Strategies are encoded as 12-digit strings of 1s and 0s, where 1 is bet/call and 0 is check/fold. There are 4 digits per card: the first 2 of each set of 4 are the first player’s two decisions (whether to bet, and whether to call after checking), and the last 2 are the second player’s decisions if the first player bets or checks, respectively.

    import random

    def run_game(i, j):
        # i is the first player's 12-character strategy string, j the second player's
        a, b = 0, 0
        for game in range(0, 50):
            # deal two different cards from the 3-card deck
            acard = random.randint(1, 3)
            bcard = random.randint(1, 3)
            while acard == bcard:
                bcard = random.randint(1, 3)
            # NOTE: i and j hold the characters '0'/'1', so the comparisons with the
            # number 1 below are never true. This is the bug found later in the thread.
            if i[acard*4-4] == 1:  # first player bets
                if j[bcard*4-2] == 1:  # second player calls bet
                    if acard > bcard:  # a wins 2, b loses 2
                        a += 2
                        b -= 2
                        continue
                    if bcard > acard:  # b wins 2, a loses 2
                        b += 2
                        a -= 2
                        continue
                else:  # second player folds, a wins 1, b loses 1
                    a += 1
                    b -= 1
                    continue
            else:  # first player checks
                if j[bcard*4-1] == 1:  # second player bets
                    if i[acard*4-3] == 1:  # first player calls
                        if acard > bcard:  # a wins 2, b loses 2
                            a += 2
                            b -= 2
                            continue
                        if bcard > acard:  # b wins 2, a loses 2
                            b += 2
                            a -= 2
                            continue
                    else:  # first player folds, b wins 1, a loses 1
                        b += 1
                        a -= 1
                else:  # second player checks
                    if acard > bcard:  # a wins 1, b loses 1
                        a += 1
                        b -= 1
                        continue
                    if bcard > acard:  # b wins 1, a loses 1
                        b += 1
                        a -= 1
                        continue
        return (a, b)

Sounds like you have a bug somewhere. If you’re folding 3s, it shouldn’t take long to get punished for that pretty bad.

1. Could you post the code somewhere where whitespace is kept? e.g. pastebin, github, etc. Unindented python code is pretty garbage to try and understand.

2. There’s no reason to randomise the draws in comparing these strategies. Just loop through all the permutations of drawn cards and going first/second, end result will be same as doing big # of random deals.

3. Knowing history & some abusable bots going into the pool changes strategy immensely, but I’m still interested in the proper no-history optimal play.

Yup – was comparing the string ‘1’ with the number 1 which made everyone check all the time.

running correct code now:

Top two results have 81064 and 81753 points, after that best is 72493 so I declare those 2 the optima:

100110110111

100110111011

If 1: first player, bet, second player fold/bet depending on if first bet.

If 2: first player, bet, second player call/bet

If 3: first player: one strategy bets, other one checks, but calls if second bets. second player: call/bet.

To sum up: always bet if first (that strategy had a small edge over the other). Always bet if second unless you have a 1 and the other guy bet. Can also be summarized as “always call/bet unless betting is dominated by not betting” (which only happens if they bet and you know you will lose, so no bluffing, which only happens if you have 1 and they already bet).

Another run had the same top 2 in the same order.

Now that we’ve solved the static problem, what strategies can improve on this with the ability to look at prior hands? Note that each strategy played 86400 games as each player, so the top ones are earning about 0.47 chips per game.

I guess the next step is taking out strategies that are strictly dominated and seeing how much of an advantage this strategy gives against whatever’s left.

@h6

https://pastebin.com/cdb65iX2 click download to get it with correct syntax.

Yeah I know it’s not optimal, but it runs in 6-7 minutes and is now correct after fixing the bug, good enough.

Note that if you’re playing against this strategy, you can infer nothing about their cards because they always bet.

I did a special tournament where every strategy just played against this one 100000 times.

A couple strategies beat it and did so consistently.

One such strategy: 000101100111

On 1, only bets as player 2 if player 1 doesn’t bet

On 2, checks and calls as player 1. Calls or checks as player 2.

On 3, checks and calls as player 1. Call/bet as player 2.

The possible games between these two (optimal strategy of always betting except with 1 after the other bet), calling that strategy a and the one in this post b:

a first:

1,2: b wins 2

1,3: b wins 2

2,3: b wins 2

2,1: b folds, a wins 1

3,1: b folds, a wins 1

3,2: a wins 2

b first:

1,2: b folds, a wins 1

1,3: b folds, a wins 1

2,3: a wins 2

2,1: b wins 2

3,1: b wins 2

3,2: b wins 2

Total: b wins 12 and a wins 8. Each also loses what the other wins, so b is ahead 4 and a is behind 4 after 12 games, or b gains 0.33 chips per game.
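The tally above can be reproduced by enumerating the six deals in both seat orders rather than dealing at random. This is a minimal sketch, assuming the 12-digit encoding described earlier in the thread; `play_hand` is a hypothetical helper written for illustration, not the posted tournament code.

```python
from itertools import permutations

def play_hand(first, second, card1, card2):
    """Net chips won by the first player in one hand.

    Each card owns 4 characters of the strategy string:
    [first bets, first calls after checking, second calls a bet, second bets after a check].
    """
    f = first[(card1 - 1) * 4:(card1 - 1) * 4 + 4]
    s = second[(card2 - 1) * 4:(card2 - 1) * 4 + 4]
    sign = 1 if card1 > card2 else -1
    if f[0] == "1":                              # first player bets
        return 2 * sign if s[2] == "1" else 1    # call: showdown for 2; fold: first wins 1
    if s[3] == "1":                              # first checks, second bets
        return 2 * sign if f[1] == "1" else -1   # call: showdown for 2; fold: first loses 1
    return sign                                  # both check: showdown for the antes

a = "100110111011"  # always bets, except folding a 1 to a bet
b = "000101100111"  # the strategy from this comment

b_net = 0
for c1, c2 in permutations((1, 2, 3), 2):
    b_net -= play_hand(a, b, c1, c2)  # a acts first
    b_net += play_hand(b, a, c1, c2)  # b acts first
print(b_net, b_net / 12)
```

Because the strategies are pure, the twelve hands are fully deterministic, so the result is exact rather than an estimate.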

My strategy would be:

As player one, check all hands, then call with 3’s and 1/3 of my 2’s.

As player two, bet all 3’s and 1/3 of my 1’s.

Then hope other players don’t play optimally.

If you want an NE strategy you have to determine 12 numbers:

Praise_going_first(n), Pcheck_going_first(n), Pcall(n), Praise_vs_check(n) where n=1, 2 or 3

Obviously Pcall(3) = 1. I would guess Pcall(1) = 0 and Pcall(2) = 0.5, but I am not sure about the latter two numbers. I will think about how to get the NE values for the other eleven numbers.

You probably don’t want an NE strategy, but I would want to know what the NE is if I could.

I would guess that if you have the six numbers for ‘going first’ or ‘going second’ it’s easy to get the other six. Since one of the numbers for player 2 is set, it’s probably easier to assume you are going second.

The only forced moves seem to be never calling 1, always calling 3, and, because of those, never bet on a 2 (since they’ll only call if they’ll win).

Will exclude those from my strategies and re-run.

Also, if you’re second with a 3, you should bet.

if you’re first with 3, it’s not forced – betting will possibly pick up a 2, checking will possibly get a 1 to bet.

total of 32 pure strategies: 4 for card 1, 4 for 2, and only 2 for card 3.

Changed code to just play through all 6 actual possible games for each ordered pair, results:

[‘100000000111’, -32]

[‘100000001011’, -32]

[‘100100000111’, -32]

[‘100100001011’, -32]

[‘000000000111’, -16]

[‘000000001011’, -16]

[‘000100000111’, -16]

[‘000100001011’, -16]

[‘100001000111’, -16]

[‘100001001011’, -16]

[‘100101000111’, -16]

[‘100101001011’, -16]

[‘000001000111’, 0]

[‘000001001011’, 0]

[‘000101000111’, 0]

[‘000101001011’, 0]

[‘100000100111’, 0]

[‘100000101011’, 0]

[‘100100100111’, 0]

[‘100100101011’, 0]

[‘000000100111’, 16]

[‘000000101011’, 16]

[‘000100100111’, 16]

[‘000100101011’, 16]

[‘100001100111’, 16]

[‘100001101011’, 16]

[‘100101100111’, 16]

[‘100101101011’, 16]

[‘000001100111’, 32]

[‘000001101011’, 32]

[‘000101100111’, 32]

[‘000101101011’, 32]

Top 4 winning strategies:

for 1, possibly bet if they checked, never bet as first

for 2, check/call if first, call/check if second

for 3, either bet or check/call as first, bet as second

I guess mixed strategies should probably be based on the top 4 here.
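For anyone wanting to reproduce the table above, here is a sketch of the exhaustive round robin among the 32 remaining pure strategies. It assumes the same 12-digit encoding and that each score is the net chip total over all six deals in both seats against every strategy in the pool (self-play included, which nets zero). `play_hand` is a hypothetical helper, not the pastebin code.

```python
from itertools import permutations, product

def play_hand(first, second, card1, card2):
    """Net chips won by the first player in one hand (4 characters per card)."""
    f = first[(card1 - 1) * 4:(card1 - 1) * 4 + 4]
    s = second[(card2 - 1) * 4:(card2 - 1) * 4 + 4]
    sign = 1 if card1 > card2 else -1
    if f[0] == "1":                              # first player bets
        return 2 * sign if s[2] == "1" else 1
    if s[3] == "1":                              # first checks, second bets
        return 2 * sign if f[1] == "1" else -1
    return sign                                  # both check

# Options left per card once the forced moves are fixed.
CARD1 = ["0000", "0001", "1000", "1001"]  # never call with a 1
CARD2 = ["0000", "0010", "0100", "0110"]  # never bet with a 2
CARD3 = ["0111", "1011"]                  # always call; bet when checked to
pool = ["".join(p) for p in product(CARD1, CARD2, CARD3)]

DEALS = list(permutations((1, 2, 3), 2))
scores = {}
for me in pool:
    total = 0
    for opp in pool:
        for c1, c2 in DEALS:
            total += play_hand(me, opp, c1, c2)  # me in first position
            total -= play_hand(opp, me, c1, c2)  # me in second position
    scores[me] = total

for s in sorted(pool, key=scores.get):
    print(s, scores[s])
```

Since every hand is zero-sum, the scores across the whole pool add up to zero, which is a handy sanity check on any variant of this loop.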

Interestingly, even though all strategies here bet as player 2 with a 3 but not all bet with a 1, the winning strategies all call as player 1 with a 2. You must gain more from calling against 1s than you lose against 3s. If the probability you’re against a 3 is X:

Payoff if you call: X*-2 + (1-X) * 2 = (1-2X)*2

Payoff if you fold: -1

So you are indifferent between calling and folding at the value of X where (1-2X)*2=-1:

X=3/4

So, if you expect them to always bet on 3 and bet at least 1/3 the time on 1, you should call, otherwise fold.
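The break-even point can be checked with exact fractions. This is a small sketch using the payoffs stated above (call: ±2, fold: -1) and assuming 3s always bet while 1s bet at some bluff rate; the function names are made up for illustration.

```python
from fractions import Fraction

FOLD_EV = -1  # folding gives up the ante

def call_ev(x):
    """EV of calling with a 2 when the bettor holds a 3 with probability x."""
    return x * -2 + (1 - x) * 2

def prob_three_given_bet(bluff_rate):
    """P(bettor holds a 3), if 3s always bet and 1s bet with probability bluff_rate."""
    return 1 / (1 + bluff_rate)

x_star = Fraction(3, 4)
print(call_ev(x_star) == FOLD_EV)            # indifferent at X = 3/4
print(prob_three_given_bet(Fraction(1, 3)))  # a 1/3 bluff rate gives X = 3/4
```

Any bluff rate above 1/3 pushes X below 3/4, which is where calling becomes strictly better than folding.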

In an infinite round game where you eventually learn the other’s strategy, the equilibrium probability to bet on 1 would then be 1/3. In the actual game where 50 rounds is not a lot of time for this scenario to come up often, might be better to be more aggressive on 1, as I think people will fold more often than they should with 2s.

Some further tests:

No pure nash equilibria, every strategy has other strategies that win at least 2 chips over 12 games

Mixed strategies: Since there are 5 choices that matter, I modeled mixed strategies as choosing randomly for each choice: probabilities were 0,.25,.5,.75, and 1. There are a total of 3125 such strategies. Playing them all round-robin, ten rounds for each possible card, yields a couple winning strategies but tons of noise there. I’m going to split the game up into first player strategies and second player strategies, of which there would be 125 and 25 respectively with 5 possible probabilities to analyze further.

The game theoretical solution is tractable in this case. I am curious how close it is to the actual winner.

Player 1 may as well decide how he will respond to a bet at the get go. So he has three options: Firstopen, CheckCall or CheckFold. I’ll suffix them with 1, 2 or 3. Player 2 has to choose between Call and Fold, and between Secondopen and Check.

Clearly dominated options: CheckCall1, CheckFold3, Call1, Fold3 and Check3 are clearly bad.

Subtly dominated options: Firstopen2 is dominated by CheckCall2, unless you think the opponent is dumb enough to play Call1 or Fold3. Similarly, Secondopen2 is dominated by Check2, unless you think the opponent is dumb enough to play CheckCall1 or CheckFold3. With everyone having time to sit down and prepare, I’m going to assume that my opponent won’t make these basic mistakes.

One then gets a linear program. I can write out how to think about it if people like (or if they claim I am wrong) but I get the following:

Player 1: Firstopen1=CheckFold1=1/2; CheckCall2=2/3 and CheckFold2=1/3; always Firstopen3.

Player 2: If bet to, always Fold1, Call2 with probability 1/3, always Call3. If checked to, Secondopen1 with probability 1/3, always Check2, always Secondopen3.

The surprise for me was that you should never CheckCall3 (in other words, sandbag with the best card). I thought it would be worth trying to trick the opponent into Secondopen1, but it seems that you always lose more by scaring him away from Call2.

I assume someone in the class did this analysis, so I am curious how well they did. Thanks for the fun problem!

Trying to state my strategy more briefly: Holding a 3, be as aggressive as possible. Holding a 2, wait for the opponent to make the first move, then call with probability 2/3 in first position and 1/3 in second position. Holding a 1, bluff with probability 1/2 in first position and 1/3 in second position.

Simulating all possible mixed strategies that involve a probability of 0,1/3,1/2,2/3,or 1 for any of the 5 possible choices in a round robin shows yours is not optimal.

The five choices I’m making are as follows, in your terminology

1. Firstopen1

2. Secondopen1

3. CheckCall2

4. Call2

5. Firstopen3

I’m working with 5-tuples of probabilities. So your strategy is (0.5, 0.33, 0.66, 0.33, 1).

The strategy that dominates yours most based on simulation is (0.5, 0.66, 0.66, 1, 1)

In 10,000 rounds where each round had all 12 games, that strategy won a total of 3337 chips from yours.
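That simulated figure can be compared against an exact expectation. The sketch below assumes the 0.33 and 0.66 entries stand for exact thirds and that every decision outside the five listed choices is forced (never call with 1, never bet 2, always call with 3, always bet 3 when checked to). `first_player_ev` and `round_ev` are hypothetical helpers, not the pastebin code.

```python
from fractions import Fraction as F

def first_player_ev(p1, p2):
    """Expected net chips for the player acting first, summed over the six deals.

    Strategies are 5-tuples (Firstopen1, Secondopen1, CheckCall2, Call2, Firstopen3).
    """
    a, _, c, _, e = p1   # choices that matter in first position
    _, b, _, d, _ = p2   # choices that matter in second position
    return sum([
        a * (d * -2 + (1 - d) * 1) + (1 - a) * -1,  # 1 vs 2
        a * -2 + (1 - a) * -1,                      # 1 vs 3
        b * (c * 2 + (1 - c) * -1) + (1 - b) * 1,   # 2 vs 1
        c * -2 + (1 - c) * -1,                      # 2 vs 3
        e * 1 + (1 - e) * (b * 2 + (1 - b) * 1),    # 3 vs 1
        e * (d * 2 + (1 - d) * 1) + (1 - e) * 1,    # 3 vs 2
    ])

def round_ev(x, y):
    """Expected net chips for x over one 12-hand round (each deal in both seats)."""
    return first_player_ev(x, y) - first_player_ev(y, x)

earlier = (F(1, 2), F(1, 3), F(2, 3), F(1, 3), F(1))
challenger = (F(1, 2), F(2, 3), F(2, 3), F(1), F(1))

per_round = round_ev(challenger, earlier)
print(per_round, float(per_round * 10_000))
```

Under those assumptions the expectation is 1/3 of a chip per 12-hand round, or roughly 3333 chips over 10,000 rounds, which is in line with the simulated 3337.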

Simulating 10 rounds between all such strategies, (0, 0.33, 0.66, 0.66, 0) has the highest min score but there are a ton of strategies that are close to it, need to dig deeper

Thanks for checking this! I’ll try to figure out where I went wrong later then. I checked your analysis up to where it got to the list of 32 strategies, and it made sense so far.

here’s my current code, should be easy enough to understand and tweak to run the tests

https://pastebin.com/akYbFJEq

After waay too long (3+ hours) fumbling around with two friends I finally remembered how to calculate nash equilibrium, so got that part down:

player one:

1: bet 1/3 of the time

2: check

3: bet

player two:

1: check/fold

2: if bet: call 1/3 of the time, else check

3: if bet: call, if check: doesn’t matter for nash

player one:

1/2: fold (only happens if player two bets his 3:s)

given that the game isn’t seeded with intentionally bad bots, I’m not sure how much effort I’ll put into deviating from this, but I haven’t thought about it too much yet.

seeing as other’s attempt at NE gave different results I’ll try to show my work in a bit :)

Writing it up and comparing to David Speyer I realize that I incorrectly assumed that checkcall2 and secondopen1 were dominated strategies! Back to the drawing board..

In above terminology, this strategy is:

(.33,0,0,.33,1)

In a large round robin against all other mixed strategies, this ended up with a negative score, or worse than median. Running a tournament specifically to see which strategy dominates this most, I come up with (0.5, 1, 0, 0.5, 0.33)

To interpret this, refer to this list, and the number is the probability of calling/betting

1. Firstopen1

2. Secondopen1

3. CheckCall2

4. Call2

5. Firstopen3

So:

player 1:

1: bet with p(1/2)

2: check/fold

3: bet with p(1/3)

player 2:

1:bet

2: call with p(1/2)

3: bet

It’s possible I have a bug, code is posted above.

Thank you! Yeah I realized it was flawed, my current best guess is (.33, .33, .75, .33, 1) but the .75 sounds suspiciously high to me so I’m not too sure about it. Seeing that firstopen3=.33 beats me sounds weird to me as well as I have that assumed as a 1. I’ll work further and check out the code after doing some other stuff :)

After another hour or two here’s my new suggestion for NE:

Firstopen1 = 1/3

SecondOpen1 = 1/3

CheckCall2 = 2/3

Call2 = 1/2

FirstOpen3 = 1

I gotta rush, but I’ll show my work later and compare it to others (unless I find another flaw in my thinking/assumptions).

You should not Firstopen1. Playing the same strategy except with 0 there dominates yours. Do the EV calculation:

Equally likely for player 2 to have 2 or 3. If they have 3, then betting means lose 2 and checking means lose 1. If they have 2, then betting means you either win 1 or lose 2 depending on if you’re called, since Call2=1/2 this averages to lose 1/2. If you don’t bet, then they won’t so you lose 1.

So, betting gives you a 50% chance of losing one more than you’d otherwise lose, and a 50% chance of losing 1/2 less than you’d otherwise lose. Clearly negative EV.

Hmm, I’m going to check out the simulations for real now.

The reason you Firstopen1 is not to directly gain more points, but to incentivize Call2, making you gain more when you Firstopen3. So you cannot do a straight EV calculation on how you gain the most money only by looking at the case when you start with a 1, you also have to include the 3 case.

Well, if you’re in an NE, then no changes to strategy should have positive EV.

yeah, your simulation proves me wrong (assuming the simulation is correct, which would be impressive if it weren’t after how much you’ve played with it). It sure is really hard to *actually* find the NE, so many things can go wrong.

To take a first stab at the meta-game here:

It’s possible for them to “cooperate” by recognizing each other and deliberately throwing the game to one program. I’m going to assume none of that happens.

50 games is really not a lot of time to learn much. There are 12 possible games, most of the games you’ll play you’ll have only seen relevant info a handful of times prior.

The 5 choices can be summed up as:

Bluff on 1 (whether first or second)

Call a possible bluff on 2 (whether first or second)

Bid first on 3 in hopes of inducing a call of 2 (only when first)

These are all ultimately the same situation from various perspectives, so the probability you should choose each one depends on the others in some way.

I’m going to build out an EV calculation on each of these options. Call the probabilities a,b,c,d,e in order. For each, if they’re not 0 or 1, then you should be indifferent to changing when playing against this strategy:

bluffing on 1 when first: if they have 3, bluffing just makes you lose a chip. If they have 2, bluffing gives you (1-d) chips. Since the average of these numbers is always negative (at best 0), bluffing a 1 is always bad or neutral. If a is part of the strategy, then d=0.

bluffing on 1 when second: the probability they bet on 3 is e and on 2 is 0, so observing that they didn’t bet means the probability they have 3 is (1-e)/2. If they have 3, bluffing makes you lose a chip. If they have 2, bluffing means you win (1-c) chips. Total expected EV of bluffing: (e/2+1/2-c)

If b is part of the strategy, then e=2c-1

calling bluff with 2 when first: seeing them bet means the probability they have 3 is 1/(1+a). If they have 3, calling loses a chip, if they have 1, it gains a chip. Expected gain is (a-1)/(1+a).

If c is part of the strategy, then a=1

calling bluff with 2 when second: seeing them bet means probability they have 3 is e/(e+a). Expected gain from calling is (a-e)/(e+a).

If d is part of the strategy, then a is 0 (from above). So setting to 0 and setting a=0, we get it’s impossible to make this equal to 0. So d can’t be part of it, and so d=0. The only exception is if e=a=0, in which case this scenario never comes up.

Bid first on 3: if they have 1, you lose b chips doing this (since they might have bluffed). If they have 2, you gain d chips (since they’ll never bet, and they have d chance of calling). Total expected gain is (d-b)/2.

If e is part of the strategy, then d=b.

Since 2 different things pointed at d=0, let’s start there. Then we have a free and one of b or e=0.


First, set b=0. We have e+1 less or equal to c, so e=0 and c=1. So either way e=0, so we can make b free again.

d,e are 0, a,b are free.

If a is part of it, then c is not and must be 0 (always negative EV unless a is 1). If c is part of it, then a is not and must be 1.

If b is part of it, then c=1/2. So that yields one strategy: (0,b,1/2,0,0), with free b

if b isn’t, then b=0 or 1 depending on whether c is above or below 1/2. In this case, c could be part of it, in which case we get the following strategies: (0,0,c>1/2,0,0), (0,1,c<1/2,0,0)

or, c could be 0 or 1 and a could be the only variable, in which case we get this: (a,1,0,0,0)

If d is not 0, then a=e=c=0, b must be 1 (since bluffing always has positive EV with e and c=0).

This gives (0,1,0,d,0)

So the various possible strategies are:

(0,b,1/2,0,0)

(0,0,c more than 1/2,0,0)

(0,1,c less than 1/2,0,0)

(a,1,0,0,0)

(0,1,0,d,0)

The only move ruled out entirely is opening with a 3. I’ll check these 5 further and see if any of them are dominated. It seems at most one choice can vary, not sure I did this all right though.

well, 2 choices can vary if you set b to something and c is 1/2.

I’ve spotted some mistakes in the calculations here, redoing

Pretty sure this is correct: only nash equilibrium is

(0,1/3,1/3,1/3,0),

The trickiest part when doing it is actually counting the correct number of expected gains for each decision – I had several errors along the lines of assuming you’d win 2 chips when you’d really get a counterfactual gain of 3 chips if bluffing.

Assuming this is right, then the meta-game is really just asking: should you diverge from the only NE? Diverging is dangerous with uncertain payouts, but you need to do something to be more than average.

I guess basics like checking if their percentages are wildly off after the 25th round and then taking advantage if so should be relatively safe (for someone to exploit this, they’d have to perform suboptimally for half the game which shouldn’t be recoverable in the second half). If they deviate from the 2 0s or do any moves that are dominated, can also pick up a handful of points, and just monitor then to make sure they aren’t baiting you. I feel like a rule of thumb should be that you can deviate to chase non-optimal bots, but only for as many rounds as they have been non-optimal for, and if they end up ahead of you at any point then revert to nash.

I did this in my head over ramen but I think the NE is really simple. As far as I can tell it’s just:

3 -> always bet/call

2 -> never bet, call with p = 0.5

1 -> never call, bet with p=0.5

I will check my math tomorrow and say if it’s right or wrong and why (unless I just made a dumb mistake)

To be honest I just assumed you always bet on 3. To get the probabilities for ‘what to do if you have a two’ assume you want to be indifferent to facing ‘always raises on 1 dude’ and ‘never raises on 1 dude’. Also it obviously makes no sense to raise with a 2. They either have a 3 and will call or they have a 1 and will fold.

I found the following Nash equilibrium:

If you’re first: always bet on a 2 or 3; bluff on 1 one third of the time

If you’re second against a check: bet on a 1 or 2 one sixth of the time, always bet on 3

If you’re second against a bet: fold on 1, call a bluff on 2 one third of the time, always call on 3

Betting on 2 first is dominated by not doing so. If they will beat you, they’ll call, otherwise they’ll fold.

Checking on a 2 seems like a sensible decision, but in reality you’re just delaying the same choice. Checking and calling leaves you no better than before. Checking and folding means you lose when the second player’s bluffing on a 1. You’re just hurting your score that way.

The Nash strategy is to check then call one third of the time. They’re certain to bet with a 3 and not certain to bluff with a 1, while if you initiate the bet you have no upside.

Let p be the probability of them betting second on a 1.

If you bet on a 2, you get net -1 (+1 against a 1 and -2 against a 3).

If you check and call on a 2, you get +2p against a 1 and -2 against a 3, for a total of -2+2p.

If you check and fold on a 2, you get -1 whatever your opponent has.

You break even only if your opponent is betting on 1s more than half of the time; otherwise, checking does worse than betting.

If you bet a 2 against a 1, you gain nothing, since they’ll fold and you’d have won anyway.

From a misplaced comment:

«I just did the calculations above. Folding leaves you no better than betting on average, and checking only makes you better off if you’re expecting your opponent to bluff more than half of his 1s.

»Essentially, what you’re doing by checking is letting your opponent call the priors on his having a 1 versus a 3. If you bet, the odds are 1 to 1. If you check, the odds are p to 1, and you bet p is going to be less than 1. By checking on a 2, you’re giving your opponent so many more 3s to work with.»

—–

I’ll add the following: this can be summarized by saying this game has a first-move advantage. By failing to bet on a 2, you’re giving up that advantage.

if you check/call a 2 against a 1, and the probability they bluff is p, then either

1. they bluff, you call, you win 2 (probability p)

2. they check, you win 1 (probability 1-p)

EV = 2p+1-p=p+1, which dominates the 1 from betting.

I had like 10 of these subtle errors when calculating the nash equilibria. Every time I’d come up with a strategy, simulate it, see it lose consistently against other strategies, dig into the games, and realize you actually won one more chip here or lost another chip there, go back and fix it, repeat.

It’s a surprisingly hard game to analyze.

I noticed my mistake now. Thank you.

Turns out there’s actually a first-move disadvantage. Player 2 wins by playing aggressively (betting on 1s and calling on 2s) one third of the time, and otherwise checking and folding where appropriate. Player 1 can’t do better against this than losing 1/3 of a chip on average no matter what they do.

Player 1 has some room to play given player 2’s strategy, however. Player 2 can’t punish player 1 for betting on a 1, like you suggested on a different comment. This strategy

bet on a 1 one third of the time;

check on a 2, call 2/3 of the time;

always bet on 3;

does just as well as always checking and calling 1/3 of the time. Any convex combination of those will also do just as well.

Any deviation from (0,1/3,1/3,1/3,0) strategy leaves room for exploitation.

If the first player calls with a 2 two-thirds of the time instead of the Nash 1/3, then 1s will stop bluffing against him and 3s will continue betting, and he’ll lose.

If you’re second with a 1 and the first player checks, you have no reason not to bluff since you know they have a 2. If you bluff now, they’ll call your bluff 2/3 of the time (-4/3 to you) and fold the remaining 1/3 (+1/3 to you), which is net -1, the same you’d get for checking anyway.

You can’t count on them being suboptimal. Optimal behavior is to call the bluff 1/3 of the time.

also, you don’t know they have a 2, since betting first with a 3 is not forced.

«You can’t count on them being suboptimal. Optimal behavior is to call the bluff 1/3 of the time.»

Against the above strategy, any decision at that point yields the same expected value (they are all equally optimal). We’re calculating a (weak) Nash equilibrium; it suffices that the second player have no incentive to switch strategies.

«also, you don’t know they have a 2, since betting first with a 3 is not forced.»

Against the above strategy it is (first player always bets on a 3). And at any rate, it’s still the case that bluffing gives you the same expected value whether or not you think your opponent has a 2.

Against my first-player strategy, the second player can’t do any better than the +1/3 they’re already getting for playing your second-player Nash strategy.

There are no other nash equilibria besides the one I derived.

What’s the full strategy you’re proposing? I’ll find you a strategy that beats it.

What I’m showing is precisely that there are other Nash equilibria besides the one you found; namely (to use your notation), (1/3,1/3,2/3,1/3,1) and any convex combination of that and the one you found.

hm you’re right, I missed this case in my analysis.

Redoing it I come up with this set of strategies:

(x/3,1/3,1/3+2x/(3x+3),1/3,x)

and x varies from 0 to 1. If x is 0 it gives my previous strategy, if it’s 1 it gives yours. Does that sound right?

Yay! Those are almost exactly the same numbers that I came up with. The only difference is I had d at 1/2 instead of 1/3, and I know I did that one incorrectly.

Increasing x gives higher total EV for player 1, and as long as the correct ratio is conserved in other numbers player 2 can’t do anything about it, so player one should set x to 1.

Avi: the correct form of the general convex combination is

(x/3, 1/3, 1/3, (1+x)/3, x).

One ought to show that there are no other Nash equilibria, which I haven’t done.

h6: Increasing x is not supposed to increase player 1’s expected value (that’s why it’s a Nash equilibrium strategy). For any value of x, it is expected that player 1 loses 1/3 of a chip on average.


I just did the calculations above. Folding leaves you no better than betting on average, and checking only makes you better off if you’re expecting your opponent to bluff more than half of his 1s.

Essentially, what you’re doing by checking is letting your opponent call the priors on his having a 1 versus a 3. If you bet, the odds are 1 to 1. If you check, the odds are p to 1, and you bet p is going to be less than 1. By checking on a 2, you’re giving your opponent so many more 3s to work with.

No.

Probability of them bluffing 1 if you check is p. Probability of them betting 3 is 1. Probability of them folding 1 if you bet is 1, probability of calling 3 if you bet is 1.

If you bet on 2, your payoff is either +1 or -2, equal odds so EV is -1/2

if you check/fold, your payoff is either (1-2p) or -1, average of -p

if you check/bet, your payoff is either 1+p or -2, average of (p-1)/2 which dominates -1/2

A mixed strategy of check, then 1/3 bet and 2/3 fold yields expected (-1-p)/6, which is always strictly greater than -1/2.

sorry the mixed strategy part is wrong, the rest is correct
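The three pure plans for a 2 in first position can be tabulated for any bluff rate p. This is a sketch under the assumptions stated in the comment above (a 1 folds to a bet, a 3 always calls and always bets when checked to); `ev_with_two` is an illustrative name.

```python
from fractions import Fraction

def ev_with_two(plan, p):
    """Average payoff holding a 2 in first position; p = chance a 1 bets after a check."""
    if plan == "bet":
        vs_one, vs_three = 1, -2                          # 1 folds, 3 calls
    elif plan == "check/fold":
        vs_one, vs_three = p * -1 + (1 - p) * 1, -1       # fold to any bet
    elif plan == "check/call":
        vs_one, vs_three = p * 2 + (1 - p) * 1, -2        # call any bet
    else:
        raise ValueError(plan)
    return Fraction(vs_one + vs_three) / 2                # 1 and 3 are equally likely

for p in (Fraction(0), Fraction(1, 3), Fraction(1)):
    print(p, [str(ev_with_two(plan, p)) for plan in ("bet", "check/fold", "check/call")])
```

Check/call is never worse than betting for any p between 0 and 1, and it is strictly better whenever the opponent bluffs at all.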

How does increasing x increase EV? The EV of changing any choice when playing against this should be 0 for all values of x.

@fabio: 2nd and 4th numbers have to be the same, if 4 is more than 2 then the 5th should be 1.

I realized I had one other error in the formula, this is my current version of the calculation for the formula for the 3rd number:

bluffing on 1 when second: the probability they bet on 3 is e and on 2 is 0, so observing that they didn’t bet means the probability they have 3 is (1-e)/(2-e) and the probability they have 2 is e+1/2. If they have 3, bluffing makes you lose a chip. If they have 2, bluffing means you win (2-3c) chips.

total expected EV of bluffing: (e-1)/(2-e) + (1+(e-1)/(2-e))*(2-3c)

Setting to zero and solving yields c=e/3+1/3

General formula now becomes

(e/3, 1/3, e/3+1/3, 1/3, e)

second probability is off, should be 1+(e-1)/(2-e)

Yeah, that’s what I got (except I misplaced the third and fourth components).
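One way to sanity-check the family (e/3, 1/3, e/3+1/3, 1/3, e) is to compute exact expected values and confirm that neither seat gains from any pure deviation. The sketch below only considers deviations in the five free choices and treats all other moves as forced, which is an assumption carried over from the thread; the helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def first_player_ev(p1, p2):
    """Expected net chips for the first player, summed over the six deals.

    Strategies are 5-tuples (Firstopen1, Secondopen1, CheckCall2, Call2, Firstopen3).
    """
    a, _, c, _, e = p1
    _, b, _, d, _ = p2
    return sum([
        a * (d * -2 + (1 - d) * 1) + (1 - a) * -1,  # 1 vs 2
        a * -2 + (1 - a) * -1,                      # 1 vs 3
        b * (c * 2 + (1 - c) * -1) + (1 - b) * 1,   # 2 vs 1
        c * -2 + (1 - c) * -1,                      # 2 vs 3
        e * 1 + (1 - e) * (b * 2 + (1 - b) * 1),    # 3 vs 1
        e * (d * 2 + (1 - d) * 1) + (1 - e) * 1,    # 3 vs 2
    ])

def family(e):
    """The proposed equilibrium family, parameterised by e = Firstopen3."""
    return (e / 3, F(1, 3), e / 3 + F(1, 3), F(1, 3), e)

# Payoffs are linear in each player's own probabilities, so checking the
# 32 pure deviations is enough to cover all mixed deviations.
PURE = list(product((F(0), F(1)), repeat=5))

for e in (F(0), F(1, 2), F(1)):
    s = family(e)
    value = first_player_ev(s, s)
    best_first = max(first_player_ev(dev, s) for dev in PURE)   # best first-seat deviation
    best_second = min(first_player_ev(s, dev) for dev in PURE)  # best second-seat deviation
    print(e, value, best_first, best_second)
```

For each e tried, all three numbers come out to -1/3 over the six deals (that is, -1/18 per hand for the first player), so no deviation in either seat helps. This matches the earlier remark that player 1 loses 1/3 of a chip on average regardless of x.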



So after waaay too much thought, I got the key intuitions to simplify the problem:

0. Always call with a 3, always fold with a 1.

1. Never bet with a 2; all this does is give an opponent with a 3 an extra chip.

2. Player 2 has no reason not to bet with a 3.

3. Call vs. fold with a 2 is a break-even decision when the probability of the opponent bluffing is 25%.

From there I mathed it out and found the obvious Nash equilibrium: both players always bet with a 3, and bluff (i.e. bet with a 1) with 1/3 probability. Holding a 2, the second player calls with 2/3 probability, the first player with 1/3 probability.

But…

Zvi’s comments mention also maximizing against the field. “Never bet with a 2” is a non-obvious decision; many players will instinctively bet with a 2 more often than they bet with a 1. So one obvious arbitrage opportunity is: as player 1, always check! The downside (losing your bluffing power) is small, and if your opponent bets with a 2 you can punish them hard. Also, you’ll quickly find out whether they’re inclined to do so. If they aren’t, they’ll check with their 2 once every 6 hands on average, which you can observe and switch back to equilibrium play.
