The Darwin Pregame

Epistemic Status: True story

This is intended as post two of the sequence Zbybpu’f Nezl.

Previously (required): The Darwin Game

Leads to: The Darwin Results

I

This is my reconstruction of my thoughts at the time.

The Darwin Game requires surviving the early, middle and late games.

In the opening, you need to maximize scoring against whatever randomness people submit. Survival probably isn’t enough. The more copies of yourself you bring to the middle game, the more you face yourself, which snowballs. Get as many points as you can.

In the middle game, you face whatever succeeded in the opening. Strategies that survived the opening in bad shape can make a comeback here, if they are better against this new pool. What strategies do well against you matters.

In the end game, you’ll need to beat the successful middle game strategies, all of which have substantial percentages of the pool. Eventually you’ll be heads up against one opponent. Not letting opponents outscore you in a pairing becomes vital.

How would the game play out? What types of strategies would thrive?

I divided the types as follows:

There were attackers, who would attempt to get the opponent to accept a 3/2 or 4/1 split. They might or might not give up on that if you refused, and presumably most would use a signal to self-cooperate, but not all. One person did submit “return 3.”

Then there were cooperators, who attempt to split the pot evenly. I assumed that meant alternating 3/2 splits. This then divided into those who would fold if attacked, allowing you to score above 2.5 per turn, those that would let themselves be outscored but would make sure you scored less than 2.5 per turn, and those that would not allow themselves to be outscored. The last group might or might not forgive an early attempt to attack them.

There would also be bad programs. People do dumb things. Someone might play all 2s, or pick numbers fully at random, or who knows what else.

As a list (attackers from here on means both AttackBot and BullyBot):

AttackBot. Attackers who don’t give up.

BullyBot. Attackers who give up.

CarefulBot. Cooperators who harshly punish attackers.

DefenseBot. Cooperators who don’t let you outscore them but don’t otherwise punish.

EquityBot. Cooperators who let you outscore them, but make sure you don’t benefit.

FoldBot. Cooperators who accept full unfavorable 3/2 splits.

GoofBot. Weird stuff.

My prior was we’d see all seven, with most looking to cooperate.

Was attacking a good strategy?

Attacking only works against FoldBots. When attacking fails, even DefenseBots might take a while to re-establish cooperation. CarefulBots could wipe you out. It was also impossible to know how long to keep attacking before concluding opponents weren’t going to fold.

With a pool of bots chosen by humans, attacking strategies (AttackBot or BullyBot) likely would fail hard in the opening.

The endgame was a different story. All GoofBots would be dead. Unless FoldBots fold too quickly to a BullyBot, in a given round they strictly outscore CarefulBots, DefenseBots and EquityBots. Each round, provided they exist, FoldBots would become a bigger portion of the cooperative pool. If you were an AttackBot or BullyBot, and survived long enough, you would kill off the CarefulBots, then the DefenseBots and finally the EquityBots as the FoldBots out-competed them, leaving a world of AttackBots, BullyBots and FoldBots. If all but one attacker were gone, the last attacker to survive would win if it cooperated efficiently against itself, since it would score above average each round. In theory a steady state could exist with multiple attackers keeping each other in check, but that isn’t stable since advantages in size snowball.

CarefulBots are strictly worse than DefenseBots, so those were out. GoofBots are terrible.

This meant there were five choices:

I could submit an AttackBot that cooperates with itself, and hope to survive into the endgame. I quickly dismissed this as unlikely to work.

I could submit a BullyBot that cooperates with itself, attacks but accepts an even split against stubborn opponents. But this rewards stubborn opponents while wiping out non-stubborn opponents in the mid-game, which means your endgame trump card stops working. I dismissed this as well.

DefenseBots don’t lose heads-up by non-tiny amounts, and punish anyone who tries to outscore them, wiping them out in the mid-game. But you score nothing against AttackBots in the opening, before you can shape the pool much. At best you take a smaller pool into the mid-game, where efficient cooperation with your own copies starts to snowball.

I saw the emotional appeal of DefenseBots, but using one didn’t make sense. Its defenses were too robust and expensive, and you still lose to a smart AttackBot heads-up if you’re outnumbered. I’d need to take more risk.

That was the problem with being a FoldBot. FoldBots feed attackers. You are free riding on the rest of the cooperative pool. You hope they kill attackers despite that. The problem is that if even one copy of an attacker survives, as you and other FoldBots grow strong, attacking becomes a better and better strategy. I decided this wasn’t worth that risk.

I would submit an EquityBot. I wouldn’t protect against them outscoring me. I would protect against them outscoring what cooperation would have gotten them. If at any point they wanted to split the remaining pie, I would accept. Even if they refused, I’d give them some points on a 3/2 split, so long as they were punished for it, and I wasn’t growing their portion of the pool.

This raised the threshold percentage of the pool I needed to win heads-up against an attacker, but with a size disadvantage I’d lose no matter what, and I’d still win if I had a sufficiently large size edge, which was more likely if I did better early on.

Too much folding and you strengthen someone who beats you. Too little and you fall behind letting others snowball.

I decided to alternate 3/2 even if my opponent was going 3/3. This said both ‘I’m not going to give up’ and ‘you are welcome to cooperate at any time,’ and still punished the opponent reasonably hard. After long enough I even risked throwing in a few more 2s.

I considered sending a signal to recognize myself, but realized there was no point. Better to start coordinating right away. I’d randomize my first turn to 2 or 3, and once my opponent didn’t match me I would alternate. I figured opponents would start 2 more often than 3, so I decided to do a 50/50 split to take advantage of that, coordinating faster and with a slight edge, at the expense of doing slightly worse against myself. This was probably just a mistake, and I should have done an uneven split (but not quite the fully maximizing-for-self-play ratio). However, in an endgame against a similar program, you can definitely get an edge by being slightly more willing to play 3s early than your opponent.
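The trade-off in that first-turn split can be checked with a little arithmetic. This is a minimal sketch, not the submitted program: it assumes the payoff rule from the previous post (each player scores their own number when the two numbers sum to 5 or less, and both score nothing otherwise), and the function names and probabilities are made up for illustration.

```python
def payoff(mine, theirs):
    """Assumed rule: score your own number if the pair sums to 5 or less, else 0."""
    return mine if mine + theirs <= 5 else 0


def first_turn_outcomes(p_mine, p_theirs):
    """Each side opens with 3 with the given probability, otherwise with 2.

    Returns (probability the openings differ, my expected first-turn score).
    """
    mismatch = 0.0
    expected = 0.0
    for mine, prob_mine in ((2, 1 - p_mine), (3, p_mine)):
        for theirs, prob_theirs in ((2, 1 - p_theirs), (3, p_theirs)):
            prob = prob_mine * prob_theirs
            if mine != theirs:
                mismatch += prob
            expected += prob * payoff(mine, theirs)
    return mismatch, expected


# Self-play with a 50/50 split versus a hypothetical split that leans toward 2.
even_split = first_turn_outcomes(0.5, 0.5)
leaning_two = first_turn_outcomes(1 / 6, 1 / 6)
```

Under these assumptions, a 50/50 bot facing a copy of itself mismatches half the time and expects 1.75 points on the first turn, while a split that leans toward 2 scores more against its own copies on that turn because it has fewer 3/3 collisions.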

Opponents that wanted to cooperate would have a very easy time recognizing my offer and cooperating. That left special case logic.

If my opponent was alternating on the same schedule as me (somehow we started 2/3, but then we’d go 2/2, then 3/3, then 2/2), then I’d play 2 twice in a row to break that up. Ideally, if the opponent was offering a different cycle that was fair, I’d match that (so if they went 1/4/1/4, I’d submit 4 next time, and if they did 1 I’d start alternating). I didn’t expect such cases, though, so I didn’t make that logic robust: the professor had already thrown out part of a previous submission for being too complex, and I wanted to preserve the more important parts.

If my opponent was playing all 2s even after I started alternating, I put in logic to play all 3s. If they played even one 3, I’d back down permanently. I also put in logic against a few other bizarre simple bots (like all 1s, all 4s, seems to be completely random, etc) but didn’t worry about it too much since they’d be wiped out very quickly and complexity is bad.

If my opponent was playing all 3s without a starting signal, and kept it up long enough, that meant he’d defect against himself, which meant he couldn’t win an endgame, and also meant that he was highly unlikely to ever give up, so I’d eventually fold. If they were going to lose in the long run, better to get what I could. Letting them survive longer would only help me.

II

David took a different approach.

David knew about the class mailing list.

David assembled a large group. They agreed to submit 2-0-2 as their first three moves. If both sides sent the signal, they’d cooperate using a reasonable randomization system. If they didn’t get the signal back, they’d play all 3s. They’d be pure CliqueBots, cooperating with each other and defecting against everyone else. With a large enough group, they’d wipe out the other players and share the victory. David would win The Golden Shark and his guaranteed A+.
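As a rough illustration of how such a CliqueBot could be structured (a sketch only, not the coalition's actual code): the 2-0-2 signal and the all-3s fallback come from the description above, while the function name, the history arguments, and the stand-in for the unspecified "reasonable randomization system" are hypothetical.

```python
import random

SIGNAL = (2, 0, 2)  # the agreed first three moves


def clique_move(my_history, their_history, rng=random):
    """Pick the next move given both players' moves so far this pairing."""
    turn = len(my_history)
    if turn < len(SIGNAL):
        # Still sending the handshake.
        return SIGNAL[turn]
    if tuple(their_history[:len(SIGNAL)]) != SIGNAL:
        # No signal back: treat the opponent as an outsider and play all 3s.
        return 3
    # Placeholder for the clique's randomized cooperation, whose details
    # are not given in the text; a coin flip between 2 and 3 stands in here.
    return rng.choice((2, 3))
```

For example, `clique_move([2, 0, 2], [3, 2, 3])` returns 3, since the opponent's first three moves did not match the signal.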

I would find out about the coalition after round one.

III

We were all set for game night. We had each chosen the logical output of our decision functions. The professor set up a website where we could see the game played out in real time over the course of several hours (partly because that’s more fun, and partly because the game was slow to run), with a discussion board for him to offer observations and us to comment.

Next time I’ll reveal what happened on game night. Predictions are encouraged. Please do not comment here if you have read The Darwin Results.


15 Responses to The Darwin Pregame

1. I love the storytelling. Consider writing more things which are less meta, they seem to be fun and might plausibly have more educational value too!

• TheZvi says:

Appreciated! I definitely have a lot of stories to tell, if that’s something people would enjoy and/or benefit from. Curious if others feel the same way.

• Magister Ludi says:

Consider the motion seconded.

• Laura says:

I also found this enjoyable.

2. Pingback: Rational Feed – deluks917

3. David Speyer says:

Tough question! Both Zvi and DavidClone get 2.5 points per program per turn when you play yourselves. When you play each other, Zvi gets 1 point per round and DavidClone gets 1.5. Assuming you are the only programs that make the endgame, if you come into the endgame with proportion z for Zvi and (1-z) for DavidClone, then you end the round with proportion

$\frac{z(2.5 z + (1-z))}{z(2.5 z + (1-z)) + (1-z)(2.5 (1 - z) + 1.5 z)}.$

Here is a plot. Both z=0 and z=1 are attracting fixed points, z=0.6 is a repelling fixed point. You win the endgame if you come in with at least 60% of the pool, and lose otherwise.

It’s hard to predict how much of the population you’ll be at the end of middle game. If there are a lot of CarefulBots or other punishers, the DavidClones will be wiped out. But he starts with a much larger population than you so, if you fare about equally well in middle game, he wins. If you forced me to bet, I’d bet on you, but I don’t feel confident.
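The update rule in this comment can be iterated numerically to see the threshold. This is a minimal sketch using only the per-turn scores stated above (2.5 in self-play, 1 for Zvi and 1.5 for DavidClone against each other); the function and parameter names are made up for illustration.

```python
def next_share(z, self_score=2.5, zvi_vs_clone=1.0, clone_vs_zvi=1.5):
    """One round of the update: z is Zvi's share of the pool, 1 - z is DavidClone's."""
    zvi_points = z * (self_score * z + zvi_vs_clone * (1 - z))
    clone_points = (1 - z) * (self_score * (1 - z) + clone_vs_zvi * z)
    return zvi_points / (zvi_points + clone_points)


def iterate_share(z, rounds):
    """Apply the update repeatedly and return Zvi's final share."""
    for _ in range(rounds):
        z = next_share(z)
    return z


above = iterate_share(0.61, 200)  # starts just above the threshold
below = iterate_share(0.59, 200)  # starts just below the threshold
```

Starting just above 0.6 the share climbs toward 1, and starting just below it falls toward 0, which matches the description of z=0.6 as a repelling fixed point.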

4. PDV says:

Is it cheating to predict given that I’ve used the power of Rot13? If so, feel free to delete this.

If the clique alliance stays loyal, they win. If the competition is steep enough in the midgame for them to reach a steady state, they’ll get reduced to sharing the ‘correct’ level for one of them. But they’ll be dominant in the chaotic early game, so that won’t happen.

However, I predict that they will not stay loyal. Some of them – David would be most poetic, but he just needs to not lose and has little to gain, so probably not him – will throw the correct ID but then use a different strategy if it fails. DefenseBot or EquityBot, probably, but they might brazenly play FoldBot against non-clique, trusting the clique’s defences to keep them ahead. If this happens, it deals a significant blow to the clique’s dominance, and leaves the beatable defector with a bigger share going into the lategame.

My strong prediction is that the coalition will fail enough for Zvi to get into the lategame in a competitive position. I more weakly predict that he will not beat out the remains of the clique, because that’s a little too storybook/anti-Murphy to actually seem likely.

• Quixote says:

If we are using story logic, our hero should win. So Zvi gets it.

5. DealBots appear exactly once in the text of the post and their function is not explicitly spelled out anywhere. Is that intentional?

6. honhonhonhon says:

I think that you were wrong about BullyBots doing poorly early on. You establish early who the folders are, and you get more out of GoofBots while they are alive. FoldBots may even change their strategy to attacking if your history is long enough, since that implies the game has gone on long enough and you need to start killing people, so that’s another reason to bully early. I think equity and careful bots would be rare.

• honhonhonhon says:

Reading the next post, it appears I had misunderstood the rules: you can only read the history of the current round. Oh well.

7. Purplehermann says:

A few things will matter here: first off, how often does the group get matched with itself? (I get that it is random.) What percentage of the total are they?

They are basic attack bots when not with each other, so running into enough bots who defend/punish early on should weed them out. Folders are very good for them.
Bullies can be fodder too.

They aren’t optimized for stupid or random, so that’s a disadvantage in the early game.

They get 4 out of 15 points in the first 3 rounds, minor disadvantage. (2 4/1 2 would be better as a signal, I think.)

I remember seeing a study showing that cooperation beats clique with high randomness of match ups, Zvi winning didn’t seem out of the question.