This is a follow-up post to the last section of Ukraine Post #2 on the need for Better Decision Theory.
In particular I want to think more about the following result and some resulting logic and examples. If that’s not relevant to your interests and/or you’re fine with it being hand-waved later when I build upon it, you can skip or skim this one. It is definitely the first-draft longer version of something shorter that I have not yet had time to write.
Your Nature versus Your Decisions
The gap here is huge, a 24% net swing (a change of +/- 12% approving).
There is the obvious partisan divide on approving of Biden’s decisions (despite almost all Americans on both sides agreeing with the core idea of backing up Ukraine). The gap is mostly across the board.
Commentators notice they are confused. These are the first three topline responses.
Explanations are indeed offered as well, with varying degrees of plausibility.
I think there is a big difference between ‘decisions’ and ‘how you are handling’ something, and this is the heart of the problem.
I interpret this as the public, despite not knowing such fancy words as ‘decision theory’ or any of the technical thinking involved, intuiting the need for better decision theory. This is one of those places where ordinary person intuitions and models actually do remarkably well, because the dynamics have always applied to everyday life.
Thus, they intuitively notice three things in the context of Russia’s invasion of Ukraine (and then they adjust their answers for partisan bias), whether or not they are right about any of them.
- Biden seems to prefer the better outcomes, the ones we also prefer, to worse outcomes.
- Given who he is, his/our preferences, the hand he was dealt and who people believe him to be, Biden has made good decisions.
- Who Biden is and who people believe him to be (e.g. his expected preferences and decision theory) are serious problems that lead to worse outcomes.
One could of course disagree that the outcomes Biden wants, here or elsewhere, are to be preferred – I am not making a strong claim here even locally, and definitely not generally, beyond that one could plausibly model his preferences in this way.
One can consider the reversed case some people might make for Putin or Trump, if one didn’t want to defend the decisions made or their stated or revealed preferences. Not that you couldn’t mount a defense if you wanted to do that, but to point out you don’t automatically have to do this to defend them. One might claim that:
- Putin seems to centrally prefer worse outcomes (for us anyway) to better ones.
- Given who he is, his preferences and the hand he was dealt, and who we believe him to be, Putin has made poor decisions.
- Who Putin is and who people believe him to be (e.g. his expected preferences and decision theory) are serious advantages that lead to outcomes Putin prefers.
Again, we’d like to think that we don’t take kindly to the things Putin has been saying and doing, and that we come together to foil the plots of those who choose to act as cartoon villains. That we punish people who visibly have the mafioso nature.
Instead, we reliably reward such villainy, and this is known, thus encouraging more villainy. Biden is widely believed to be ‘making good decisions’ given a framework and expectation that villainy will be rewarded, to make the best of the situation, but that expectation was necessary for there to be a war in the first place. People notice, and pointing this out in regular-person terms (e.g. ‘looking weak’) has been part of Republican rhetoric for a long time, in some places accurately and in others inaccurately and at times absurdly.
On top of that, there is a pattern that says that transgressors and those who destroy value will win and be rewarded, and thus gain power. Others will back down, will silence the opposition to avoid trouble and get rewarded by that future power, and let them do this. The power that comes to such villains comes from the expectation of their future power, which is based on others’ expected expectations. Thus the scene in many a show or movie, and the similar examples in real life, where the Big Bad walks into town or looks like they’re gaining traction, and all the Little Bad ones out there fall into line even though it never, ever ends well.
Looking at something’s potential future power causes that power to manifest here and now, thus causing that future power. Even if that future probability is very low, the penalty extracted is promised to be super high, so are you going to take that chance?
It is the essence of much of politics.
A Community I know has an excellent word for this. They call it a Basilisk.
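The self-fulfilling dynamic above can be sketched as a small fixed-point calculation. This is only an illustration under invented assumptions: the function name `equilibrium_compliance`, the linear link between the share of people who fall in line and the villain's chance of winning, and all the numbers are hypothetical and not taken from any real model.

```python
def equilibrium_compliance(costs, penalty, base_p, max_steps=100):
    """Iterate beliefs to a fixed point and return the share who fall in line.

    costs:   each person's cost of falling in line now (hypothetical units).
    penalty: what the villain promises to extract from holdouts if he wins.
    base_p:  his chance of winning if nobody falls in line.
    """
    share = 0.0
    for _ in range(max_steps):
        # Assumed link: the more people comply, the likelier he is to win.
        p_win = base_p + (1 - base_p) * share
        # A person complies when the expected penalty exceeds their cost.
        new_share = sum(p_win * penalty > c for c in costs) / len(costs)
        if new_share == share:
            break
        share = new_share
    return share


costs = list(range(1, 11))  # ten people with costs 1..10

# A large promised penalty tips everyone, despite a 1.5% starting chance.
cascade = equilibrium_compliance(costs, penalty=100, base_p=0.015)
# Halve the penalty and nobody moves, so the power never manifests.
no_cascade = equilibrium_compliance(costs, penalty=50, base_p=0.015)
```

In this toy setup the first person to fold raises the villain's odds enough that everyone else folds too, while a slightly smaller threat gets no traction at all.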
A great illustration of both sides of this is this prominent Twitter post by the President.
As a set of two decisions these are obviously correct decisions. We will defend every inch of NATO territory because to do otherwise collapses all of our commitments and leads to chaos, which would eventually force escalations anyway, so even if you don’t do it ‘on principle’ you do it anyway. And yes, of course, all the talk of a ‘no-fly zone’ or other direct confrontation with Russia is completely off the table for damn good reasons.
And yet, tons of people who agree with both of these positions absolutely hated the ‘decision’ to post this, and the mindset that would think this was an acceptable thing to post, mostly using variations of this argument.
I shared Nixon’s interpretation below of what Biden intended, but that doesn’t mean that’s what people saw.
Biden is being the ‘rational’ party here, doing the ‘right things’ and making the ‘right decisions’ but also can easily be interpreted as giving an invitation to be walked all over.
That’s especially true if you read it ‘as if you were Putin’ using some sort of Inner Putin. If we won’t fight a war in Ukraine, and indeed won’t ‘directly confront’ Russia period aside from NATO (even if we include other explicit defensive pacts like Japan), and if Russia assumes that only direct confrontation matters, then Russia is free to do anything at all that it wants in Ukraine, including things like massacre civilians, use biological or chemical weapons, or perhaps even nuclear weapons, in order to cow the Ukrainians into giving up.
And it also implies that if they continued to Moldova, then Kazakhstan, then all the other non-NATO former USSR countries, again, so what?
Before the invasion began, Biden said explicitly we wouldn’t directly intervene but promised ‘severe consequences.’ Putin presumably interpreted this as mostly a bluff. Sure, there’d be some sanctions, but nothing he couldn’t handle. That was through some combination of:
- Having the mafioso nature and thus always escalating against those who lack this nature, on the assumption they always fold and are always bluffing.
- Not being able to recognize the impact sanctions would have and thinking he was prepared because no one is capable of telling Putin the truth.
- Not anticipating the scope of the sanctions.
- Expecting a fait accompli to render the issue moot within a few days.
- Assuming such things were inevitable either way.
- (Maybe) Caring enough about getting Ukraine to not care.
- (Maybe) Thinking threats of nukes would stop us from even doing that.
Given Putin already blew past one ‘severe consequences’ stop sign under false impressions and/or because he didn’t care, it’s a reasonable position to doubt he’d care about another one and treat this as a green light.
Or, more precisely, from the perspective of the mafioso nature, this is Biden explicitly pointing out the light is green, and such talk only reveals that this person is weak and always bluffing. Biden being Biden already made the light green: he won’t let stupid public opinion or political grandstanding force his hand when this much is at stake. That is to his credit, but it being common knowledge is also a problem – it means Biden has destroyed his potential commitment devices.
Last time, I thought about the question: Why does the public support a no-fly zone? When it means NATO and Russia shooting at each other, and thus often leads directly to World War III?
Some of it is people not understanding that this is what a no-fly zone means. People simply see it as a Something, therefore we must do it, or a Something that is more than whatever we’re doing but less than ground troops, or something they remember us doing in other situations where something we didn’t like was happening. And I suggested that maybe many people don’t much care about living anymore, or think the world is ending soon anyway, or similar things. That is something I’ve worried about for a while now and would really like to understand better some time. The whole thing seems quite terrible and I don’t get why it’s happening.
On another level, one could also answer that the American public supports a ‘no-fly zone’ because decision theory. Handed down in the form of culture and instinct and system design rather than explicit theory, but with the same effect.
You want your leader to be capable of making good decisions. You also want your leader to have access to various commitment devices, and be able to credibly make threats that they will take actions that might from some points of view not strictly make sense.
It is the public’s role to play in the system to get angry, and to want to punish the offender, and demand that Something be done. Then if it didn’t work, demand Something more. The point of the public supporting locally dumb decisions is so that potential villains know that if they push too hard, not giving in to them stops being locally dumb from the perspective of public opinion, and starts looking locally smart, then looking locally necessary. This is a built-in, impossible-to-control push to escalate. It has limited authority, so when it’s sufficiently over the top stupid it can be ignored, but it matters. At a minimum, it is used to justify lots of other actions in lieu of the thing the public says they support.
This has worked in dragging us along farther and faster than the government ‘wanted’ to otherwise go, and in establishing that it will do that again in the future.
Yet it also means that the public’s job is to be unhappy with the situation, even if it supports the individual decisions, which again points back to the gap in evaluations.
In a way, we perhaps outsource the mafioso nature to this kind of distributed public and to social media and various other dynamics, thus allowing some benefits of the nature without bearing some of its costs. One problem with this dynamic is that it is easy for those with the nature to not notice this, and thus execute the wrong program (I originally wrote miscalculate, but this nature does not calculate).
This has a lot in common with Presidents being known to be rewarded or punished largely on the basis of whether conditions are improving or not and the state of the economy. The President doesn’t have that much control over economic conditions, but it is a physical world hard-to-fake measure of how things are going. Throwing the bums out when it looks bad isn’t all that accurate, but giving a bunch of weight to it is better than getting your process hijacked. This in turn leads to hijack by the dialectic, among other problems, but it’s at least a start.
Contrast that with the latest entry in people claiming that Trump said the latest candidate for the craziest thing he ever said, which slightly but importantly mischaracterizes a thing he did indeed say (45 second video).
The claim, from the Washington Post:
Former president Donald Trump, meanwhile, suggested on Fox News Thursday night that Biden should respond to the invasion by personally threatening to obliterate Russia with nuclear weapons. He decried Biden as weak for failing to do so.
The actual thing he says that he would say is:
- We (as opposed to Russia) are a nuclear nation.
- We have built up a bigger, better
- We ‘don’t want to have to wipe out Russia.’
Note that he didn’t say that Biden should say that. He said he would have said it.
He knows Biden can’t say that. No one would take it seriously.
It would be like me walking into the local pizza place and saying “Nice pizza place you have here. Shame if something were to happen to it.” And the host would say thank you, I agree with your statement, would you like a table? Then either I would notice that didn’t work at all and leave, or I would enjoy a delicious pizza for Pi Day, leave a nice tip and go.
Trump has the mafioso nature in this sense. This allows his brain to generate the hypothesis that he should threaten nuclear annihilation, that makes his threat plausibly more salient than his enemy’s threats, and that makes it thinkable to worry about whose nuclear arsenal is ‘bigger and better’ and causes the rubble to bounce additional times. And it automatically translates the words into the mafioso language.
Putin’s brain works the same way here. Worth noticing the previous clip also, where Trump claims Putin said he wouldn’t invade Ukraine while Trump was in office.
There’s always the question of what happens when two people with this nature face off. Sometimes there’s a fight (or war) and someone ends up injured or dead, but not that often. Thus, there’s usually some combination of looking into the future to see who would win the fight resulting in the future loser giving in now, thus establishing an efficient dominance hierarchy, and a general tendency to notice each other’s natures and thus conspire together against anyone lacking a similar nature.
If you divide the world into those having and lacking the mafioso nature, as those with this nature seem to often do, then it makes sense to align with those who have it against those who don’t. If you’re familiar with the background, one can compare and contrast this with the idea that those with the maze nature align with each other to increase what I call ‘maze levels.’ The dynamics are at least similar.
This all goes hand in hand. In order to speak this language and wield this leverage you need to provide overwhelming evidence that you have the mafioso nature, that you will endlessly escalate until confronted by a superior force that is also willing to endlessly escalate. The only known way to do this involves actually having the mafioso nature. That’s a problem, because then you’ll think in zero-sum terms and also make dumb decisions.
We would like the best of both worlds. It would be great to get someone who:
- Wants things that are good and positive sum.
- Makes smart decisions rather than dumb decisions.
- Benefits from the belief you have the mafioso nature, or otherwise might limitlessly escalate or do something disproportional if provoked.
By default, you get one. You can sometimes get two. But three?
Alas, that seems to mostly be impossible.
There are two possible approaches to solving the obvious contradiction.
The first and most obvious is to pretend to make dumb decisions and/or to want bad things.
This has been tried, most notably by Nixon with his Madman Theory. Nixon was trying to pretend to make dumb decisions while instead making smart decisions. Yet his decisions led directly to his resignation (and also he imposed price controls) so while I’m a big fan of the Richard Nixon Twitter account, I don’t think the real Nixon did a great job in the ‘make smart decisions’ department.
Trump and Putin are the other potential examples listed in the linked Wikipedia article, and those also seem like examples of people who made poor decisions.
It is a weird and potentially incoherent sentence to say that if I were Putin or Trump I would have been able to make importantly much better decisions. But put me in their shoes and I couldn’t actually play their part at all. No chance. My ability to see that many of their decisions are dumb goes hand in hand with not being able to mimic their general ways of being.
That does not seem like a coincidence. Such things are very hard to fake, and one who fakes them or learns to go through such motions usually Becomes the Mask. This is a central theme of the Moral Mazes sequence.
The much easier way to wear a mask is for it not to be, or to no longer be, a mask.
The whole idea of Madman Theory is that from a traditional perspective you want people to think that you don’t mind bad things, and might make very dumb decisions.
‘Dumb decisions’ is also shorthand and imprecise. What you want is to display very specific types of patterns and tendencies. You want a reputation for making a particular kind of locally dumb decision, in order to get those around you to engage in the behaviors you want, in ways that imply you’ll do this even when the logic behind it stops applying. To some extent this is about controlling their incentives, but to a greater extent it’s about controlling their perception of their highly localized incentives and scaring them into some combination of doing or not doing even things they think you might care about or might in the future care about, combined with a general paralysis and fear of doing anything at all they weren’t told to do.
This creates a non-local incentive for others to want nothing to do with such a person, to make them go away, and to keep them from having power, especially power over those the others care about.
It also is a Basilisk, as discussed above.
In practice, I think this approach is hopeless.
It does not work. You don’t get people pretending to make dumb decisions. You don’t get people who can make locally dumb decisions with a global purpose, and whose overall decisions are smart.
What you get are adaptation executers. The type of person who acts like Putin gets rewarded, so someone with that kind of decision theory and decision process and way of being is what you get, and that person keeps getting trained on further hill climbing around that set of behaviors. This set of behaviors is then incompatible with many types of action and decision that would very much help such a person’s cause, but the selection process that got them ahead not only does not sufficiently care about that, it views such abilities with suspicion since they are Bayesian evidence this is the wrong type of person.
The mafioso playing the role of a mafioso who can actually do high level nerd stuff like maintaining complex operations (or that cares about good things, or even any neutral things) does not win because of it, they lose because of it due to the reactions of others, and this more than makes up for the advantages of being able to do the nerd stuff, so it gets trained out of them if it was ever there to begin with.
If anything, the opposite is trained – conspicuous lack of such abilities. The need to avoid both motive ambiguity and beyond that character ambiguity requires a maximally strong stance against such things, a deliberate botching of such things.
Which, of course, leads to dumb decisions.
Pretending to want bad things rather than good things is a potential alternative approach. Politicians and others do sometimes pull versions of this off, pretending until they are in position to do a Heel Face Turn, although more often they pretend long enough to stop having principles.
On top of that, if you pretend to want bad things, this in practice usually leads to also having to pretend to make dumb decisions, or else your story won’t be believable. Refusing to make dumb decisions is a lot like refusing to cause bad outcomes. It blows your cover. So now you have two problems.
The second possible approach is to shift who is rewarded for who they are and what they are expected to do.
To do this, you need to create what I call The Good Equilibrium where the type of people who benefit from what they are, are the types of people who make smart decisions and want good things.
Such an equilibrium can absolutely be self-sustaining, in the right context.
The link above talks about Chris Pikula, who was central to moving Magic: The Gathering high level competition to The Good Equilibrium. This must continuously be fought for – it is a Republic if you can keep it – but we did in fact keep it, and it was pretty great.
We had a lot of big advantages that helped us do that, such as:
- Heavy nerd stuff and complex operations were central.
- Dumb decisions were a big liability.
- An authority could kick offenders out.
- Cooperation and creation were inherently necessary to success.
- Bad actors could not credibly threaten to do much damage.
In theory, one could extend this to the world stage. And in theory, that theory is kind of being tested right now by The West.
On the margin, this test has absolutely failed, and we will continue to reward cartoon villainy. Even if Putin loses and is forced to make peace on Ukrainian terms, he still did much, much better at his goal of conquering Ukraine or his broader goal of reconstituting the Russian Empire than he would have done if he had not threatened nuclear escalation, inflicted civilian casualties, imprisoned protesters and the opposition, gone in heavily for propaganda and otherwise played his role, on top of the advantages such plays have internally in maintaining control of the Russian Federation.
A central reason this plan failed, in my model and to the extent it has indeed failed, is because the same process that leads to the actions we are rewarding is indeed incapable of accurate information flow, incentive alignment, complex operations, avoiding corruption or generally not making dumb decisions.
The central feature of escalatory strategies is that if they seem like they are working, they will continue to escalate until confronted.
Someone who is not playing such a strategy could choose a good time to stop escalating, before one is confronted, and take the win. But those whose adaptation executions are centered on escalation lack this ability. The two go hand in hand. You see this with lots of similar patterns, whenever someone ‘gets away’ with things, such as when players cheat at Magic: The Gathering. It’s no different. At each stage, the person whose character says ‘cheat’ will think they of course should cheat a little more, eventually even if the circumstances are quite risky, up until they are finally caught. ‘Looking the other way’ to avoid confrontation only postpones it.
The invasion of Ukraine also required physical-world success. Wars are proof of work, invasions are not abstractions. Shapes had to be rotated.
The same type of process has been winning control of various surrounding countries and then adventuring into Ukraine for many centuries now, and they’ve learned not to take kindly to it, which also helped a lot.
So in all these ways, the same characteristics that allowed the invasion of Ukraine to happen also prevented the attempt from being set up to succeed. So the plan failed (assuming it indeed failed).
Pick an Exit
In the decision theory model presented here, the important thing in situations like the one we are in is to be the type of decision algorithm that responds to threats of escalation, and to actual escalation, with confrontation.
Yet it must be noted that in most confrontations in my life, I do the opposite of this.
There are certainly times when I respond to confrontation and escalation with confrontation and escalation, including times when I’m looked at as dumb or crazy for doing so. I’ve burned quite a bit of value rather than put up with those who said they were altering the deal and that I should pray they do not alter it any further. I’ve turned down deals that were in my interest to take, all the time, because they were ‘not good deals’ or not ‘fair.’
And I do think a lot about incentive alignment of my actions in various contexts, and take it into account.
So yes, sometimes I practice what is suggested here.
But in plenty of other contexts? Not so much. Not big on confrontation.
This requires an explanation. Why the difference?
The first big obvious difference is that mostly I am not playing iterated games with the same players unless I want to do that. If someone exploiting you leads to them learning to exploit you even more, that’s a strong reason to avoid being exploited. If someone only gets to exploit you once, and it’s an isolated exchange, that’s less worrisome.
You do still need to worry you’re part of a pattern that encourages that type of action.
One good response, when you’re not forced into repeated interactions, is simply to let them ‘win’ the current exchange through such tactics, but then to walk away and never interact with them again, and when asked about them to speak ill.
This does not work so well if you are one of a limited pool that is forced to interact – a prison, a community of nations, or even somewhere with costly exit, such as the middle management of a corporation, a small town or a family. Then you have the problem of reputation, both with them and observers, and of things continuously getting worse.
Exit is thus a big deal. If there is reasonable exit, then extractive and escalatory strategies don’t dominate in the long term because playing them changes the pool of future opponents towards other extractive and escalatory strategies.
Imagine an Iterated Prisoner’s Dilemma.
(Quick refresher: Each round, players choose Cooperate or Defect. If both Cooperate, each gets $2; if only one Defects, the defector gets $3; if both Defect, each gets $1. Then you play more rounds. Each round, assuming no correlation between your choice and their choice, you do better playing Defect, but long-term you want to get to where both players always Cooperate and make $2/round, or do better.
Doing better means something like sometimes defecting, but getting your opponent to still always cooperate because if they don’t you’ll defect even more and it will be worse. That’s the essence of much escalatory strategy.)
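The refresher can be written down as a tiny payoff table. This is a minimal sketch, not a definitive formalization: the $0 payoff for a lone cooperator is an assumption (the refresher only gives the $2, $3 and $1 cases), and the names `payoff` and `extortion_rates` are hypothetical helpers invented for illustration.

```python
# Payoffs from the refresher. SUCKER is an assumption: the text does not say
# what a lone cooperator receives, so the conventional $0 is used here.
REWARD, TEMPTATION, PUNISHMENT, SUCKER = 2, 3, 1, 0


def payoff(me, them):
    """Return my dollar payoff for one round; moves are 'C' or 'D'."""
    table = {
        ("C", "C"): REWARD,
        ("D", "C"): TEMPTATION,
        ("D", "D"): PUNISHMENT,
        ("C", "D"): SUCKER,
    }
    return table[(me, them)]


def extortion_rates(p):
    """Expected per-round payoffs (extorter, victim) when the extorter defects
    with probability p against a partner who always cooperates."""
    extorter = (1 - p) * payoff("C", "C") + p * payoff("D", "C")
    victim = (1 - p) * payoff("C", "C") + p * payoff("C", "D")
    return extorter, victim


# Whatever the other player does, Defect pays more in a single round...
defect_dominates = all(payoff("D", t) > payoff("C", t) for t in "CD")
# ...yet both players always cooperating beats both always defecting.
cooperation_better = payoff("C", "C") > payoff("D", "D")
```

Under these assumed numbers, `extortion_rates(0.25)` returns `(2.25, 1.5)`: the occasional defector ‘does better’ than $2/round, while the partner still earns more than the $1/round that mutual defection would pay, which holds for any defection rate below one half.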
Now change it by allowing exit. After playing each round, each player then chooses S (Stay) or L (Leave). If they choose Stay, they play again with the same player. If they choose Leave, they get another random person who chose Leave as their new partner.
You don’t need any advanced decision theories to know that if both players Cooperate, both players will almost always pick Stay. If both players pick Defect, it seems almost always one or both will choose Leave. If it’s one of each, it depends, but if either player Defects a bunch of times in a row they’ll be getting a new partner for sure, cause there’s nothing to lose. Any threat to go super hard on someone is met with a ‘bye then.’
You might still be able to do slightly better than $2/round, if you offer something like only defecting a small percentage of the time, and the number of total rounds is limited or scoring has a discount rate.
But mostly what happens is, the cooperators pair off with each other. All other players keep bouncing around the pool until they change their ways. A few very unlucky cooperators do very very badly, but most do well, and all the defectors do poorly.
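A rough simulation makes the sorting visible. It is a sketch under strong simplifying assumptions, none of which come from the text: every player has a fixed move (always Cooperate or always Defect), a pair Stays only after mutual cooperation, a lone cooperator gets $0, and the population sizes, round count and seed are arbitrary. The function name `simulate_exit` is hypothetical.

```python
import random

# (first player's payoff, second player's payoff) for one round. The $0 for a
# lone cooperator is an assumption; the text gives only the $2/$3/$1 cases.
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}


def simulate_exit(n_coop=50, n_defect=50, rounds=200, seed=0):
    """Average per-round payoff by type when anyone may Leave after a round."""
    rng = random.Random(seed)
    moves = ["C"] * n_coop + ["D"] * n_defect  # fixed move per player
    totals = [0] * len(moves)
    pairs = []                                 # partnerships that chose Stay
    pool = list(range(len(moves)))             # players seeking a partner
    for _ in range(rounds):
        rng.shuffle(pool)
        while len(pool) >= 2:                  # randomly pair up the leavers
            pairs.append((pool.pop(), pool.pop()))
        # With an odd pool, one player sits the round out and earns nothing.
        staying = []
        for a, b in pairs:
            pay_a, pay_b = PAYOFFS[(moves[a], moves[b])]
            totals[a] += pay_a
            totals[b] += pay_b
            if moves[a] == moves[b] == "C":
                staying.append((a, b))         # mutual cooperation: both Stay
            else:
                pool.extend((a, b))            # otherwise the pair dissolves
        pairs = staying

    def average(kind):
        scores = [t for t, m in zip(totals, moves) if m == kind]
        return sum(scores) / (len(scores) * rounds)

    return {"C": average("C"), "D": average("D")}


result = simulate_exit()
```

With these settings the cooperators end up averaging close to $2/round, and the defectors close to $1/round, because the pool they keep returning to is soon made up almost entirely of other defectors. The cooperators who are slowest to find each other are the ones who do badly.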
Suppose now we introduce some exit fee to choose Leave, and notice how the dynamics change as it rises.
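One hedged way to see how the fee matters is to ask how much defection a cooperating partner would tolerate before walking away. The sketch below rests on assumptions that are not in the text: a $0 payoff for a lone cooperator, a one-off fee, a fixed number of rounds remaining, and the optimistic premise that a new partner would always cooperate. The function name `max_tolerated_defection_rate` is hypothetical.

```python
def max_tolerated_defection_rate(exit_fee, rounds_left,
                                 reward=2.0, sucker=0.0, punishment=1.0):
    """Largest defection rate p a cooperating partner would put up with.

    Staying and cooperating earns reward - p * (reward - sucker) per round.
    Leaving earns reward per round (assuming the next partner cooperates)
    minus the one-off exit fee. Defecting back forever earns punishment.
    """
    # Stay beats Leave while p * rounds_left * (reward - sucker) <= exit_fee.
    leave_threshold = exit_fee / (rounds_left * (reward - sucker))
    # Stay beats retaliating while reward - p * (reward - sucker) >= punishment.
    retaliate_threshold = (reward - punishment) / (reward - sucker)
    return min(leave_threshold, retaliate_threshold, 1.0)


free_exit = max_tolerated_defection_rate(exit_fee=0, rounds_left=100)
modest_fee = max_tolerated_defection_rate(exit_fee=10, rounds_left=100)
trapped = max_tolerated_defection_rate(exit_fee=1000, rounds_left=100)
```

In this toy model free exit leaves no room for extraction, a modest fee allows a little, and a prohibitive fee pushes the limit up to the point where the victim would rather defect back than keep cooperating.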
In most of life the fee for exit isn’t zero, but it mostly is also not so large, at least at first. So anyone who interacts with me has an actual incentive to cooperate and create value. Some do not realize this, but that’s fine, because we’ll find them out quickly.
Another difference is that observed decisions are different here than unobserved decisions. That is a strange sentence to write, because what’s the point of taking advantage of unobserved decisions if you tell everyone what you did? Now suddenly you’ve been effectively observed. That’s no good. I thought about it, and realized that in my case it was fine, because I’m sufficiently robust about caring about being sure I am not rewarding bad actors, and because the details about exactly when and how I decide what to do need not be included.
I talk about this in my old post on Privacy. This is the importance of privacy of your state of mind, your plans and your intentions. There are many times and places where it is beneficial to be open about what you are doing. Sometimes, yes, absolutely tell them what you won’t do. Other times, don’t, and let them wonder, maybe intentionally even don’t be so sure yourself.
Most people can afford to be somewhat less careful in this particular way. And we all have to choose some actions which are not incentive compatible, or would not be robust to someone fully understanding our decision policy. This is real life, not some future AI scenario against opposition that sees your source code. If you’re fully and completely unexploitable, you are being insufficiently exploited.
I would, of course, love to write about a whole bunch of particular ways one can profitably be exploitable, but there are obvious reasons why one cannot, on the public internet, do this. A shame.
The other reason, of course, is that humans cannot fully optimize, and we cultivate particular habits and become our masks. When I had to run a company and constantly deal with these questions in practice, it was super stressful, and I very much did not enjoy these aspects of the job. That doesn’t mean I would never do it again, but I would do it with my eyes open, and because the job was too important not to do.
Yet some of this is simply not being able to fully execute correct strategy in practice, again because humans have characters and emotions and limited compute and lose their nerve. I enjoy playing poker, and I’m not bad at playing poker, but I’m better at playing poker in theory (although still not professional level or anything) than I am in practice, because I am often capable of finding plays abstractly more easily than I am capable of pulling the trigger in practice. Of course, I could be wrong about that – perhaps I think I think a good game, but actually it would blow itself up.
Another way of viewing all this is that one only needs the mafioso nature when forced to deal with the mafioso nature. It is common advice, when being sent to prison, to be told to display aspects of this nature as a form of self-defense, because exit is no longer an option and thus the dynamics involved will dominate. There are also times in regular life when one must deal with such agents, and exit is not a good option – there’s too much at stake or they’ve made it impossible. At that point, one needs to respond in a sort of kind, or even better establish that you would do so, a doctrine of escalate-to-de-escalate. Whereas most of the time this is unnecessary.
Conclusion: Pick a Policy
The Mafioso Nature, and the dynamics involved in it, seem more and more important to fully understanding the world, both for Russia and in the world more generally, as I explore these concepts more, which is why I’m willing to explore it at this length. I also want to understand better how it fits in with the Maze Nature, and other similar dynamics. Is it all one thing, or not? I don’t yet know.
The central puzzle is how to design a response to such problems.
My last Ukraine post talked about various policies we might collectively adopt (for some value of ‘we’) in places like energy or immigration or regulation or taxation, that would be wins if we could have such policies. One response I got that seems valid is to point out that our society is mostly incapable of having policies as thought about in this way. There’s no mechanism to implement this sort of thing.
When we want to ‘pick a policy’ for the kinds of things discussed here – to have a better decision policy – it’s even worse. Democracies have deep difficulties making even ordinary credible commitments of limited and well-defined scope, let alone the kinds of things discussed here. What we have are various dynamics that lead to various things in ways others can hopefully predict and model, and react to in ways we hope to like.
And of course a lot of this has been driving home that character is fate. These things are difficult to fake, so one must choose character and then accept the resulting fate – there mostly is no ‘present as having the X nature and do X-style things when it’s good but secretly have the Y nature instead and use Y-based processes that are better.’ What you perhaps can do instead is construct and have the Z-nature (or Y-prime-nature) that has the necessary characteristics and can handle the necessary dynamics, because it’s thought it all through and built a better model. The only way out is through. Sometimes that involves incentive manipulations using things outside your control. This problem is not new, and we have developed various dynamics to defend ourselves against it.
The alternative is to allow the future to belong to men like Putin.