FLI put out an open letter, calling for a 6 month pause in training models more powerful than GPT-4, followed by additional precautionary steps.
Then Eliezer Yudkowsky put out a post in Time, which made it clear he did not think that letter went far enough. Eliezer instead suggests an international ban on large AI training runs to limit future capabilities advances. He lays out in stark terms our choice as he sees it: Either do what it takes to prevent such runs or face doom.
A lot of good discussions happened. A lot of people got exposed to the situation that would not have otherwise been exposed to it, all the way to a question being asked at the White House press briefing. Also, due to a combination of the internet being the internet, the nature of the topic and the way certain details were laid out, a lot of other discussion predictably went off the rails quickly.
If you have not yet read the post itself, I encourage you to read the whole thing, now, before proceeding. I will summarize my reading in the next section, then discuss reactions.
This post goes over:
- What the Letter Actually Says. Check if your interpretation matches.
- The Internet Mostly Sidesteps the Important Questions. Many did not take kindly.
- What is a Call for Violence? Political power comes from the barrel of a gun.
- Our Words Are Backed by Nuclear Weapons. Eliezer did not propose using nukes.
- Answering Hypothetical Questions. If he doesn’t he loses all his magic powers.
- What Do I Think About Yudkowsky’s Model of AI Risk? I am less confident.
- What Do I Think About Eliezer’s Proposal? Depends what you believe about risk.
- What Do I Think About Eliezer’s Answers and Comms Strategies? Good question.
What the Letter Actually Says
I see this letter as a very clear, direct, well-written explanation of what Eliezer Yudkowsky actually believes will happen, which is that AI will literally kill everyone on Earth, and none of our children will get to grow up – unless action is taken to prevent it.
Eliezer also believes that the only known way that our children will grow up is if we get our collective acts together, and take actions that prevent sufficiently large and powerful AI training runs from happening.
Either you are willing to do what it takes to prevent that development, or you are not.
The only known way to do that would be governments restricting and tracking GPUs and GPU clusters, including limits on GPU manufacturing and exports, as large quantities of GPUs are required for training.
That requires an international agreement to restrict and track GPUs and GPU clusters. There can be no exceptions. Like any agreement, this would require doing what it takes to enforce the agreement, including if necessary the use of force to physically prevent unacceptably large GPU clusters from existing.
We have to target training rather than deployment, because deployment does not offer any bottlenecks that we can target.
If we allow corporate AI model development and training to continue, Eliezer sees no chance there will be enough time to figure out how to have the resulting AIs not kill us. Solutions are possible, but finding them will take decades. The current willingness by corporations to gamble with all of our lives as quickly as possible would render efforts to find solutions that actually work all but impossible.
Without a solution, if we move forward, we all die.
How would we die? The example given of how this would happen is using recombinant DNA to bootstrap to post-biological molecular manufacturing. The details are not load bearing.
These are draconian actions that come with a very high price. We would be sacrificing highly valuable technological capabilities, and risking deadly confrontations. These are not steps one takes lightly.
They are, however, the steps one takes if one truly believes that the alternative is human extinction, even if one is not as certain of this implication as Eliezer.
I believe that the extinction of humanity is existentially bad, and one should be willing to pay a very high price to prevent it, or greatly reduce the probability of it happening.
The letter also mentions the possibility that a potential GPT-5 could become self-aware or a moral patient, which Eliezer felt it was morally necessary to include.
The Internet Mostly Sidesteps the Important Questions
A lot of people responded to the Time article by having a new appreciation for existential risk from AI and considering its arguments and proposals.
Those were not, as they rarely are, the loudest voices.
The loudest voices were instead mostly people claiming this was a call for violence, or launching attacks on anyone saying it wasn’t centrally a ‘call for violence’, conflating willingness to conduct an airstrike as a last resort to enforce an international agreement with calling for an actual airstrike now, and often trying to associate anyone who associates with Eliezer with terrorism and murder and nuclear first strikes and complete insanity.
Yes, a lot of people jump straight from ‘willing to risk a nuclear exchange’ to ‘you want to nuke people,’ and then act as if anyone who did not go along with that leap was being dishonest and unreasonable.
Or making content-free references to things like ‘becoming the prophet of a doomsday cult.’
Such responses always imply that ‘because Eliezer said this Just Awful thing, no one is allowed to make physical world arguments about existential risks from super-intelligent AIs anymore, such arguments should be ignored, and anyone making such arguments should be attacked or at least impugned for making such arguments.’
Many others responded by restating all the standard Bad AI NotKillEveryoneism takes as if they were knockdown arguments, including the all-time classic ‘AI systems so far haven’t been dangerous, which proves future ones won’t be dangerous and you are wrong, how do you explain that?’ even though no one involved predicted that systems like the current ones would be dangerous.
An interesting take from Tyler Cowen was to say that Eliezer attempting to speak in this direct and open way is a sign that Eliezer is not so intelligent. As a result, he says, we should rethink what intelligence means and what it is good for. Given how much this indicates disagreement and confusion about what intelligence is, I agree that this seems worth doing. He should also consider the implications of saying that high intelligence implies hiding your true beliefs, when considering what future highly intelligent AIs might do.
It is vital that everyone, no matter their views on the existential risks from AI, stand up against attempts to silence debate, and instead address the arguments involved and what actions do or don’t make sense.
I would like to say that I am disappointed in those who reacted in these ways. Except that mostly I am not. This is the way of the world. That is how people respond to straight talk that they dislike and wish to attack.
I am disappointed only in a handful of particular people, of whom I expected better.
One good response was from Roon.
Genuinely appreciate the intellectual honesty. I look down my nose at people who have some insanely high prediction of doom but don’t outright say things like this.
What Is a Call for Violence?
I continue to urge everyone not to choose violence, in the sense that you should not go out there and commit any violence to try and cause or stop any AI-risk-related actions, nor should you seek to cause any other private citizen to do so. I am highly confident Eliezer would agree with this.
I would welcome at least some forms of laws and regulations aimed at reducing AI-related existential risks, or many other causes, that would be enforced via the United States Government, which enforces laws via the barrel of a gun. I would also welcome other countries enacting and enforcing such laws, also via the barrel of a gun, or international agreements between them.
I do not think you or I would like a world in which such governments were never willing to use violence to enforce their rules.
And I think it is quite reasonable for a consensus of powerful nations to set international rules designed to protect the human race, that they clearly have the power to enforce, and if necessary for them to enforce them, even under threat of retaliatory destruction for destruction’s sake. That does not mean any particular such intervention would be wise. That is a tactical question. Even if it would be wise in the end, everyone involved would agree it would be an absolute last resort.
If one refers to any or all of that above as calling for violence then I believe that is fundamentally misleading. That is not what those words mean in practice. As commonly understood, at least until recently, a ‘call for violence’ means a call for unlawful violent acts not sanctioned by the state, or for launching a war or specific other imminent violent act. When someone says they are not calling for violence, that is what they intend for others to understand.
Otherwise, how do you think laws are enforced? How do you think treaties or international law are enforced? How do you think anything ever works?
Alyssa Vance and Richard Ngo and Joe Zimmerman were among those reminding us that the distinction here is important, and that destroying it would destroy our ability to actually be meaningfully against individual violence. This is the same phenomenon as people who extend violence to other non-violent things that they dislike, for example those who say things like ‘silence is violence.’
You can of course decide to be a full pacifist and a libertarian, and believe that violence is never justified under any circumstances. Almost everyone else thinks that we should use men with guns on the regular to enforce the laws and collect the taxes, and that one must be ready to defend oneself against threats both foreign and domestic.
Everything in the world that is protected or prohibited, at the end of the day, is protected or prohibited by the threat of violence. That is how laws and treaties work. That is how property works. That is how everything has to work. Political power comes from the barrel of a gun.
As Orwell put it, you sleep well because there are men with guns who make it so.
The goal of being willing to bomb a data center is not that you want to bomb a data center. It is to prevent the building of the data center in the first place. Similarly, the point of being willing to shoot bank robbers is to stop people before they try and rob banks.
So what has happened for many years is that people have made arguments of the form:
- You say if X happens everyone will die.
Followed by one of:
- Yet you don’t call for violence to stop X. Curious!
- Yet you aren’t calling for targeted assassinations to stop X. Curious!
- Your words are going to be treated as a call for violence and get someone killed!
Here’s Mike Solana saying simultaneously that the AI safety people are going to get someone killed, and that they do not believe the things they were saying, because if he believed them he would go get many someones killed. He expanded this later to full post length. I do appreciate the deployment of both horns of the dilemma at the same time – if you believed X you’d advocate horrible thing Y, and also if you convince others of X they’ll do horrible thing Y, yet no Y, so I blame you for causing Y in the future anyway, you don’t believe X, X is false and also I strongly believe in the bold stance that Y is bad actually.
Thus, the requirement to periodically say things like (Eliezer on Feb 10):
Please note: There seems to be a campaign to FAKE the story that AI alignment theorists advocate violence. Everyone remember: *WE* never say this, it is *THEM* who find it so useful to claim we do – who fill the air with talk of violence, for their own political benefit.
And be it absolutely clear to all who still hold to Earth’s defense, who it is that benefits from talking about violence; who’d benefit even more from any actual violence; who’s talking about violence almost visibly salivating in hope somebody takes the bait.
It’s not us.
Followed by the clarification to all those saying ‘GOTCHA!’ in all caps:
Apparently necessary clarification: By “violence” I here mean individuals initiating force. I think it’s okay for individuals to defend their homes; I still want police officers to exist, though I wish we had different laws and different processes there (and have written at length about those);
I’ve previously spoken in favor of an international ban on gain-of-function research, which means that I favor, in principle, the use of police action or even military force to shut down laboratories working on superpathogens; and if there was an international treaty banning large AI training runs, I’d back it with all my heart, because otherwise everyone dies.
Or as Stefan Schubert puts it:
“There was a thread where someone alleged there had been discussions of terrorist violence vs AI labs. I condemn that idea in the strongest terms!”
“Ah so you must be opposed to any ambitious regulation of AI? Because that must be backed by violence in the final instance!”
Our Words Are Backed by Nuclear Weapons
It’s worth being explicit about nuclear weapons.
Eliezer absolutely did not, at any time, call for the first use, or any use, of nuclear weapons.
Anyone who says that either misread the post, is intentionally using hyperbole, is outright lying, or is the victim of a game of telephone.
It is easy to see how it went from ‘accepting the risk of a nuclear exchange’ and ‘bomb a rogue data center’ to ‘first use of nuclear weapons.’ Except, no. No one is saying that. Even in hypothetical situations. Stop it.
What Eliezer said was that one needs to be willing to risk a nuclear exchange, meaning that if someone says ‘I am building an AGI that you believe will kill all the humans and also I have nukes’ you don’t say ‘well if you have nukes I guess there is nothing I can do’ and go home.
Eliezer clarifies in detail here, and I believe he is correct, that if you are willing under sufficiently dire circumstances to bomb a Russian data center and can specify what would trigger that, you are much safer being very explicit under what circumstances you would bomb a Russian data center. There is still no reason to need to use nuclear weapons to do this.
Answering Hypothetical Questions
One must in at least one way have sympathy for developers of AI systems. When you build something like ChatGPT, your users will not only point out and amplify all the worst outputs of your system. They will red team your system by seeking out all the ways in which to make your system look maximally bad, taking things out of context and misconstruing them, finding tricks to get answers that sound bad, demanding censorship and lack of censorship, demanding ‘balance’ that favors their side of every issue and so on.
It’s not a standard under which any human would look good. Imagine if the internet made copies of you, and had the entire internet prompt those copies in any way they could think of, and you had to answer every time, without dodging the question, and they had infinite tries. It would not go well.
Or you could be Eliezer Yudkowsky, and feel an obligation to answer every hypothetical question no matter how much every instinct you could possibly have is saying that yes this is so very obviously a trap.
While you hold beliefs that logically require, in some hypothetical contexts, taking some rather unpleasant actions because in those hypotheticals the alternative would be far worse, existentially worse. It’s not a great spot, and if you are ‘red teaming’ the man to generate quotes it is not a great look.
Yosarian2: “Rationalist who believes in always answering the question” vs “people who love to ask weird hypothetical gotcha questions and then act SHOCKED at the answer” This is going to just get increasingly annoying isn’t it?
…
Eliezer: Pretty sure that if I ever fail to give an honest answer to an absurd hypothetical question I immediately lose all my magic powers.
So the cycle will continue until either we all die or morale improves.
I am making a deliberate decision not to quote the top examples. If you want to find them, they are there to be found. If you click all the links in this post, you’ll find the most important ones.
What Do I Think About Yudkowsky’s Model of AI Risk?
Do I agree with Eliezer Yudkowsky’s model of AI risk?
I share most of his concerns about existential risk from AI. Our models have a lot in common. Most of his individual physical-world arguments are, I believe, correct.
I believe that there is a substantial probability of human extinction and a valueless universe. I do not share his confidence. In a number of ways and places, I am more hopeful that there are places things could turn out differently.
A lot of my hope is that the scenarios in question simply do not come to pass because systems with the necessary capabilities are harder to create than we might think, and they are not soon built. And I am not so worried about imminently crossing the relevant capability thresholds. Given the uncertainty, I would much prefer if the large data centers and training runs were soon shut down, but there are more limits on what I would be willing to sacrifice for that to happen.
In the scenarios where sufficiently capable systems are indeed soon built, I have a hard time envisioning ways things end well for my values or for humanity, for reasons that are beyond the scope of this post.
I continue to strongly believe (although with importantly lower confidence than Eliezer) that by default, even under many relatively great scenarios where we solve some seemingly impossible problems, if ASI (Artificial Super Intelligence, any sufficiently generally capable AI system) is built, all the value in the universe originating from Earth would most likely be wiped out and humanity would not long survive.
What Do I Think About Eliezer’s Proposal?
I believe that, conditional on believing what Eliezer believes about the physical world and the existential risks from AI that would result from further large training runs, Eliezer is making the only known sane proposal there is to be made.
If I instead condition on what I believe, as I do, I strongly endorse working to slow down or stop future very large training runs, and imposing global limits on training run size, and various other related safety precautions. I want that to be extended as far and wide as possible, via international agreements and cooperation and enforcement.
The key difference is that I do not see such restrictions as the only possible path that has any substantial chance of allowing humans to survive. So it is not obviously where I would focus my efforts.
A pause in larger-model training until we have better reason to think proceeding is safe is still the obvious, common sense thing that a sane civilization would find a way to do, if it believed that there was a substantial chance that not pausing kills everyone on Earth.
I see hope in potentially achieving such a pause, and in effectively enforcing such international agreements without much likelihood of needing to actually bomb anything. I also believe this can be done without transforming the world or America into a ‘dystopian nightmare’ of enforcement.
I’ll also note that I am far more optimistic than most other people I talk to about the prospect of getting China to make a deal here, since a deal would very much be in China’s national interest, and in the interest of the CCP. If America were willing to take one for Team Humanity, it seems odd to assume China would necessarily defect and screw that up.
You should, of course, condition on what you believe, and favor the level of restriction and precaution appropriate to that. That includes your practical model of what is and is not achievable.
Many people shouldn’t support the proposal as stated, not at this time, because many if not most people do not believe AGI will arrive soon or are not worried about it, or do not see how the proposal would be helpful, and therefore do not agree with the logic underlying the proposal.
However, 46% of Americans, according to a recent poll, including 60% of adults under the age of 30, are somewhat or very concerned that AI could end human life on Earth. Common sense suggests that if you are ‘somewhat concerned’ that some activity will end human life on Earth, you might want to scale back the activity in question to fix that concern, even if doing that has quite substantial economic and strategic benefits.
What Do I Think About Eliezer’s Answers and Comms Strategies?
Would I have written the article the way Eliezer did, if I shared Eliezer’s model of AI risks fully? No.
I would have strived to avoid giving the wrong kinds of responses the wrong kinds of ammunition, and avoided the two key often quoted sentences, at the cost of being less stark and explicit. I would still have had the same core ask, an international agreement banning sufficiently large training runs.
That doesn’t mean Eliezer’s decision was wrong given his beliefs. Merely that I would not have made it. I have to notice that the virtues of boldness and radical honesty can pay off. The article got asked about in a White House press briefing, even if it got a response straight out of Don’t Look Up (text in the linked meme is verbatim).
It is hard to know, especially in advance, how much or which parts of the boldness and radical honesty are doing the work, which bold and radically honest statements risk backfire without doing the work, and which ones risk backfire but are totally worth it because they also do the work.
Do I agree with all of his answers to all the hypothetical questions, even conditional on his model of AI risk? No. I think at least two of his answers were both importantly incorrect and importantly unwise to say. Some of the other responses were correct, but saying them on the internet, or the details of how he said them, was unwise.
I do see how he got to all of his answers.
Do I think this ‘answer all hypothetical questions’ bit was wise, or good for the planet? Also no. Some hypothetical questions are engineered to and primarily serve to create an attack surface, without actually furthering productive discussion.
I do appreciate the honesty and openness of, essentially, open sourcing the algorithm and executing arbitrary queries. Both in the essay and in the later answers.
The world would be a better place if more people did more of that, especially on the margin, even if we got a lesson in why more people don’t do that.
I also appreciate that the time has come that we must say what we believe, and not stay silent. Things are not going well. Rhetorical risks will need to be taken. Even if I don’t love the execution, better to do the best you can than stand on the sidelines. The case had to be laid out, the actual scope of the problem explained and real solutions placed potentially inside a future Overton Window.
If someone asked me a lot of these hypothetical questions, I would have (often silently) declined to answer. The internet is full of questions. One does not need to answer all of them. For others, I disagree, and would have given substantively different answers, whereas if my true answer had been Eliezer’s, I would have ignored the question. For many others, I would have made different detail choices. I strive for a high level of honesty and honor and openness, but I have my limits, and I would have hit some of them.
I do worry that there is a deliberate attempt to coalesce around responding to any attempt at straight talk about the things we need to get right in order to not all die with ‘so you’re one of those bad people who want to bomb things, which is bad’ as part of an attempt to shut down such discussion, sometimes even referencing nukes. Don’t let that happen. I hope we can ignore such bad faith attacks, and instead have good discussions of these complex issues, which will include reiterating a wide array of detailed explanations and counter-counter-arguments to people encountering these issues for the first time. We will need to find better ways to do so with charity, and in plain language.
As I’ve said before: the relevant comparison point is the 2003 Iraq invasion, which was the last time a major government took military action as a preventive measure against a purely hypothetical and speculative, but supposedly apocalyptic, threat of future harm. Many of us do in fact believe now, and believed even more strongly at the time, that the supporters of that invasion were in essence supporting state terrorism and war crimes, and that the principal decision-makers who greenlit that invasion should have been tried as war criminals. So it’s hard to see a call to legitimize the taking of similar steps against peaceable people, with the goal of preventing future harms with even less direct evidence for their existence and even more reason for skepticism thereof, as anything but a call for terroristic and criminal state violence.
I mean the obvious actual direct historical parallel, if one wanted one, would presumably instead be Israel’s 1980 bombing of Iranian nuclear facilities, which by all accounts worked and didn’t involve an invasion or large-scale loss of life?
nit: Iraqi, not Iranian, I think you mean (assuming you have in mind Osirak 1981, aka “Operation Opera”)? If so, there’s a fair case to be made that even that constituted a war crime, much less destructive though it was compared to the 2003 invasion, and it was reasonably condemned at the time by plenty of non-anarchists/pacifists. And I for one no more trust the present powers that be to do less-stupid or more-surgical things than the Iraq invasion in response to apocalypse hysteria than, for example, to narrowly restrict TikTok instead of doing insane overreach like the RESTRICT Act.
Well, February 2022 also fits the bill quite well:
>a major government took military action as a preventive measure against a purely hypothetical and speculative, but supposedly apocalyptic, threat of future harm
(not commenting on if it’s a good idea in general – all things are not always equal)
Is there, for lack of a better term, knowledge graph on the AI x-risks debate?
Let me explain. I’m one of those people who believes a sufficiently advanced machine intelligence still needs to boot up a lot of infrastructure to end life on Earth and start dismantling the solar system due to some run-amok utility maximizer. Similar to how the first humans did not immediately land on the moon, but rather needed a few hundred thousand years of civilization to build up to it.
I guess this is a form of “slow takeover”? In that scenario I imagine we have ample time to figure out how to do the necessary alignment/interpretability work.
Stated another way, I’m pretty skeptical that a sufficiently advanced machine intelligence will spot basic fundamental laws of nature that have been under our noses this whole time but we were just too stupid to see, and that it’ll use that insight to teleport and send around infinitely powered nanobots to dismantle us post haste.
Anyway, I’m not trying to make that argument really, more like wondering where that catalogue of arguments is and the best for/againsts are?
My current position on that scenario is that we are sufficiently undignified that this does not save us. We could have what is clearly a rogue AGI loose on the internet doing who knows what, and we would be unable to coordinate to shut things down, and plenty of people would take its money to do things and not care – we’d have people yelling about what was happening but it wouldn’t change the outcome.
I don’t think there is a full argument catalogue. Certainly one could make an attempt to upgrade our situation on that front.
Yes, but such an “alien invasion” scenario is underestimating the Enemy. It’s not forced to work openly or to build a separate infrastructure. There are at least two completely separate paths to winning that don’t have the problems you mentioned:
– preemptive strike, where it decides that humans are a risk to its goals and it’s ok with giving up the current technological base and starting from scratch. The actual implementation can be something as simple as a respiratory virus with higher mortality and a longer period of being asymptomatic but infectious, coupled with financial market crash. Trivial to do, really.
– slow takeover, in which it simply becomes part of the fabric of civilization. This is particularly bad because for a very long time it is indistinguishable from a Good AI, except by its inner goals and intentions.
I don’t disagree with your conclusion, for the most part, but some nitpicks on method:
Your first scenario is a *little* bit short. It does need a reasonably-significant fully-automated industrial base to come back from that faster than humanity would (no virus can kill all humans before dying out, with the exception of a mind-controlling virus that makes people want to infect others, and it’s dubious whether that’s possible). Replace the virus with some form of ecocidal organism (e.g. nondigestible super-efficient alga which pulls all the carbon out of the atmosphere) and it works, though. Or combine it with said significant industrial base so that the survivors find themselves facing down an army.
Sorry, didn’t put everything I was thinking in the comment: a key difference is between destroying human civilization and eradicating all humans. The latter is orders of magnitude more difficult and not really necessary. What’s needed is making humans not a threat, and that is doable via email.
Also, your second scenario, ironically, *only* works if we’ve stopped developing better AI tech, or at least the AI taking over has control of such development. Otherwise, it gets replaced in a short timeframe by a more potent AI (which is, in the current paradigm, almost certainly misaligned with humanity but also with the previous AI).
Of course, this doesn’t make rushing ahead safe; an AI with enough understanding of the world to pull that kind of scheme has enough understanding of the world to realise it will almost certainly get turned off if it goes slow (or if it doesn’t attempt a takeover at all), and will hence attempt a fast takeover instead. (This could actually produce a “warning shot”, although it’s nowhere near guaranteed and as such I’d prefer to implement a neural-net ban without waiting for one.)
“How would we die? The example given of how this would happen is using recombinant DNA to bootstrap to post-biological molecular manufacturing. The details are not load bearing.”
They are load bearing in terms of how persuasive they are. I would have picked multiple examples, including some possible with currently known science/tech.
“There is still no reason to need to use nuclear weapons to do this [bomb a Russian data center].” If the data center is in a bunker, placed there due to foreign threats to bomb it–a nuclear weapon might be the only tool that could destroy it. It could also have a (nuclear) power supply in the bunker, making it self-sufficient and resistant to strikes on outside power lines.
The challenge of enforcing a treaty to prevent AI training should not be underestimated: It would be largely unprecedented. Hitting a country like Iraq, which had no means of striking back proportionally, bears little comparison to provoking a country with huge nuclear and bio-weapons arsenals. There is no real enforcement of the Biological Weapons Convention–never has been. This led to massive Soviet evasion of it. Given the speed at which hardware and algorithms are advancing, the surveillance capacity of an enforcement system would have to improve continuously over time. Do the training runs have to be done in a single large data center, or could they be done in ten or two hundred dispersed data centers? If training can be decentralized, as I suspect is true, if at some cost in speed/efficiency, then the tracking issues intensify. Much of the surveillance effort might, perforce, focus on the core talent pool for AI development–not a technically difficult effort, since that talent already lives in surveillance states beyond Orwell’s dreams.
The most challenging issue would probably be generating and sustaining *consistent* political will to stop violations. It’s particularly difficult for an average American citizen or politico or oligarch to imagine another nation striking a target in the American homeland. Consider that this is humiliating to the nation’s leaders and shocking to citizens. It could cause a country that recklessly tolerated extra-legal AI training centers to become a country that actively seeks to facilitate surreptitious versions of such training.
>It is easy to see how it went from ‘accepting the risk of a nuclear exchange’ and ‘bomb a rogue data center’ to ‘first use of nuclear weapons.’ Except, no. No one is saying that. Even in hypothetical situations. Stop it.
Edge case A
Prima: Stop building AI.
Secunda: No.
Prima: If you keep building AI we will blow up your datacentres.
Secunda: If you blow up our datacentres we will nuke your cities.
Q: Prima has three options – 1) back down, 2) blow up the datacentres and eat 300 countervalue nukes in retaliation, 3) blow up the datacentres and also alpha-strike Secunda’s nuclear forces, only eating 90 countervalue nukes in retaliation. Note that option #3 requires the first use of nuclear weapons due to the extreme hardening of nuclear siloes. Which should Prima do?
Edge case B
Prima: Stop building AI.
Secunda: No.
Prima: If you keep building AI we will blow up your datacentres.
Secunda: *starts building datacentres in hardened installations, or deploying IADS around them*
Q: Prima has three options – 1) back down, 2) mount a full invasion and occupation of Secunda in order to dismantle the datacentres, 3) blow up the datacentres with nuclear missiles (groundbursts for hardened installations, airbursts to circumvent terminal-phase shootdown by IADS). Note that option #3 is the first use of nuclear weapons. Which should Prima do?
Obviously in case B then it’s contingent on circumstance whether option #2 or option #3 is better, but if #2 is incredibly difficult or impossible (e.g. rule 2 of war) I’d choose #3. And I think option #3 is clearly the best in case A. So, proof by example: somebody is saying that, although only in restricted circumstances.
More generally, I’m pretty sure somebody did need to spell out the “even if that means war” bit, since it’s hard to get a policy implemented without talking about it and without the “even if that means war” bit this plan is useless (because if war’s not on the table, some random place like the Bahamas is going to refuse to sign on in order to reap profits). I noticed that most high Rats weren’t willing to spit it out, and I was planning to try spitting it out myself in an essay, but obviously that’s a nonissue now – and Eliezer’s got way better reach than I could dream of.
What document did the White House spokesman mean when she talked about having a “blueprint” from October?
Went looking. Pretty sure it’s this: https://www.whitehouse.gov/ostp/ai-bill-of-rights/
It has an item about “must be safe”, but it’s not considering actual rogue AI, merely “AI made wrong decisions about who is good and bad” and/or “bad people using AI to do bad things”.