AI #14: A Very Good Sentence

“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

That is the entire text of the one-line open letter signed this week by what one could reasonably call ‘everyone,’ including the CEOs of all three leading AI labs.

Major news outlets including CNN and The New York Times noticed, and put the focus squarely on exactly the right thing: Extinction risk. AI poses an extinction risk.

This time, when the question was asked at the White House, no one laughed.

You love to see it. It gives one hope.

Some of us are, perhaps, finally ready to admit we have a problem.

Let’s get to work.


To Predict What Happens, Ask What Happens

When predicting the conditional probability of catastrophe from loss of human control over AGI, there are many distinct cruxes. This essay does not attempt a complete case, or the most generally convincing case, or to address the most common cruxes.

Instead, these are my best guesses at potentially mind-changing, armor-piercing questions people could ask themselves if they broadly accept key concepts (that power seeking is a core existential risk, that default development paths are likely catastrophic, and that AI could defeat all of us combined), have read and thought hard about alignment difficulties, and yet think the odds of catastrophe are not so high.

In addition to this entry, I attempt an incomplete extended list of cruxes here, a taxonomy of paths through developing AGI and potentially losing control here, and a taxonomy of styles of alignment here, while leaving a taxonomy of alignment difficulties to the future or to others.

Apologies in advance if some questions seem insulting, or if you rightfully answer with ‘no, I am not making that mistake.’ I don’t know a way around that.

Here are the questions up front:

  1. What happens?
  2. To what extent will humanity seek to avoid catastrophe?
  3. How much will humans willingly give up, including control?
  4. You know people and companies and nations are dumb and make dumb mistakes constantly, and mostly take symbolic actions or gesture at things rather than act strategically, and you’ve taken that into account, right?
  5. What would count as a catastrophe?
  6. Are you consistently tracking what you mean by alignment?
  7. Would ‘human-strength’ alignment be sufficient?
  8. If we figure out how to robustly align our AGIs, will we choose to and be able to make and keep them that way? Would we keep control?
  9. How much hope is there that a misaligned AGI would choose to preserve humanity once it no longer needed us?
  10. Are you factoring in the unknown difficulties and surprises, large and small, that always arise, and in which direction do they point? Are you treating doom as only happening through specified, detailed logical paths, which, if they break down, mean it’s going to be fine?
  11. Are you properly propagating your updates, and anticipating future updates?
  12. Are you counting on in-distribution heuristics to work out of distribution?
  13. Are you using instincts and heuristics rather than looking at the mechanics, forming a model, doing the math, and using Bayes’ Rule?
  14. Is normalcy bias, hopeful thinking, avoidance of implications, or social cognition subtly influencing your analysis? Are you unconsciously modeling after media?

Types and Degrees of Alignment

What would it mean to solve the alignment problem sufficiently to avoid catastrophe? What do people even mean when they talk about alignment?

The term is not used consistently. What would we want or need it to mean? How difficult and expensive will it be to figure out alignment of different types, with different levels of reliability? To implement and maintain that alignment in a given AGI system, including its copies and successors?

The only existing commonly used terminology whose typical uses are plausibly consistent is the contrast between Inner Alignment (alignment of what the AGI inherently wants) and Outer Alignment (alignment of what the AGI provides as output). It is not clear this distinction is net useful.


Stages of Survival

This post outlines a fake framework for thinking about how we might navigate the future. I found it useful to my thinking; hopefully you will find it useful as well.


The Crux List

This is a linkpost for The Crux List. The original text is included as a backup, but it formats much better on Substack, and I haven’t yet had time to re-format it for WordPress or LessWrong.


AI #13: Potential Algorithmic Improvements

At least two potentially important algorithmic improvements had papers out this week. Both fall under ‘this is a well-known human trick, how about we use that?’ Tree of Thought is an upgrade to Chain of Thought, doing exactly what it metaphorically sounds like it would do. The other is incorporating world models into an LLM’s training, learning through interaction with a virtual world. Both claim impressive results. There seems to be a gigantic overhang of rather obvious, easy-to-implement ideas for improving performance and current capabilities, with the only limiting factor being that doing so takes a bit of time.

That’s scary. Who knows how much more is out there, or how far it can go? If it’s all about the algorithms and they’re largely open sourced, there’s no stopping it. Certainly we should be increasingly terrified of doing more and larger training runs, and perhaps terrified even without them.
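For readers who want something concrete, here is a minimal sketch of the Tree of Thought idea, not the paper’s actual implementation: rather than asking a model for one chain of reasoning, you branch into several candidate thoughts at each step, score them, and only keep expanding the most promising ones. The `generate` and `score` callables are placeholders standing in for whatever LLM prompts you would actually use.

```python
import heapq
from typing import Callable, List, Tuple

def tree_of_thought(
    problem: str,
    generate: Callable[[str], List[str]],  # proposes candidate next "thoughts" for a partial solution
    score: Callable[[str], float],         # rates how promising a partial solution looks
    beam_width: int = 3,
    max_depth: int = 4,
) -> str:
    """Search over branching chains of thought instead of committing to one.

    Chain of Thought keeps a single reasoning path; this keeps several,
    prunes the weak ones, and extends the strong ones.
    """
    frontier: List[Tuple[float, str]] = [(0.0, problem)]
    best: Tuple[float, str] = (float("-inf"), problem)

    for _ in range(max_depth):
        candidates: List[Tuple[float, str]] = []
        for _, partial in frontier:
            for thought in generate(partial):
                extended = partial + "\n" + thought
                candidates.append((score(extended), extended))
        if not candidates:
            break
        # Keep only the most promising partial solutions at each depth.
        frontier = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        best = max(best, frontier[0], key=lambda c: c[0])

    return best[1]
```

In practice, `generate` and `score` would each be an LLM call, which is part of why ideas like this sit in the ‘easy to implement, just takes a bit of time’ bucket.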

The regulation debate is in full swing. Altman and OpenAI issued a statement reiterating Altman’s congressional testimony, targeting exactly the one choke point we have available to us, which is large training runs, while warning against pulling up the ladder on the little guy. Now someone (this means you, my friend, yes you) needs to get the damn thing written.

The rhetorical discussions about existential risk also continue, despite morale somewhat improving. As the weeks go by, those trying to explain why we might all die get slowly better at navigating the rhetoric and figuring out which approaches have a chance of working on which types of people, with what background information, and in which contexts. Slowly, things are shifting in a saner direction, whether or not one thinks it might be enough. The rhetoric on the other side does not seem to be improving as quickly, which I think reflects the space being searched and also the algorithms being used to search that space.


Papers, Please #1: Various Papers on Employment, Wages and Productivity

For a while, I’ve been keeping a bookmark folder called ‘Papers, Please’ of all the papers I’d like to check out in the future. For those I do get to look at, I’ve compiled my observations, with the intent of making this another kind of roundup. I noticed a bunch of them were focused on questions of employment, wages and productivity, so it made sense to pull those out into a post, and stay on the lookout for similar groupings in the future as the section expands.


AI #12: The Quest for Sane Regulations

Regulation was the talk of the internet this week. On Capitol Hill, Sam Altman answered questions at a Senate hearing and called for national and international regulation of AI, including revocable licensing for sufficiently capable models. Over in Europe, draft regulations were offered that would, among other things, de facto ban API access and open source models, and that claim extraterritoriality.

Capabilities continue to develop at a rapid clip relative to anything else in the world, while moving at a modest pace compared to the last few months. Bard improves while still not quite being there yet, and there are a few other incremental points of progress. The biggest jump is Anthropic giving Claude a 100,000 token context window (about 75,000 words).


AI #11: In Search of a Moat

Remember the start of the week? That’s when everyone was talking about a leaked memo from a Google employee, saying that neither Google nor OpenAI had a moat and the future belonged to open source models. The author was clearly an advocate for open source in general. If he is right, we live in a highly doomed world.

The good news is that I am unconvinced by the arguments made, and believe we do not live in such a world. We do still live in more of such a world than I thought we did a few months ago, and Meta is very much not helping matters. I continue to think ‘Facebook destroys world’ might be the most embarrassing way to go. Please, not like this.

By post time, that was mostly forgotten. We were off discussing, among other things, Constitutional AI, Google’s new product announcements, and an avalanche of podcasts.


Housing and Transit Roundup #4

It’s time for another housing roundup, so I can have a place to address the recent discussions about the local impact of housing construction on housing costs.
