Simplifying the AI Doomer argument
Unpacking Eliezer Yudkowsky's argument to provide a framework for discussion
OpenAI’s ChatGPT is the fastest-growing product of all time, getting to 100m users in a matter of months. Just as quickly, the calls to regulate artificial intelligence (AI)—or stop development altogether—have reached a fever pitch. Sundar Pichai, the CEO of Google, called for regulation on 60 Minutes, and Elon Musk signed a letter with a whole host of people calling for a pause on AI development for 6 months.
Why are so many technologists calling for us to slow down on AI?
My general frame for thinking about technological progress is permissionless innovation. Let the innovators innovate, let capital flow to the most promising projects, and let consumers and businesses decide which products they think add value. Let them be and order will emerge.
I also distrust grandiose technology predictions. Technology is a tool, not a product. How a tool will be used is rarely obvious a priori; the answer only comes through experimentation. When we try to predict the future, we’re often hilariously wrong about the downsides and miss the disruptive potential. Railroads didn’t cause cows to miscarry (but we did get the rise of the modern city), and video games didn’t make people more violent (but we did get GPUs, which turned out to be critical for AI). We should be humble in our prognostications.
To be fair to the AI Safety crowd and the Doomers, AI might be different in kind from previous technologies, as I noted in a piece a few weeks ago. Technologies like writing, the printing press, and the internet allowed humans to extract knowledge from our brains and make it extant in the physical world. We have created insane amounts of data, but the only way to create intelligence was to have a baby. Until now.
The journey towards artificial intelligence began with the first computers. So far, they have been pretty dumb compared to humans. Sure, they could do calculations way faster than us, but they couldn’t think like us and were clunky (at best) with the primary substrate of human thought: language. Now that we can create artificial intelligences, we can make a lot of them. Information is already abundant; what happens when intelligence is abundant too?
I get why people have trepidations. But it’s one thing to slow down AI; it’s another to stop it entirely.
I was shocked to read an article in Time by Eliezer Yudkowsky which called for an immediate and indefinite end to AI development until we can figure out how to align AI with humanity. How might we shut AI down? Eliezer is refreshingly honest about his stark solution: not only should we ban the training of models more powerful than GPT-4, we should restrict and regulate the GPUs on which those models are trained, we should build an international coalition, and any country or actor who violates those conventions should be attacked. In other words, we should treat any actor building AI the way we treat Iran when it refines fissile material: as a rogue actor and a threat to humanity.
Wowza! Extraordinary claims require extraordinary evidence. To better understand Eliezer’s reasoning, I decided to listen to an excellent podcast (The Lunar Society, by Dwarkesh Patel) where Eliezer spends 4 hours (!) discussing his position, which unfortunately often boils down to “I am very smart, I have been thinking about AI alignment for 20 years and I haven’t figured it out. What I have figured out is that all known roads lead to the elimination of humankind by AIs, so we have to stop development.” Not a very satisfying argument, to say the least.
Given that he seems like a smart guy, I thought perhaps he had elucidated his argument in writing. There are two links from the Time article. He warns us at the beginning of the longer, more rambling one that I wasn’t going to find what I was looking for:
I have several times failed to write up a well-organized list of reasons why AGI will kill you… Having failed to solve this problem in any good way, I now give up and solve it poorly with a poorly organized list of individual rants. I'm not particularly happy with this list; the alternative was publishing nothing, and publishing this seems marginally more dignified.
Since I couldn’t find a clear, distilled version of his argument from him, I decided to try my own hand at it:
Proposition 1: Our current AI research path will lead to artificial general intelligence (AGI).
Proposition 2: Unless we specifically design AGI to be aligned, it will be unaligned.
Proposition 3: We don’t know how to align AI.
Proposition 4: Unaligned AGI will destroy humanity with 100% probability.
Ergo 1: We need to stop development on AI immediately.
Ergo 2: We need to put all of our resources into aligning AI if we want to restart development.
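To make the logical skeleton explicit, here is the distilled argument written as a single conditional. This is my own shorthand, not anything Eliezer has written: the “Ergo” steps only follow if all four propositions hold, so rejecting any one of them is enough to resist the conclusion.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% My shorthand, not Eliezer's: $P_1$--$P_4$ are the four propositions above.
% The conclusion follows only if every premise holds, so rejecting any single
% premise is enough to block the "Ergo" steps.
\[
  P_1 \land P_2 \land P_3 \land P_4
  \;\Longrightarrow\;
  \text{halt AI development and redirect resources to alignment}
\]
\end{document}
```

Keep that conjunctive structure in mind for Proposition 4 below, where Eliezer’s method is essentially to knock down every proposed escape from these premises.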
Let’s take the propositions one-by-one.
I agree with Eliezer on Proposition 1, that we are careening towards AGI, for two reasons, one technical and one philosophical.
On the technical side, it seems the transformer architecture underlying LLMs is a huge leap forward. We’ve had a number of significant advances in AI in recent decades, like convolutional neural nets (CNNs) and generative adversarial networks (GANs). We’ve pushed those techniques to solve big problems, like protein folding and beating humans at Go. But few worried about a CNN surpassing human intelligence last year.
In comes OpenAI. GPT-2, released at the end of 2019, though impressive, was a novelty for nerds. When ChatGPT (running GPT-3.5) was released to the public in November 2022, it caused a sensation because it worked so well. Then GPT-4, released just a few months later, was a noticeable improvement over 3.5. There’s good reason to believe that GPT-5 will be a significant improvement over its predecessor, and so on, suggesting a steep slope of improvement. Everyone is also joining the race, including Google, Elon Musk, and lots of startups. Tech, science, capital, and talent are flocking to AI, so expect to see some fireworks over the next few years.
Second, and I think more importantly, LLMs understand language. Yeah, we’ve had AI techniques that can do decent translations from one language to another, deliver killer product recommendations, and get you addicted to TikToks, but never before have we had a computer we can actually talk to. Descartes said “I think, therefore I am,” but he could just as well have said “I speak, therefore I am,” since all human thought is communicated to others via language. (Relatedly, there is no private language.) Once computers start conversing with us, thinking isn’t far behind.
The timeline for Proposition 1 matters: will GPT-5 be sentient, or will it take until GPT-25, a decade from now?
If we’ve got a decade, it would be foolish to stop now, since we would leave an enormous amount of value on the table. Startups are popping up left and right to take advantage of this new capability of computers, with huge potential to make our lives much better. If AGI isn’t going to happen for a decade, let’s keep improving the underlying systems. An interesting counter from Eliezer is that we must stop now because it’s impossible to know when an AI becomes superintelligent; a superintelligent being might know how to shroud its intelligence and fool the wetware in our brains.
Proposition 1 is a doozy.
Proposition 2 says that AI will be unaligned unless we design it to be aligned. Whoa there, buddy! Now we have to define “alignment.” Much ink has been spilled on alignment, perhaps too much, like this wordy tweet from Eliezer that doesn’t provide a simple definition. ChatGPT, however, was happy to write one when I asked.
I suppose we don’t want the AI to become evil like in the movies and kill us all. But how would we know? “Aligned with human values” is a naïve definition. Defining shared human values is quite difficult, even for moral philosophers. Plus, we are talking here about a superintelligence, not a human being whose motivations and ethics we can assess in the traditional way.
Proposition 2 also requires that the AI is designed to be aligned. I have a negative gut reaction to the idea that alignment needs to be built into the system. Why couldn’t it be an emergent property?
Proposition 3 is certainly true: we do not know how to align an AGI with humanity, and we don’t even know how to test whether an AGI is aligned. These seem like extremely valuable areas to research.
Proposition 4 is that unaligned AGI will destroy humanity with 100% probability. Why is Eliezer so confident? From the podcast:
“From my perspective, I’m not making any brilliant startling predictions, I’m poking at other people’s incorrectly narrow theories until they fall apart into the maximum entropy state of doom.”
In other words, the way he comes to 100% probability is that he disproves other theories of an aligned AI, and because he’s able to strike them all down, he infers that doom is inevitable. Why doesn’t he think that someone might come along with a new theory?
There are examples of aligned AGIs in fiction. One is Iain M. Banks’ Culture series, a set of sci-fi novels set in the far future where humanity is taken care of by god-like AIs called “Minds.” These AIs have access to near-infinite energy, can create virtually any material, can build anything, and are super, super intelligent. And they are moral agents: a key tenet of the Minds is that they hold sentient life sacred and see it as their mission to preserve other sentient beings, like the humans they take care of. Did the members of the Culture program their AIs to be aligned, or did they emerge aligned?
There’s still a lot to unpack, but I do think I captured the core argument accurately and was able to give you a bit of color on each of the major propositions.
At the end of the day, Eliezer is extremely thorough and extremely verbose, but fails at the basic requirement of presenting a simple, straightforward argument when asking society to restructure around his thesis. What I tried to do above is simplify his argument into something that will guide my further research into AGI and alignment and form my own thesis.
There’s a legitimate fear that the culmination of centuries of science will result in intelligence greater than any human and greater than all humans—a form of alien intelligence that we humans unleash on this world with dire consequences.
And yet, I’m not a Doomer. I’m still very hopeful and believe that we should both push full-steam ahead and start asking some really tough philosophical questions about AI.
I hope that someone who understands Eliezer (or maybe the man himself) reads this and critiques my distillation.
p.s., there’s a whole ‘nother post to write about the AI safety people. I still can’t get over how bone-headed this week’s All-In discussion was on inviting the government into the approval process for AI, based on Chamath’s tweet.