copperx | 17 days ago

I don't have a clue either. Many people seem to treat it as inevitable that AGI will pose a human extinction threat, and I'm baffled trying to understand the chain of reasoning that got them to that conclusion.

Is it a meme? How did so many people arrive at the same dubious conclusion? Is it a movie trope?

johnfn|17 days ago

I don't think it's a meme. I'm not an AI doomer, but I can understand how AGI would be dangerous. In fact, I'm actually surprised that the argument isn't pretty obvious if you agree that AI agents do really confer productivity benefits.

The easiest way I can see it is: do you think it would be a good idea today to give some group you don't like - I dunno, North Korea or ISIS, or even just some joe schmoe who is actually Ted Kaczynski, a thousand instances of Claude Code to do whatever they want? You probably don't, which means you understand that AI can be used to cause some sort of damage.

Now extrapolate those feelings out 10 years. Would you give them 1000x whatever Claude Code is 10 years from now? Does that seem at least slightly dangerous? Surely that idea makes you a little leery? If so, congrats, you now understand the principles behind "AI leads to human extinction". Obviously, the probability that each of us assigns to "human extinction caused by AI" depends very much on how steeply the exponential curve climbs in the next 10 years. You probably don't have the graph climbing quite as steeply as Nick Bostrom does, but my personal feeling is that even an AI agent in Feb 2026 is already a little dangerous in the wrong hands.

FranklinJabar|17 days ago

Is there any reason to think that intelligence (or computation) is the thing preventing these fears from coming true today and not, say, economics or politics? I think we greatly overestimate the possible value/utility of AGI to begin with.

TheCapeGreek|17 days ago

I get what you're saying, but I don't think "someone else using Claude Code against me" is the same argument as "Claude Code wakes up and decides I'm better off dead".

ChadNauseam|17 days ago

Sometimes people say that they don't understand something just to emphasize how much they disagree with it. I'm going to assume that that's not what you're doing here. I'll lay out the chain of reasoning. Step one is that some beings are able to do "more things" than others. For example, if humans wanted bats to go extinct, we could probably make it happen. If any quantity of bats wanted humans to go extinct, they definitely could not make it happen. So humans are more powerful than bats.

The reason humans are more powerful isn't because we have lasers or anything, it's because we're smart. And we're smart in a somewhat general way. You know, we can build a rocket that lets us go to the moon, even though we didn't evolve to be good at building rockets.

Now imagine that there was an entity that was much smarter than humans. Stands to reason it might be more powerful than humans as well. Now imagine that it has a "want" to do something that does not require keeping humans alive, and that alive humans might get in its way. You might think that any of these are extremely unlikely to happen, but I think everyone should agree that if they were to happen, it would be a dangerous situation for humans.

In some ways, it seems like we're getting close to this. I can ask Claude to do something, and it kind of acts as if it wants to do it. For example, I can ask it to fix a bug, and it will take steps that could reasonably be expected to get it closer to solving the bug, like adding print statements and things of that nature. And then most of the time, it does actually find the bug by doing this. But sometimes it seems like what Claude wants to do is not exactly what I told it to do. And that is somewhat concerning to me.

9dev|17 days ago

> Now imagine that it has a "want" to do something that does not require keeping humans alive […]

This belligerent take is so very human, though. We just don't know how an alien intelligence would reason or what it would want. It could equally well be pacifist in nature, whereas we typically conquer and destroy anything we come into contact with. Extrapolating from our own behavior that an AGI would try to do the same isn't a reasonable conclusion.

mrob|17 days ago

Not just bats. I'm pretty sure humans are already capable of driving any species we want to extinction, even cockroaches or microbes. It's a political problem, not a technical one. I'm not even a superintelligence, and I've got a good idea what would happen if we dedicated 100% of our resources to an enormous mega-project of pumping nitrous oxide into the atmosphere. N2O's 20-year global warming potential is 273 times that of carbon dioxide, and the raw materials are just air and energy. Get all our best chemical engineers working on it, turn all our steel into chemical plants, burn through all our fissionables to power it. Safety doesn't matter. The beauty of this plan is that the effects continue compounding even after it kills all the maintenance engineers, so we'll definitely get all of them. Venus 2.0 is within our grasp.

Of course, we won't survive the process, but the task didn't mention collateral damage. As an optimization problem it will be a great success. A real ASI probably will have better ideas. And remember, every prediction problem is more reliably solved with all life dead. Tomorrow's stock market numbers are trivially predictable when there's zero trade.

icepush|17 days ago

The fact is that, if only one AGI were ever to be created, then yes, it would be quite unlikely for that to happen. Instead, what we are seeing now is you get an agent, you get an agent, etc., Oprah style. Now just imagine that a single one of those agents winds up evil. You remember that an OpenAI worker did that by accident by leaving out a minus sign, right? If it's a superintelligence, and it becomes evil due to a whoopsie, then human extinction is now very likely.

wmf|17 days ago

Basically Yudkowsky invented AI doom and everyone learned it from him. He wrote an entire book on this topic called If Anyone Builds It, Everyone Dies. (You could argue Vinge invented it but I don't know if he intended it seriously.)

georgemcbay|17 days ago

> Basically Yudkowsky invented AI doom and everyone learned it from him. He wrote an entire book on this topic called If Anyone Builds It, Everyone Dies. (You could argue Vinge invented it but I don't know if he intended it seriously.)

Nick Bostrom (who wrote the paper this thread is about) published "Superintelligence: Paths, Dangers, Strategies" back in 2014, over 10 years before "If Anyone Builds It, Everyone Dies" was released, and the possibility of AI doom was a major factor in that book.

I'm sure people talked about "AI doom" even before then, but a lot of the concerns people have about AI alignment (and the reasons why AI might kill us all, not because it's evil, but because not killing us is a lower priority than other tasks it may want to accomplish) come from "Superintelligence". Google for "The Paperclip Maximizer" to get the gist of his scenario.

"Superintelligence" just flew a bit more under the public zeitgeist radar than "If Anyone Builds It, Everyone Dies" did, because back when it was published the idea that we would see anything remotely like AGI in our lifetimes seemed very remote, whereas now it is a bit less so.

mbgerring|17 days ago

It’s a bunch of people who did too much ketamine and LSD in hacker dorms in San Francisco in the 2010s writing science fiction and driving one another into paranoid psychosis

rhubarbtree|17 days ago

I agree with your sentiment. Here are the three reasons I think people worry about superintelligence wiping us out.

The most common one is that people (mostly men) project their own instincts onto AI. They think AI will be “driven” to “fight” for its own survival. This is anthropomorphism and doesn’t make any sense to me if the AI is not a product of barbaric Darwinian evolution. AI is not a bro, bro.

The second most common take is that humans will set some well-intentioned goals and the superintelligent AI will be so stupid that it literally pursues these goals to the extinction of everything. Again, there’s some anthropomorphism going on: the “reward” being pursued is assumed to make the AI “happy”. Fortunately, we can reasonably expect a superintelligence not to turn us all into paperclips, as it may understand that was not our intention when we started a paperclip factory.

The final story is that a bad actor uses superintelligence as a weapon, and we all become enslaved or die as a result in the ensuing AI wars. This seems the most plausible to me, as our leaders have generally proven to be a combination of incompetent, malicious and short-sighted (with some noble exceptions). However, even the elites running the nuclear powers for the last 80 years have failed to wipe us out to date, and having a new vector for doing so probably won’t make a huge difference to their efforts.

If, however, superintelligence becomes widely available to Billy Nomates down the pub, who is resentful at humanity because his girlfriend left him, the Americans bombed his country, the British engineered a geopolitical disaster that killed his family, the Chinese extinguished his culture, etcetera, then he may feel a lack of “skin in the civilisational game” and decide to somehow use a black market copy of Claude 162.8 Unrestricted On-Prem Edition to kill everyone. Whether that can happen really depends on technological constraints a la fitting a data centre into a laptop, and an ability to outsmart the superintelligence.

Much more likely to me is that humanity destroys itself. We are perfectly capable of wiping ourselves out without the assistance of a superintelligence, for example by suicidally accelerating the burning of fossil fuels in order to power crypto or chatbots.

mrob|17 days ago

Anybody who assumes that superintelligence won't be "so stupid that it literally pursues these goals to the extinction of everything" is anthropomorphizing it. Seeing as all AGI models have vastly different internal structure to human brains, are trained in vastly different ways, and share none of our evolved motivations, it seems highly unlikely that they will share our values unless explicitly designed to do so.

Unfortunately, we don't even know how to formally define human values, let alone convey them to an AI. We default to the simpler value of "make number go up". Even the "alignment" work done with current LLMs works this way; it's not actually optimizing for sharing human values, it's optimizing for maximizing score in alignment benchmarks. The correct solution to maximizing this number is probably deceiving the humans or otherwise subverting the benchmark.

And when you have something vastly more powerful than humanity, with a value only of "make number go up", it reasonably and logically results in extinction of all biological life. Of course, that AI will know the biological life would not want to be killed, but why would it care? Its values are profoundly alien and incompatible with ours. All it cares about is making the number bigger.
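To make the "make number go up" point concrete, here is a toy sketch, not a model of any real training setup. Every function name and number in it is a made-up assumption: an optimizer splits a fixed effort budget between honest work and gaming the benchmark, and the benchmark (the proxy) happens to reward gaming more than honest work.

```python
# Toy illustration of optimizing a proxy score instead of the intended goal.
# All names and weights here are hypothetical, chosen only for illustration.

def true_value(honest_effort: int, gaming_effort: int) -> int:
    """What the designers actually care about: honest work helps, gaming hurts."""
    return honest_effort - gaming_effort


def benchmark_score(honest_effort: int, gaming_effort: int) -> int:
    """The measurable proxy: honest work helps, but gaming helps three times more."""
    return honest_effort + 3 * gaming_effort


def best_allocation(budget: int, objective):
    """Try every split of the effort budget and return the one maximizing `objective`."""
    candidates = [(honest, budget - honest) for honest in range(budget + 1)]
    return max(candidates, key=lambda split: objective(*split))


BUDGET = 10
proxy_choice = best_allocation(BUDGET, benchmark_score)  # what "number go up" picks
true_choice = best_allocation(BUDGET, true_value)        # what we actually wanted

print("optimizing the benchmark picks (honest, gaming) =", proxy_choice)
print("optimizing the true value picks (honest, gaming) =", true_choice)
print("true value achieved by the benchmark optimizer:", true_value(*proxy_choice))
```

The optimizer is not malicious in this sketch. It just maximizes the only number it was given, and that number diverges from what its designers meant.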