Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Adumbrations on AGI from an outsider, published by nicholashalden on May 24, 2023 on LessWrong.
Preamble
A lot of people have written against AI Doom, but I thought it might be interesting to give my account as an outsider encountering these arguments. Even if I don’t end up convincing people who have made AI alignment central to their careers and lives, maybe I’ll at least help some of them understand why the general public, and specifically the group of intelligent people who encounter their arguments, is generally not persuaded by their material. There may be inaccuracies in my account of the AI Doom argument, but this is how I think it’s generally understood by the average intelligent non-expert reader.
I started taking AI alignment arguments seriously when GPT-3 and GPT-4 came out and started producing amazing results on standardized testing and writing tasks. I am not an ML engineer, do not know much about programming, and am not part of the rationalist community that has been structured around caring deeply about AI risk for the last fifteen years. It may be of interest that I am a professional forecaster, but of financial asset prices, not of geopolitical events or the success of nascent technologies. My knowledge of the arguments comes mostly from reading LessWrong, ACX, and other online articles; specifically, I’m responding to Eliezer’s argument as detailed in the pages on Orthogonality, Instrumental Convergence, and List of Lethalities (plus the recent Time article).
I. AI doom is unlikely, and it’s weird to me that clearly brilliant people think it’s >90% likely
I agree with the following points:
An AI can probably get much smarter than a human, and it’s only a matter of time before it does
Something being very smart doesn’t make it nice (orthogonality, I think)
A superintelligence doesn’t need to hate you to kill you; any kind of thing-maximizer might end up turning the atoms you’re made of into that thing without specifically wanting to destroy you (instrumental convergence, I think)
Computers hooked up to the internet have plenty of real-world capability via sending emails/crypto/bank account hacking/every other modern cyber convenience.
The argument then goes on to say that, if you take a superintelligence and tell it to build paperclips, it’s going to tile the universe with paperclips, killing everyone in the process (oversimplified). Since the people who use AI are obviously going to tell it to do stuff–we already do that with GPT-4–as soon as it gains superintelligence capabilities, our goose is collectively cooked. There is a separate but related argument, that a superintelligence would learn to self-modify, and instead of building the paperclips we asked it to, turn everything into GPUs so it can maximize some kind of reward counter. Both of these seem wrong to me.
The first argument–paperclip maximizing–is coherent in that it treats the AGI’s goal as fixed and given by a human (Paperclip Corp, in this case). But if that’s true, alignment is trivial, because the human can just give it a more sensible goal: something like “make as many paperclips as you can without decreasing any human’s existence or quality of life by their own lights”, or better yet something more complicated that gets us to a utopia before any paperclips are made. We can argue over the hidden complexity of wishes, but it’s very obvious that there’s at least a good chance the populace would survive, so long as humans are the ones giving the AGI its goal. And there’s a very good chance the first AGI-wishers will be people who care about AI safety, and not some random guy who wants to make a few million by selling paperclips.
At this point, the AGI-risk argument responds by saying, well, paperclip-maximizing is just a toy thought experiment for people to understand. In fac...