EA - My model of how different AI risks fit together by Stephen Clare
Welcome to The Nonlinear Library, where we use text-to-speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My model of how different AI risks fit together, published by Stephen Clare on January 31, 2024 on The Effective Altruism Forum.

[Crossposted from my Substack, Unfolding Atlas]

How will AI get us in the end?

Maybe it'll decide we're getting in its way and take us out? It could fire all the nukes and unleash all the viruses and take us all down at once.

Or maybe we'll take ourselves out? We could lose control of powerful autonomous weapons, or allow a 21st-century Stalin to set up an impenetrable surveillance state and obliterate freedom and progress forever.

Or maybe the diffusion of AI throughout our global economy will become a quiet catastrophe? As more and more tasks are delegated to AI systems, we mere humans could be left helpless, like horses after the invention of cars. Alive, perhaps, but bewildered by a world too complex and fast-paced to be understood or controlled.

Each of these scenarios has been proposed as a way the advent of advanced AI could cause a global catastrophe. But they seem quite different, and they warrant different responses. In this post, I describe my model of how they fit together.[1]

I divide the AI development process into three steps. Risks arise at each step. Despite its simplicity, this model does a good job of pulling all these risks into one framework. It's helped me better understand how the many specific AI risks people have proposed fit together. More importantly, it satisfied my powerful, innate urge to force order onto a chaotic world.

Terrible things come in threes

The three AI development stages in my model are training, deployment, and diffusion.

At each stage, a different kind of AI risk arises. These are, respectively, misalignment, misuse, and systemic risks.

Throughout the entire process, competitive pressures act as a risk factor. More pressure makes risks throughout the process more likely.

Putting it all together looks like this:

[Diagram: training leads to misalignment risks, deployment to misuse risks, and diffusion to systemic risks, with competitive pressures acting as a risk factor at every stage.]

This model is too simple to be perfect. For one, these risks almost certainly won't arrive in sequence, as the model implies. They're also more entangled than a linear model suggests. But I think these shortcomings are relatively minor, and relating categories of risk like this gives them a pleasant cohesiveness.[2]

So let's move ahead and look closer at the risks generated at each step.

Training and misalignment risks

The first set of risks emerges immediately after training an advanced AI model. Training modern AI models usually involves feeding them enormous datasets from which they can learn patterns and make predictions when given new data. Training risks relate to a model's alignment: that is, what the system wants to do, how it plans to do it, and whether those goals and methods are good for people.

Some researchers worry that training a model to actually do what we want in all situations, including when we deploy it in the real world, is far from straightforward. In fact, we already see some kinds of alignment failures in the world today. These often seem silly. For example, Microsoft's first Bing chatbot's goal of being helpful and charming somehow made it act like an occasionally funny, occasionally frightening psychopath.

As AI systems get more powerful, though, these misalignment risks could get a whole lot scarier.
Specific risks researchers have raised include:

- Goal specification: It might be hard to tell AI systems exactly what we want them to do, especially as we delegate more complicated tasks to them. Some researchers worry that AIs trained on large amounts of data will either end up finding tricks or shortcuts that lead them to produce the wrong solutions when deployed in the real world, or that they'll face situations in the real world more extreme than or different from those they saw in the training data, and will react in unexpected, potentially dangerous, ways.
- Power...