Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What I Would Do If I Were Working On AI Governance, published by johnswentworth on December 8, 2023 on LessWrong.
I don't work in AI governance, and am unlikely to do so in the future. But various anecdotes and, especially,
Akash's recent discussion leave me with the impression that few-if-any people are doing the sort of things which I would consider sensible starting points, and instead most people are mostly doing things which do not seem-to-me to address any important bottleneck to useful AI governance.
So this post lays out the places I would start, if I were working on AI governance, and some of the reasoning behind them.
No doubt I am missing lots of important things! Perhaps this post will nonetheless prove useful to others working in AI governance, perhaps
Cunningham's Law will result in me learning useful things as a result of this post, perhaps both. I expect that the specific suggestions in this post are more likely to be flawed than the style of reasoning behind them, and I therefore recommend paying more attention to the reasoning than the specific suggestions.
This post will be mostly US-focused, because that is what I know best and where all the major AI companies are, but presumably versions of the interventions discussed could also carry over to other polities.
Liability
One major area I'd focus on is making companies which build AI liable for the damages caused by that AI, both de facto and de jure.
Why Liability?
The vague goal here is to get companies which build AI to:
Design systems, from the start, which will very robustly not cause problems.
Invest resources in red-teaming, discovering new failure-modes before they come up in production, etc.
Actually not deploy systems which raise red flags, even when the company has invested heavily in building those systems.
In general, act as though the company will take losses from damages caused by their AI, not just capture profits from the benefits caused by their AI.
… and one natural way to do that is to ensure that companies do, in fact, take losses from damages caused by their AI, not just capture profits from the benefits caused by their AI. That's liability in a nutshell.
Now, realistically, this is not going to extend all the way to e.g.
making companies buy extinction insurance. So why do realistic levels of liability matter for extinction risk? Because they incentivize companies to put in place safety processes with any actual teeth at all.
For instance: right now, lots of people are working on e.g. safety evals. My very strong expectation is that, if and when those evals throw red flags, the major labs will respond by some combination of (1) having some meetings where people talk about safety a bunch, (2) fine-tuning until the red flags are no longer thrown (in a way which will obviously not robustly remove the underlying problems), and then (3) deploying it anyway, under heavy pressure from the CEO of Google/Microsoft/Amazon and/or Sam Altman.
On the other hand, if an AI company has already been hit with lots of expensive lawsuits for problems caused by their AI, then I expect them to end up with a process which will test new models in various ways, and then actually not deploy them if red flags come up. They will have already done the "fine tune until red light stops flashing" thing a few times, and paid for it when their fine tuning failed to actually remove problems in deployment.
Another way to put it: liability forces a company to handle the sort of
organizational problems which are a central bottleneck to making any sort of AI safety governance basically-real, rather than basically-fake. It forces companies to build the organizational infrastructure and processes needed for safety mechanisms with teeth.
For a great case study of how liability solved a similar problem in another area, check out Jason Crawford's account of how factories were made safe.