Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Control Problem: Unsolved or Unsolvable?, published by Remmelt on June 2, 2023 on LessWrong.
tl;dr: No control method exists to safely contain the global feedback effects of self-sufficient learning machinery. What if this control problem turns out to be an unsolvable problem?
Where are we, two decades after resolving to solve a seemingly impossible problem?
If something seems impossible… well, if you study it for a year or five, it may come to seem less impossible than in the moment of your snap initial judgment.
Eliezer Yudkowsky, 2008
A list of lethalities… we are not on course to solve in practice in time on the first critical try; none of it is meant to make a much stronger claim about things that are impossible in principle.
Eliezer Yudkowsky, 2022
How do you interpret these two quotes, by a founding researcher, fourteen years apart?
A. We indeed made comprehensive progress on the AGI control problem, and now at least the overall problem does not seem impossible anymore.
B. The more we studied the overall problem, the more we uncovered complex sub-problems we'd need to solve as well, but so far can at best find partial solutions to.
Which problems involving physical/information systems were not solved after two decades?
Oh ye seekers after perpetual motion, how many vain chimeras have you pursued? Go and take your place with the alchemists.
Leonardo da Vinci, 1494
No mathematical proof or even rigorous argumentation has been published demonstrating that the A[G]I control problem may be solvable, even in principle, much less in practice.
Roman Yampolskiy, 2021
We cannot rely on the notion that, if we just try long enough, AGI safety may turn out to be possible after all.
Historically, many researchers and engineers tried to solve problems that turned out impossible:
perpetual motion machines that both conserve and disperse energy.
uniting general relativity and quantum mechanics into some local variable theory.
singular methods for 'squaring the circle', 'doubling the cube' or 'trisecting the angle'.
distributed data stores that keep data consistent in content and continuously available across a network that also tolerates partitions.
formal axiomatic systems that are consistent, complete and decidable.
Smart, creative researchers of their generation came up with idealized problems. Problems that, if solved, would transform science, if not humanity. They plowed away at these problems for decades, if not millennia – until some bright outsider proved, by deriving a contradiction between the problem's required parts, that the problem is unsolvable.
Our community is smart and creative – but we cannot just rely on our resolve to align AI. We should never forsake our epistemic rationality, no matter how much something seems the instrumentally rational thing to do.
Nor can we take comfort in the claim by a founder of this field that they still know it to be possible to control AGI to stay safe.
Thirty years into running a program to secure the foundations of mathematics, David Hilbert declared "We must know. We will know!" By then, Kurt Gödel had already constructed his first incompleteness theorem. Hilbert's declaration was later engraved on his gravestone.
Short of securing the foundations of safe AGI control – that is, through empirically-sound formal reasoning – we cannot rely on any researcher's pithy claim that "alignment is possible in principle".
Going by historical cases, this problem could turn out solvable. Just really, really hard to solve. The flying machine seemed an impossible feat of engineering. Next, controlling a rocket’s trajectory to the moon seemed impossible.
By the same reference class, ‘long-term safe AGI’ could turn out unsolvable – the perpetual motion machine of our time. It takes just one researcher to define the problem to be solved, reason from empirically sound premises, and arrive ...