Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Control Problem: Unsolved or Unsolvable?, published by Remmelt on June 2, 2023 on LessWrong.
tl;dr: No control method exists to safely contain the global feedback effects of self-sufficient learning machinery. What if this control problem turns out to be an unsolvable problem?
Where are we, two decades after resolving to solve a seemingly impossible problem?
If something seems impossible… well, if you study it for a year or five, it may come to seem less impossible than in the moment of your snap initial judgment.
Eliezer Yudkowsky, 2008
A list of lethalities… we are not on course to solve in practice in time on the first critical try; none of it is meant to make a much stronger claim about things that are impossible in principle.
Eliezer Yudkowsky, 2022
How do you interpret these two quotes, by a founding researcher, fourteen years apart?
A. We indeed made comprehensive progress on the AGI control problem, and now at least the overall problem does not seem impossible anymore.
B. The more we studied the overall problem, the more we uncovered complex sub-problems we'd need to solve as well, but so far can at best find partial solutions to.
Which problems involving physical/information systems were not solved after two decades?
Oh ye seekers after perpetual motion, how many vain chimeras have you pursued? Go and take your place with the alchemists.
Leonardo da Vinci, 1494
No mathematical proof or even rigorous argumentation has been published demonstrating that the A[G]I control problem may be solvable, even in principle, much less in practice.
Roman Yampolskiy, 2021
We cannot rely on the notion that, if we just try long enough, AGI safety may turn out to be possible after all.
Historically, many researchers and engineers tried to solve problems that turned out impossible:
perpetual motion machines that both conserve and disperse energy.
uniting general relativity and quantum mechanics into some local variable theory.
singular methods for 'squaring the circle', 'doubling the cube' or 'trisecting the angle'.
distributed data stores that keep data consistent in content and continuously available across a network that also tolerates partitions.
formal axiomatic systems that are consistent, complete and decidable.
Smart, creative researchers of their generation came up with idealized problems. Problems that, if solved, would transform science, if not humanity. They plowed away at these problems for decades, if not millennia – until some bright outsider proved, by deriving a contradiction between the problem's required parts, that the problem is unsolvable.
Our community is smart and creative – but we cannot just rely on our resolve to align AI. We should never forsake our epistemic rationality, no matter how much something seems the instrumentally rational thing to do.
Nor can we take comfort in the claim by a founder of this field that they still know it to be possible to control AGI to stay safe.
Thirty years into running a program to secure the foundations of mathematics, David Hilbert declared "We must know. We will know!" By then, Kurt Gödel had already constructed his first incompleteness theorem. Hilbert's declaration was later engraved on his gravestone.
Short of securing the foundations of safe AGI control – that is, through empirically-sound formal reasoning – we cannot rely on any researcher's pithy claim that "alignment is possible in principle".
Going by historical cases, this problem could turn out solvable. Just really, really hard to solve. The flying machine seemed an impossible feat of engineering. Next, controlling a rocket’s trajectory to the moon seemed impossible.
By the same reference class, ‘long-term safe AGI’ could turn out unsolvable – the perpetual motion machine of our time. It takes just one researcher to define the problem to be solved, reason from empirically sound premises, and arrive ...