Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: Search versus design, published by Alex Flint on the AI Alignment Forum.
This work was supported by OAK, a monastic community in the Berkeley hills. It could not have been written without the daily love of living in this beautiful community. The work involved in writing this cannot be separated from the sitting, chanting, cooking, cleaning, crying, correcting, fundraising, listening, laughing, and teaching of the whole community.
This write-up benefited from feedback from David Kristofferson, Andrew Critch, Jason Crawford, Abram Demski, and Ben Pence. Mistakes and omissions are entirely the responsibility of the author.
How is it that we solve engineering problems? What is the nature of the design process that humans follow when building an air conditioner or computer program? How does this differ from the search processes present in machine learning and evolution?
We study search and design as distinct approaches to engineering. We argue that establishing trust in an artifact is tied to understanding how that artifact works, and that a central difference between search and design is the comprehensibility of the artifacts produced. We present a model of design as alternating phases of construction and factorization, resulting in artifacts composed of subsystems that are paired with helpful stories. We connect our ideas to the factored cognition thesis of Stuhlmüller and Christiano. We also review work in machine learning interpretability, including Chris Olah’s recent work on decomposing neural networks, Cynthia Rudin’s work on optimal simple models, and Mike Wu’s work on tree-regularized neural networks. We contrast these approaches with the joint production of artifacts and stories that we see in human design. Finally we ponder whether an AI safety research agenda could be formulated to automate design in a way that would make it competitive with search.
Introduction
Humans have been engineering artifacts for hundreds of thousands of years. Until recently, we seem to have mostly solved engineering problems using a method I will call design: understanding the materials at hand and building things up incrementally. This is the approach we use today when building bridges, web apps, sand castles, pencil sharpeners, air conditioners, and so on.
But a new and very different approach to engineering has recently come online. In this new approach, which I will call search, we specify an objective function, and then set up an optimization process to evaluate many possible artifacts, picking the one that is ranked highest by the objective function. This approach is not the automation of design; it’s internal workings are actually nothing like design, and the artifacts it produces are very unlike the artifacts produced by design.
The design approach to engineering produces artifacts that we can understand through decomposition. A car, for example, is decomposable into subsystems, each of which are further decomposable into parts, and so on down a long hierarchy. This decomposition is not at all simple, and low quality design processes can produce artifacts that are unnecessarily difficult to decompose, yet understanding how even a poorly designed car works is much easier than understanding how a biological tree works.
When we design an artifact, we seem to factorize it into comprehensible subsystems as we go. These subsystems are themselves artifacts resulting from a design process, and are constructed not just to be effective with respect to their intended purpose, but also to be comprehensible: that is, they are structured so as to permit a simple story to be told about them that helps us to understand how to work with them as building blocks in the construction of larger systems. I will call this an abstraction layer, and it consists of, so far as I can tell, an artifact to...
view more