Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What's next for the field of Agent Foundations?, published by Nora Ammann on November 30, 2023 on LessWrong.
Alexander, Matt and I want to chat about the field of Agent Foundations (AF), where it's at and how to strengthen and grow it going forward.
We will kick off by each of us making a first message outlining some of our key beliefs and open questions at the moment. Rather than giving a comprehensive take, the idea is to pick out 1-3 things we each care about/think are important, and/or that we are confused about/would like to discuss. We may respond to some subset of the following prompts:
Where is the field of AF at in your view? How do you see the role of AF in the larger alignment landscape/with respect to making AI futures go well? Where would you like to see it go? What do you see as some of the key bottlenecks for getting there? What are some ideas you have about how we might overcome them?
Before we launch in properly, just a few things that seem worth clarifying:
By Agent Foundations, we mean roughly speaking conceptual and formal work towards understanding the foundations of agency, intelligent behavior and alignment. In particular, we mean something broader than what one might call "old-school MIRI-type Agent Foundations", typically informed by fields such as decision theory and logic.
We will not specifically be discussing the value or theory of change behind Agent Foundations research in general. We think these are important conversations to have, but in this specific dialogue, our goal is a different one, namely: assuming AF is valuable, how can we strengthen the field?
Should it look more like a normal research field?
The main question I'm interested in about agent foundations at the moment is whether it should continue in its idiosyncratic current form, or whether it should start to look more like an ordinary academic field.
I'm also interested in discussing theories of change, to the extent they have bearing on that question.
Why agent foundations?
My own reasoning for foundational work on agency being a potentially fruitful direction for alignment research is:
Most misalignment threat models are about agents pursuing goals that we'd prefer they didn't pursue (I think this is not controversial)
Existing formalisms about agency don't seem all that useful for understanding or avoiding those threats (again probably not that controversial)
Developing new and more useful ones seems tractable (this is probably more controversial)
The main reason I think it might be tractable is that so far not that many person-hours have gone into trying to do it. A priori it seems like the sort of thing you can get a nice mathematical formalism for, and so far I don't think that we've collected much evidence that you can't.
So I think I'd like to get a large number of people with various different areas of expertise thinking about it, and I'd hope that some small fraction of them discovered something fundamentally important. And a key question is whether the way the field currently works is conducive to that.
Does it need a new name?
Does Agent Foundations-in-the-broad-sense need a new name?
Is the name 'Agent Foundations' cursed?
Suggestions I've heard are: 'What are minds', 'What are agents', 'Mathematical alignment', 'Agent mechanics'.
Epistemic Pluralism and Path to Impact
Some thought snippets:
(1) Clarifying and creating common knowledge about the scope of Agent Foundations and strengthening epistemic pluralism
I think it's important for the endeavor of meaningfully improving our understanding of such fundamental phenomena as agency, intelligent behavior, etc. that one has a relatively pluralistic portfolio of angles on it. The world is very detailed, and phenomena like agency and intelligent behavior seem like particularly "messy"/detailed phenomena. Insofar ...