Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to do conceptual research: Case study interview with Caspar Oesterheld, published by Chi Nguyen on May 14, 2024 on LessWrong.
Caspar Oesterheld came up with two of the most important concepts in my field of work:
Evidential Cooperation in Large Worlds and
Safe Pareto Improvements. He also came up with a potential implementation of evidential decision theory in boundedly rational agents called
decision auctions, wrote a comprehensive
review of anthropics and how it interacts with decision theory, which most of my anthropics discussions built on, and independently decided to work on AI sometime in late 2009 or early 2010.
Needless to say, I have a lot of respect for Caspar's work. I've often felt very confused about what to do in my attempts at conceptual research, so I decided to ask Caspar how he did his research. Below is my writeup from the resulting conversation.
How Caspar came up with surrogate goals
The process
Over the course of two years, Caspar had spent the equivalent of six months full-time (FTE) thinking about a specific bargaining problem between two factions with access to powerful AI.
A lot of that time was spent on specific, somewhat narrow research projects, e.g. modelling the impact of moral advocacy in China on which bargaining problems we'll realistically encounter in the future. At the time, he thought those particular projects were important, although he maybe already had a hunch that he wouldn't think so anymore ten years down the line.
At the same time, he also spent some time on most days thinking about bargaining problems on a relatively high level, either in discussions or on walks. This made up some double-digit percentage of his time spent researching bargaining problems.
Caspar came up with the idea of surrogate goals during a conversation with Tobias Baumann. Caspar describes the conversation leading up to the surrogate goal idea as "going down the usual loops of reasoning about bargaining": you consider just building values into your AI that are strategically advantageous in bargaining, but then worry that this is just another form of aggressive bargaining.
The key insight was to go "Wait, maybe there's a way to make it not so bad for the other side." Hence, counterpart-friendly utility function modifications were born, which later on turned into surrogate goals.
Once he had the core idea of surrogate goals, he spent some time trying to figure out what the general principle behind "this one weird trick" he found was. Thus, with Vincent Conitzer as his co-author, his
SPI paper was created and he continues trying to answer this question now.
Caspar's reflections on what was important during the process
He thinks it was important to just have spent a ton of time, in his case six months FTE, on the research area. This helps with building useful heuristics.
It's hard or impossible, and probably fruitless, to just think about a research area on an extremely high level. "You have to pass the time somehow." His particular projects, for example researching moral advocacy in China, served as a way of "passing the time", so to speak.
At the same time, he thinks it is both very motivationally hard and perhaps not very sensible to work on something that's in roughly the right research area but where you really can't see a direct impact case. You can end up wasting a bunch of time grinding out technical questions that have nothing much to do with anything.
Relatedly, he thinks it was really important that he continued doing some high-level thinking about bargaining alongside his more narrow projects.
He describes a common dynamic in high-level thinking: Often you get stuck on something that's conceptually tricky and just go through the same reasoning loops over and over again, spread over days, weeks, months, or years. You usually start entering the loop because you think...