Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio.
This is: How do we prepare for final crunch time?, published by Eli Tyre on the AI Alignment Forum.
[Crossposted from Musings and Rough Drafts.]
Epistemic status: Brainstorming and first draft thoughts.
[Inspired by something that Ruby Bloom wrote and the Paul Christiano episode of the 80,000 Hours podcast.]
One claim I sometimes hear about AI alignment [paraphrase]:
It is really hard to know what sorts of AI alignment work are good this far out from transformative AI. As we get closer, we’ll have a clearer sense of what AGI / Transformative AI is likely to actually look like, and we’ll have much better traction on what kind of alignment work to do. In fact, MOST of the work of AI alignment is done in the final few years (or months) before AGI, when we’ve already solved most of the hard capabilities problems, so we know what AGI will look like and can work directly, with good feedback loops, on the sorts of systems that we want to align.
Usually, this is said to argue that the value of the alignment research being done today is primarily that of enabling future, more critical alignment work. But “progress in the field” is only one dimension to consider in boosting and unblocking the work of alignment researchers in this last stretch.
In this post I want to take the above claim seriously and consider the implications. If most of the alignment work is going to be done in the final few years before the deadline, our job in 2021 is mostly to do everything we can to enable the people working on the problem in that crucial period (which might be us, our successors, or both), so that they are as well equipped as we can possibly make them.
What are all the ways we can think of to prepare now for our eventual final exam? What should we be investing in to improve our efficacy in those final, crucial years?
The following are some ideas.
[In this post, I'm going to refer to this last stretch of a few months to a few years as "final crunch time", as distinct from just "crunch time", i.e., this century.]
Access
For this to matter, our alignment researchers need to be at the cutting edge of AI capabilities, and they need to be positioned such that their work can actually be incorporated into AI systems as they are deployed.
A different kind of work
Most current AI alignment work is pretty abstract and theoretical, for two reasons.
The first reason is a philosophical / methodological claim: there’s a fundamental “nearest unblocked strategy” / overfitting problem. Patches that correct clear and obvious alignment failures are unlikely to generalize fully; they'll only constrain unaligned optimization to channels that you can’t recognize. For this reason, some claim, we need an extremely robust, theoretical understanding of intelligence and alignment, ideally at the level of proofs.
The second reason is a practical consideration: we just don’t have powerful AI systems to work with, so there isn’t much that can be done in the way of tinkering and getting feedback.
That second objection becomes less relevant in final crunch time: in this scenario, we’ll have powerful systems that 1) will be built along the same lines as the systems it is crucial to align, and 2) will have enough intellectual capability to pose at least semi-realistic “creative” alignment failures (i.e., current systems are so dumb, and live in such constrained environments, that it isn’t clear how much we can learn about aligning literal superintelligences from them).
And even if the first objection ultimately holds, theoretical understanding often (usually?) follows from practical engineering proficiency. It seems like it might be a fruitful path to tinker with semi-powerful systems: trying out different alignment approaches empirically, and experimenting to discover new approaches.