The Nonlinear Library: EA Forum
Education
EA - On the future of language models by Owen Cotton-Barratt
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On the future of language models, published by Owen Cotton-Barratt on December 21, 2023 on The Effective Altruism Forum.

1. Introduction

1.1 Summary of key claims

- Even without further breakthroughs in AI, language models will have big impacts in the coming years, as people start sorting out proper applications
  - The early important applications will be automation of expert advisors, management, and perhaps software development
  - The more transformative but harder prizes are automation of research and automation of executive capacity
- In their most straightforward form ("foundation models"), language models are a technology which naturally scales to something in the vicinity of human-level (because it's about emulating human outputs), not one that naturally shoots way past human-level performance
  - i.e. it is a mistake-in-principle to imagine projecting out the GPT-2 to GPT-3 to GPT-4 capability trend into the far-superhuman range
  - Although they're likely to be augmented by things which accelerate progress, this still increases the likelihood of a relatively slow takeoff: several years (rather than weeks or months) of transformative growth before truly wild things are happening seems plausible
    - NB a version of "speed superintelligence" could still be transformative even while performance on individual tasks is still firmly human-level
- There are two main techniques which can be used (probably in conjunction) to get language models to do more powerful things than foundation models are capable of:
  - Scaffolding: structured systems to provide appropriate prompts, including as a function of previous answers
  - Finetuning: altering model weights to select for performance on a particular task
  - Each of these techniques has a path to potentially scale to strong superintelligence; alternatively language models might at some point be obsoleted by another form of AI
    - Timelines for any of these things seem pretty unclear
- From a safety perspective, language model agents whose agency comes from scaffolding look greatly superior to ones whose agency comes from finetuning
  - Because you can get an extremely high degree of transparency by construction
  - Finetuning is more likely an important tool for instilling virtues (e.g. honesty) in systems
  - Sutton's Bitter Lesson raises questions for this strategy, but needn't mean it's doomed to be outcompeted
- On the likely development trajectory there are a number of distinct existential risks
  - e.g. guarding against takeover from early language model agents is pretty different from differential technological development to ensure that we automate safety-enhancing research before risk-increasing research
- The current portfolio of work on AI risk is over-indexed on work which treats "transformative AI" as a black box and tries to plan around that. I think that we can and should be peering inside that box (and this may involve plans targeted at more specific risks).

1.2 Meta

We know that AI is likely to be a very transformative technology. But a lot of the analysis of this point treats something like "AGI" as a black box, without thinking too much about the underlying tech which gets there. I think that's a useful mode, but it's also helpful to look at specific forms of AI technology and ask where they're going and what the implications are.

This doc does that for language models. It's a guide for thinking about them from various angles with an eye to what the strategic implications might be. Basically I've tried to write the thing I wish I'd read a couple of years ago; I'm sharing now in case it's helpful for others.

The epistemic status of this is "I thought pretty hard about this and these are my takes"; I'm sure there are still holes in my thinking (NB I don't actually do direct work with language models), and I'd appreciate pushback; but I'm also pretty sure I'm ...