Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Distillation of Neurotech and Alignment Workshop January 2023, published by lisathiergart on May 22, 2023 on LessWrong.
Disclaimer: This post is preliminary and doesn't yet fully meet the rigorous standards we typically aim for in LessWrong publications. It recaps the discussions from the neurotech for AI alignment workshop and therefore does not necessarily represent any single participant's or author's viewpoint. We are sharing it ahead of the Foresight WBE workshop to foster discussion among participants, and we welcome your input as we continue refining these insights.
Introduction
This document reviews key insights from a January workshop exploring the potential of Neurotech to contribute to the AI alignment problem. Neurotech, or neurotechnology, refers to the set of tools and methods designed to enhance understanding of the brain, manipulate its function, or interface directly with neural circuits, often for therapeutic or augmentation purposes. The AI alignment problem is the challenge of ensuring that the behavior of an artificial (general) intelligence system is in line with human values and intentions.
The workshop, attended by 16 technical researchers and domain specialists, aimed to jointly map out whether and how neurotech can be useful for alignment. Our core goal was to cast the net wide for all possible solutions, with an emphasis on those relevant to shortened alignment timelines. This initiative was a follow-up to the Paris Foresight Vision Weekend in November 2022, where it became evident that neurotech-based approaches to alignment warranted further investigation. The two-hour interactive workshop aimed to compile a comprehensive list of potential approaches and prioritize them based on technical feasibility and timeline relevance.
The workshop kicked off with a keynote presentation from David Dalrymple that laid the groundwork for participant discussion and defined key working definitions. This was followed by a brainstorming session, break-out room discussions, cluster identification and idea sorting, preliminary cluster prioritization (based on alignment impact, feasibility, timeline, and cost), and analysis of the prioritization results. David's keynote included a figure illustrating a potential breakdown of research intersecting Brain-Computer-Interface and Alignment.
Several key insights from the workshop include:
Workshop participants brainstormed together, identifying six clusters of neuro-based alignment approaches:
BCIs to extract human knowledge
Neurotech to enhance humans
Understanding human value formation
Cyborgism
Whole brain emulation
BCIs creating a reward signal
Each cluster aims to align AI with human values, but each takes a distinct approach.
These categories likely don't cover all potential approaches. In this post, we map them in the hope that future workshops can extend the analysis to other, less common approaches that could hold potential.
The workshop identified trends that agreed with commonly held assumptions. For example, participants found that complex technologies take longer to develop but have a greater potential impact on AI alignment. The relationship between cost and the other variables remains unclear. Future workshops should address cost and expected value to better guide investment in the space.
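To make the prioritization exercise concrete, here is a minimal illustrative sketch of how clusters might be scored across alignment impact, feasibility, timeline relevance, and cost. The cluster names come from the list above, but the numeric scores and the aggregation formula are hypothetical assumptions for illustration only, not results from the workshop.

```python
# Illustrative sketch only: scores and the expected-value formula are made up
# for demonstration and are NOT outputs of the workshop's prioritization.
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    impact: float       # potential contribution to alignment, 0-1
    feasibility: float  # technical feasibility, 0-1
    timeline: float     # relevance on short alignment timelines, 0-1
    cost: float         # relative cost, 0-1 (higher = more expensive)


def expected_value(c: Cluster) -> float:
    # One possible aggregation: discount impact by feasibility and
    # timeline relevance, then penalize higher-cost approaches.
    return c.impact * c.feasibility * c.timeline / (1.0 + c.cost)


clusters = [
    Cluster("BCIs to extract human knowledge",      0.6, 0.5, 0.4, 0.7),
    Cluster("Neurotech to enhance humans",          0.7, 0.4, 0.3, 0.8),
    Cluster("Understanding human value formation",  0.5, 0.6, 0.5, 0.4),
    Cluster("Cyborgism",                            0.6, 0.5, 0.6, 0.5),
    Cluster("Whole brain emulation",                0.9, 0.2, 0.2, 0.9),
    Cluster("BCIs creating a reward signal",        0.7, 0.4, 0.4, 0.6),
]

# Rank clusters by the (hypothetical) expected-value score.
for c in sorted(clusters, key=expected_value, reverse=True):
    print(f"{c.name}: {expected_value(c):.2f}")
```

A real version of this exercise would replace the placeholder scores with participant ratings and would likely need a more careful treatment of cost and expected value, which the workshop flagged as the least understood part of the prioritization.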
Neurotech hardware development faces potentially long timelines that may lag behind the advent of AGI, which presents a major risk for all neurotech-based approaches.
Neurotech for AI safety is in an early phase with high uncertainty. During this phase, exploring various potential solutions, even unlikely ones, is valuable. This can lead to crucial insights and guide more focused investments, increasing the chances of finding effective AI alignment strategies.