Podbean logo
  • Discover
  • Podcast Features
    • Podcast Hosting

      Start your podcast with all the features you need.

    • Podbean AI Podbean AI

      AI-Enhanced Audio Quality and Content Generation.

    • Blog to Podcast

      Repurpose your blog into an engaging podcast.

    • Video to Podcast

      Convert YouTube playlists to podcasts, videos to audios.

  • Monetization
    • Ads Marketplace

      Join Ads Marketplace to earn through podcast sponsorships.

    • PodAds

      Manage your ads with dynamic ad insertion capability.

    • Apple Podcasts Subscriptions Integration

      Monetize with Apple Podcasts Subscriptions via Podbean.

    • Live Streaming

      Earn rewards and recurring income from Fan Club membership.

  • Podbean App
    • Podcast Studio

      Easy-to-use audio recorder app.

    • Podcast App

      The best podcast player & podcast app.

  • Help and Support
    • Help Center

      Get the answers and support you need.

    • Podbean Academy

      Resources and guides to launch, grow, and monetize podcast.

    • Podbean Blog

      Stay updated with the latest podcasting tips and trends.

    • What’s New

      Check out our newest and recently released features!

    • Podcasting Smarter

      Podcast interviews, best practices, and helpful tips.

  • Popular Topics
    • How to Start a Podcast

      The step-by-step guide to start your own podcast.

    • How to Start a Live Podcast

      Create the best live podcast and engage your audience.

    • How to Monetize a Podcast

      Tips on making the decision to monetize your podcast.

    • How to Promote Your Podcast

      The best ways to get more eyes and ears on your podcast.

    • Podcast Advertising 101

      Everything you need to know about podcast advertising.

    • Mobile Podcast Recording Guide

      The ultimate guide to recording a podcast on your phone.

    • How to Use Group Recording

      Steps to set up and use group recording in the Podbean app.

  • All Arts Business Comedy Education
  • Fiction Government Health & Fitness History Kids & Family
  • Leisure Music News Religion & Spirituality Science
  • Society & Culture Sports Technology True Crime TV & Film
  • Live
  • How to Start a Podcast
  • How to Start a Live Podcast
  • How to Monetize a podcast
  • How to Promote Your Podcast
  • How to Use Group Recording
  • Log in
  • Start your podcast for free
  • Podcasting
    • Podcast Features
      • Podcast Hosting

        Start your podcast with all the features you need.

      • Podbean AI Podbean AI

        AI-Enhanced Audio Quality and Content Generation.

      • Blog to Podcast

        Repurpose your blog into an engaging podcast.

      • Video to Podcast

        Convert YouTube playlists to podcasts, videos to audios.

    • Monetization
      • Ads Marketplace

        Join Ads Marketplace to earn through podcast sponsorships.

      • PodAds

        Manage your ads with dynamic ad insertion capability.

      • Apple Podcasts Subscriptions Integration

        Monetize with Apple Podcasts Subscriptions via Podbean.

      • Live Streaming

        Earn rewards and recurring income from Fan Club membership.

    • Podbean App
      • Podcast Studio

        Easy-to-use audio recorder app.

      • Podcast App

        The best podcast player & podcast app.

  • Advertisers
  • Enterprise
  • Pricing
  • Resources
    • Help and Support
      • Help Center

        Get the answers and support you need.

      • Podbean Academy

        Resources and guides to launch, grow, and monetize podcast.

      • Podbean Blog

        Stay updated with the latest podcasting tips and trends.

      • What’s New

        Check out our newest and recently released features!

      • Podcasting Smarter

        Podcast interviews, best practices, and helpful tips.

    • Popular Topics
      • How to Start a Podcast

        The step-by-step guide to start your own podcast.

      • How to Start a Live Podcast

        Create the best live podcast and engage your audience.

      • How to Monetize a Podcast

        Tips on making the decision to monetize your podcast.

      • How to Promote Your Podcast

        The best ways to get more eyes and ears on your podcast.

      • Podcast Advertising 101

        Everything you need to know about podcast advertising.

      • Mobile Podcast Recording Guide

        The ultimate guide to recording a podcast on your phone.

      • How to Use Group Recording

        Steps to set up and use group recording in the Podbean app.

  • Discover
  • Log in
    Sign up free
The Nonlinear Library

The Nonlinear Library

Education

AF - OthelloGPT learned a bag of heuristics by jylin04

AF - OthelloGPT learned a bag of heuristics by jylin04

2024-07-02
Download Right click and do "save link as"
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: OthelloGPT learned a bag of heuristics, published by jylin04 on July 2, 2024 on The AI Alignment Forum.
Work performed as a part of Neel Nanda's MATS 6.0 (Summer 2024) training program.
TLDR
This is an interim report on reverse-engineering Othello-GPT, an 8-layer transformer trained to take sequences of Othello moves and predict legal moves. We find evidence that Othello-GPT learns to compute the board state using many independent decision rules that are localized to small parts of the board.
Though we cannot rule out that it also learns a single succinct algorithm in addition to these rules, our best guess is that Othello-GPT's learned algorithm is just a bag of independent heuristics.
Board state reconstruction
1. Direct attribution to linear probes indicate that the internal board representation is frequently up- and down-weighted during a forward pass.
2. Case study of a decision rule:
1. MLP Neuron L1N421 represents the decision rule: If the move A4 was just played AND B4 is occupied AND C4 is occupied update B4+C4+D4 to "theirs". This rule does not generalize to translations across the board.
2. Another neuron L0377 participates in the implementation of this rule by checking if B4 is occupied, and inhibiting the activation of L1N421 if no.
Legal move prediction
1. A subset of neurons in mid to late MLP layers classify board configurations that are sufficient to make a certain move legal with an F1-score above 0.99. These neurons have high direct attribution to the logit for that move, and are causally relevant for legal move prediction.
2. Logit lens suggests that legal move predictions gradually solidify during a forward pass.
3. Some MLP neurons systematically activate at certain times in the game, regardless of the moves played so far. We hypothesize that these neurons encode heuristics about moves that are more probable in specific phases (early/mid/late) of the game.
Review of Othello-GPT
Othello-GPT is a transformer with 25M parameters trained on sequences of random legal moves in the board game Othello as inputs[1] to predict legal moves[2].
How it does this is a black box that we don't understand. Its claim to fame is that it supposedly
1. Learns an internal representation of the board state;
2. Uses it to predict legal moves
which if true, resolves the black box in two[3].
The evidence for the first claim is that linear probes work. Namely, for each square of the ground-truth game board, if we train a linear classifier to take the model's activations at layer 6 as input and predict logits for whether that square is blank, "mine" (i.e. belonging to the player whose move it currently is) or "yours", the probes work with high accuracy on games not seen in training.
The evidence for the second claim is that if we edit the residual stream until the probe's outputs change, the model's own output at the end of layer 7 becomes consistent with legal moves that are accessible from the new board state.
However, we don't yet understand what's going on in the remaining black boxes. In particular, although it would be interesting if Othello-GPT emergently learned to implement them via algorithms with relatively short description lengths, the evidence so far doesn't rule out the possibility that they could be implemented via a bag of heuristics instead.
Project goal
Our goal in this project was simply to figure out what's going on in the remaining black boxes.
1. What's going on in box #1 - how does the model compute the board representation?
1. How does the model decide if a cell is blank or not blank?
2. How does the model decide if a cell is "mine" or "yours"?
2. What's going on in box #2 - how does the model use the board representation to pick legal moves?
Results on box #1: Board reconstruction
A circuit for how the model computes if a cell is blank or not blan...
view more

More Episodes

EA - Sharing the AI Windfall: A Strategic Approach to International Benefit-Sharing by michel
2024-08-17
LW - Release: Optimal Weave (P1): A Prototype Cohabitive Game by mako yass
2024-08-17
LW - Rewilding the Gut VS the Autoimmune Epidemic by GGD
2024-08-17
LW - Rationalists are missing a core piece for agent-like structure (energy vs information overload) by tailcalled
2024-08-17
LW - Please support this blog (with money) by Elizabeth
2024-08-17
EA - Open Philanthropy is hiring people to… help hire more people! by maura
2024-08-17
EA - Report: The Broken State of Animal Advocacy in Universities by Dr Faraz Harsini
2024-08-17
LW - Principled Satisficing To Avoid Goodhart by JenniferRM
2024-08-17
EA - #196 - The edge cases of sentience and why they matter (Jonathan Birch on The 80,000 Hours Podcast) by 80000 Hours
2024-08-17
LW - How unusual is the fact that there is no AI monopoly? by Viliam
2024-08-17
AF - Calendar feature geometry in GPT-2 layer 8 residual stream SAEs by Patrick Leask
2024-08-17
EA - The Tech Industry is the Biggest Blocker to Meaningful AI Safety Regulations by Garrison
2024-08-16
LW - Investigating the Chart of the Century: Why is food so expensive? by Maxwell Tabarrok
2024-08-16
EA - This chart is right. Most interventions don't do much. (Cameroon experience) by EffectiveHelp - Cameroon
2024-08-16
LW - Demis Hassabis - Google DeepMind: The Podcast by Zach Stein-Perlman
2024-08-16
LW - Adverse Selection by Life-Saving Charities by vaishnav92
2024-08-16
EA - CEA is hiring a Head of Operations (apply by Sep 16) by JP Addison
2024-08-16
LW - Danger, AI Scientist, Danger by Zvi
2024-08-15
LW - A computational complexity argument for many worlds by jessicata
2024-08-15
EA - My article in The Nation - California's AI Safety Bill Is a Mask-Off Moment for the Industry by Garrison
2024-08-15
  • ←
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • →
012345678910111213141516171819

Get this podcast on your
phone, FREE

Download Podbean app on App Store Download Podbean app on Google Play

Create your
podcast in
minutes

  • Full-featured podcast site
  • Unlimited storage and bandwidth
  • Comprehensive podcast stats
  • Distribute to Apple Podcasts, Spotify, and more
  • Make money with your podcast
Get started

It is Free

  • Podcast Services

    • Podcast Features
    • Pricing
    • Enterprise Solution
    • Private Podcast
    • The Podcast App
    • Live Stream
    • Audio Recorder
    • Remote Recording
    • Podbean AI
  •  
    • Create a Podcast
    • Video Podcast
    • Start Podcasting
    • Start Radio Talk Show
    • Education Podcast
    • Church Podcast
    • Nonprofit Podcast
    • Get Sermons Online
    • Free Audiobooks
  • MONETIZATION & MORE

    • Podcast Advertising
    • Dynamic Ads Insertion
    • Apple Podcasts Subscriptions
    • Switch to Podbean
    • YouTube to Podcast
    • Blog to Podcast
    • Submit Your Podcast
    • Podbean Plugins
    • Developers
  • KNOWLEDGE BASE

    • How to Start a Podcast
    • How to Start a Live Podcast
    • How to Monetize a Podcast
    • How to Promote Your Podcast
    • Mobile Podcast Recording Guide
    • How to Use Group Recording
    • Podcast Advertising 101
  • Support

    • Support Center
    • What’s New
    • Free Webinars
    • Podcast Events
    • Podbean Academy
    • Podbean Amplified Podcast
    • Badges
    • Resources
  • Podbean

    • About Us
    • Podbean Blog
    • Careers
    • Press and Media
    • Green Initiative
    • Affiliate Program
    • Contact Us
  • Privacy Policy
  • Cookie Policy
  • Terms of Use
  • Consent Preferences
  • Copyright © 2015-2025 Podbean.com