Podbean logo
  • Discover
  • Podcast Features
    • Podcast Hosting

      Start your podcast with all the features you need.

    • Podbean AI Podbean AI

      AI-Enhanced Audio Quality and Content Generation.

    • Blog to Podcast

      Repurpose your blog into an engaging podcast.

    • Video to Podcast

      Convert YouTube playlists to podcasts, videos to audios.

  • Monetization
    • Ads Marketplace

      Join Ads Marketplace to earn through podcast sponsorships.

    • PodAds

      Manage your ads with dynamic ad insertion capability.

    • Apple Podcasts Subscriptions Integration

      Monetize with Apple Podcasts Subscriptions via Podbean.

    • Live Streaming

      Earn rewards and recurring income from Fan Club membership.

  • Podbean App
    • Podcast Studio

      Easy-to-use audio recorder app.

    • Podcast App

      The best podcast player & podcast app.

  • Help and Support
    • Help Center

      Get the answers and support you need.

    • Podbean Academy

      Resources and guides to launch, grow, and monetize podcast.

    • Podbean Blog

      Stay updated with the latest podcasting tips and trends.

    • What’s New

      Check out our newest and recently released features!

    • Podcasting Smarter

      Podcast interviews, best practices, and helpful tips.

  • Popular Topics
    • How to Start a Podcast

      The step-by-step guide to start your own podcast.

    • How to Start a Live Podcast

      Create the best live podcast and engage your audience.

    • How to Monetize a Podcast

      Tips on making the decision to monetize your podcast.

    • How to Promote Your Podcast

      The best ways to get more eyes and ears on your podcast.

    • Podcast Advertising 101

      Everything you need to know about podcast advertising.

    • Mobile Podcast Recording Guide

      The ultimate guide to recording a podcast on your phone.

    • How to Use Group Recording

      Steps to set up and use group recording in the Podbean app.

  • All Arts Business Comedy Education
  • Fiction Government Health & Fitness History Kids & Family
  • Leisure Music News Religion & Spirituality Science
  • Society & Culture Sports Technology True Crime TV & Film
  • Live
  • How to Start a Podcast
  • How to Start a Live Podcast
  • How to Monetize a podcast
  • How to Promote Your Podcast
  • How to Use Group Recording
  • Log in
  • Start your podcast for free
  • Podcasting
    • Podcast Features
      • Podcast Hosting

        Start your podcast with all the features you need.

      • Podbean AI Podbean AI

        AI-Enhanced Audio Quality and Content Generation.

      • Blog to Podcast

        Repurpose your blog into an engaging podcast.

      • Video to Podcast

        Convert YouTube playlists to podcasts, videos to audios.

    • Monetization
      • Ads Marketplace

        Join Ads Marketplace to earn through podcast sponsorships.

      • PodAds

        Manage your ads with dynamic ad insertion capability.

      • Apple Podcasts Subscriptions Integration

        Monetize with Apple Podcasts Subscriptions via Podbean.

      • Live Streaming

        Earn rewards and recurring income from Fan Club membership.

    • Podbean App
      • Podcast Studio

        Easy-to-use audio recorder app.

      • Podcast App

        The best podcast player & podcast app.

  • Advertisers
  • Enterprise
  • Pricing
  • Resources
    • Help and Support
      • Help Center

        Get the answers and support you need.

      • Podbean Academy

        Resources and guides to launch, grow, and monetize podcast.

      • Podbean Blog

        Stay updated with the latest podcasting tips and trends.

      • What’s New

        Check out our newest and recently released features!

      • Podcasting Smarter

        Podcast interviews, best practices, and helpful tips.

    • Popular Topics
      • How to Start a Podcast

        The step-by-step guide to start your own podcast.

      • How to Start a Live Podcast

        Create the best live podcast and engage your audience.

      • How to Monetize a Podcast

        Tips on making the decision to monetize your podcast.

      • How to Promote Your Podcast

        The best ways to get more eyes and ears on your podcast.

      • Podcast Advertising 101

        Everything you need to know about podcast advertising.

      • Mobile Podcast Recording Guide

        The ultimate guide to recording a podcast on your phone.

      • How to Use Group Recording

        Steps to set up and use group recording in the Podbean app.

  • Discover
  • Log in
    Sign up free
The Nonlinear Library: Alignment Forum

The Nonlinear Library: Alignment Forum

Education

AF - Access to powerful AI might make computer security radically easier by Buck Shlegeris

AF - Access to powerful AI might make computer security radically easier by Buck Shlegeris

2024-06-08
Download Right click and do "save link as"
Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Access to powerful AI might make computer security radically easier, published by Buck Shlegeris on June 8, 2024 on The AI Alignment Forum.
People talk about model weight security being really hard and crucial around the advent of AGI. (E.g.
RAND report,
Leopold; see
here for some distinctions in these thread models that I think are important.) But I think that the thinking on this has not been sufficiently attentive to the fact that during that crucial time period, by assumption we'll have access to powerful AIs. I think that such access might make security wildly easier, by a bunch of different mechanisms, some of which I'll describe in this post.
The story I'm telling here is pretty different from the main story I've heard people talk about in the past about AIs helping with computer security, which is that AIs can help with hardening software and other infrastructure.
Though I agree that that seems really useful and important, here I'm not talking about AIs automating the process of security research, I'm talking about AI making it affordable to adopt policies that improve security in the organization but that wouldn't work (because they'd be too labor-intensive or lead to who-guards-the-guards problems) if you just had humans.
In this post I'm going to be ignoring problems related to AIs themselves trying to cause security failures. There are two ways that such AIs affect this situation:
The techniques I describe in this post could also be used, AI-control style, to use
trusted models to reduce the risk of security failures caused by untrusted models. You can basically just replace "humans" by "untrusted models" everywhere in the text below.
When applying these techniques to reduce risk from humans, you have to worry that these techniques will be sabotaged by scheming models. I'll ignore that possibility in this post, and talk about ways that you can improve security using AIs that are assumed not to be intentionally trying to subvert these security measures (which we call "
trusted"). If you're worried about your models scheming, you'll have to do some other stuff too (mostly various sting operations), but I think the arguments in this post still mostly hold up. I'll talk about this more in future.
I'm going to focus on insider threats here; most threats from outside humans involve first gaining access to impersonation of some insider.
Much of this post came out of thinking about the question "how could
AI control techniques be applied to humans, instead of AIs"?
I don't think these arguments are an airtight case for "SL5 security will be trivial". But they do seem like a plausible hypothesis under which infosec will end up being quite simple to resolve, if labs decide it's an important priority.
Four strategies for using powerful AI to improve security
Monitoring
As your humans take actions, you configure your system to have an AI that looks at these actions and then flags them for further review if they seem suspicious. If the actions seem potentially catastrophic, then the AI might even flag them before they can run.
Concretely, I'm imagining that instead of being allowed to directly run any command you write in the bash console of the computer you're working on, whenever you write a command, it and the surrounding context are shown to an AI, which is charged with deciding how suspicious the action is. If it is very suspicious, the AI blocks the command and sends an alert to someone else at the company who is supposed to look it over.
If the action is just somewhat suspicious, then it flags it for further review but does not block the action or alert the user.
This is somewhat similar to using
multi-party authorization. But with abundant AI labor, we could be more comprehensive, with fewer false positives and fewer false negatives, and evaluating a much grea...
view more

More Episodes

AF - Self-Other Overlap: A Neglected Approach to AI Alignment by Marc Carauleanu
2024-07-30
AF - Investigating the Ability of LLMs to Recognize Their Own Writing by Christopher Ackerman
2024-07-30
AF - Can Generalized Adversarial Testing Enable More Rigorous LLM Safety Evals? by Stephen Casper
2024-07-30
AF - AXRP Episode 34 - AI Evaluations with Beth Barnes by DanielFilan
2024-07-28
AF - A Solomonoff Inductor Walks Into a Bar: Schelling Points for Communication by johnswentworth
2024-07-26
AF - Pacing Outside the Box: RNNs Learn to Plan in Sokoban by Adrià Garriga-Alonso
2024-07-25
AF - Does robustness improve with scale? by ChengCheng
2024-07-25
AF - AI Constitutions are a tool to reduce societal scale risk by Samuel Dylan Martin
2024-07-25
AF - A framework for thinking about AI power-seeking by Joe Carlsmith
2024-07-24
AF - ML Safety Research Advice - GabeM by Gabe M
2024-07-23
AF - Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities by Axel Højmark
2024-07-22
AF - Auto-Enhance: Developing a meta-benchmark to measure LLM agents' ability to improve other agents by Sam Brown
2024-07-22
AF - Coalitional agency by Richard Ngo
2024-07-22
AF - aimless ace analyzes active amateur: a micro-aaaaalignment proposal by Luke H Miles
2024-07-21
AF - A more systematic case for inner misalignment by Richard Ngo
2024-07-20
AF - BatchTopK: A Simple Improvement for TopK-SAEs by Bart Bussmann
2024-07-20
AF - Feature Targeted LLC Estimation Distinguishes SAE Features from Random Directions by Lidor Banuel Dabbah
2024-07-19
AF - Truth is Universal: Robust Detection of Lies in LLMs by Lennart Buerger
2024-07-19
AF - JumpReLU SAEs + Early Access to Gemma 2 SAEs by Neel Nanda
2024-07-19
AF - A List of 45+ Mech Interp Project Ideas from Apollo Research's Interpretability Team by Lee Sharkey
2024-07-18
  • ←
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • →
012345678910111213141516171819

Get this podcast on your
phone, FREE

Download Podbean app on App Store Download Podbean app on Google Play

Create your
podcast in
minutes

  • Full-featured podcast site
  • Unlimited storage and bandwidth
  • Comprehensive podcast stats
  • Distribute to Apple Podcasts, Spotify, and more
  • Make money with your podcast
Get started

It is Free

  • Podcast Services

    • Podcast Features
    • Pricing
    • Enterprise Solution
    • Private Podcast
    • The Podcast App
    • Live Stream
    • Audio Recorder
    • Remote Recording
    • Podbean AI
  •  
    • Create a Podcast
    • Video Podcast
    • Start Podcasting
    • Start Radio Talk Show
    • Education Podcast
    • Church Podcast
    • Nonprofit Podcast
    • Get Sermons Online
    • Free Audiobooks
  • MONETIZATION & MORE

    • Podcast Advertising
    • Dynamic Ads Insertion
    • Apple Podcasts Subscriptions
    • Switch to Podbean
    • YouTube to Podcast
    • Blog to Podcast
    • Submit Your Podcast
    • Podbean Plugins
    • Developers
  • KNOWLEDGE BASE

    • How to Start a Podcast
    • How to Start a Live Podcast
    • How to Monetize a Podcast
    • How to Promote Your Podcast
    • Mobile Podcast Recording Guide
    • How to Use Group Recording
    • Podcast Advertising 101
  • Support

    • Support Center
    • What’s New
    • Free Webinars
    • Podcast Events
    • Podbean Academy
    • Podbean Amplified Podcast
    • Badges
    • Resources
  • Podbean

    • About Us
    • Podbean Blog
    • Careers
    • Press and Media
    • Green Initiative
    • Affiliate Program
    • Contact Us
  • Privacy Policy
  • Cookie Policy
  • Terms of Use
  • Consent Preferences
  • Copyright © 2015-2025 Podbean.com