The Inside View

Technology

[Crosspost] Adam Gleave on Vulnerabilities in GPT-4 APIs (+ extra Nathan Labenz interview)

2024-05-17

This is a special crosspost episode where Adam Gleave is interviewed by Nathan Labenz from the Cognitive Revolution. At the end I also have a discussion with Nathan Labenz about his takes on AI.


Adam Gleave is the founder of FAR.AI. He and Nathan discuss finding vulnerabilities in GPT-4's fine-tuning and Assistants APIs, FAR.AI's work exposing exploitable flaws in "superhuman" Go AIs through innovative adversarial strategies, accidental jailbreaking by naive developers during fine-tuning, and more.


OUTLINE


(00:00) Intro

(02:57) NATHAN INTERVIEWS ADAM GLEAVE: FAR.AI's Mission

(05:33) Unveiling the Vulnerabilities in GPT-4's Fine-Tuning and Assistants APIs

(11:48) Divergence Between The Growth Of System Capability And The Improvement Of Control

(13:15) Finding Substantial Vulnerabilities

(14:55) Exploiting GPT-4 APIs: Accidentally jailbreaking a model

(18:51) On Fine-Tuned Attacks and Targeted Misinformation

(24:32) Malicious Code Generation

(27:12) Discovering Private Emails

(29:46) Harmful Assistants

(33:56) Hijacking the Assistant Based on the Knowledge Base

(36:41) The Ethical Dilemma of AI Vulnerability Disclosure

(46:34) Exploring AI's Ethical Boundaries and Industry Standards

(47:47) The Dangers of AI in Unregulated Applications

(49:30) AI Safety Across Different Domains

(51:09) Strategies for Enhancing AI Safety and Responsibility

(52:58) Taxonomy of Affordances and Minimal Best Practices for Application Developers

(57:21) Open Source in AI Safety and Ethics

(1:02:20) Vulnerabilities of Superhuman Go-Playing AIs

(1:23:28) Variation on AlphaZero-Style Self-Play

(1:31:37) The Future of AI: Scaling Laws and Adversarial Robustness

(1:37:21) MICHAEL TRAZZI INTERVIEWS NATHAN LABENZ

(1:37:33) Nathan’s background

(1:39:44) Where does Nathan fall in the Eliezer to Kurzweil spectrum

(1:47:52) AI in biology could spiral out of control

(1:56:20) Bioweapons

(2:01:10) Adoption Accelerationist, Hyperscaling Pauser

(2:06:26) Current Harms vs. Future Harms, risk tolerance

(2:11:58) Jailbreaks, Nathan’s experiments with Claude


The Cognitive Revolution: https://www.cognitiverevolution.ai/


Exploiting Novel GPT-4 APIs: https://far.ai/publication/pelrine2023novelapis/


Adversarial Policies Beat Superhuman Go AIs: https://far.ai/publication/wang2022adversarial/

