[Crosspost] Adam Gleave on Vulnerabilities in GPT-4 APIs (+ extra Nathan Labenz interview)
This is a special crosspost episode where Adam Gleave is interviewed by Nathan Labenz from the Cognitive Revolution. At the end I also have a discussion with Nathan Labenz about his takes on AI.
Adam Gleave is the founder of FAR.AI. He and Nathan discuss finding vulnerabilities in GPT-4's fine-tuning and Assistants APIs, FAR.AI's work exposing exploitable flaws in "superhuman" Go AIs through innovative adversarial strategies, accidental jailbreaking by naive developers during fine-tuning, and more.
OUTLINE
(00:00) Intro
(02:57) NATHAN INTERVIEWS ADAM GLEAVE: FAR.AI's Mission
(05:33) Unveiling the Vulnerabilities in GPT-4's Fine-Tuning and Assistants APIs
(11:48) Divergence Between The Growth Of System Capability And The Improvement Of Control
(13:15) Finding Substantial Vulnerabilities
(14:55) Exploiting GPT-4 APIs: Accidentally jailbreaking a model
(18:51) On Fine-Tuning Attacks and Targeted Misinformation
(24:32) Malicious Code Generation
(27:12) Discovering Private Emails
(29:46) Harmful Assistants
(33:56) Hijacking the Assistant Based on the Knowledge Base
(36:41) The Ethical Dilemma of AI Vulnerability Disclosure
(46:34) Exploring AI's Ethical Boundaries and Industry Standards
(47:47) The Dangers of AI in Unregulated Applications
(49:30) AI Safety Across Different Domains
(51:09) Strategies for Enhancing AI Safety and Responsibility
(52:58) Taxonomy of Affordances and Minimal Best Practices for Application Developers
(57:21) Open Source in AI Safety and Ethics
(1:02:20) Vulnerabilities of Superhuman Go-Playing AIs
(1:23:28) Variation on AlphaZero-Style Self-Play
(1:31:37) The Future of AI: Scaling Laws and Adversarial Robustness
(1:37:21) MICHAEL TRAZZI INTERVIEWS NATHAN LABENZ
(1:37:33) Nathan’s background
(1:39:44) Where does Nathan fall in the Eliezer to Kurzweil spectrum
(1:47:52) AI in biology could spiral out of control
(1:56:20) Bioweapons
(2:01:10) Adoption Accelerationist, Hyperscaling Pauser
(2:06:26) Current Harms vs. Future Harms, risk tolerance
(2:11:58) Jailbreaks, Nathan’s experiments with Claude
The Cognitive Revolution: https://www.cognitiverevolution.ai/
Exploiting Novel GPT-4 APIs: https://far.ai/publication/pelrine2023novelapis/
Adversarial Policies Beat Superhuman Go AIs: https://far.ai/publication/wang2022adversarial/
2024-01-22
2024-01-09
2023-07-26