AF - Can Generalized Adversarial Testing Enable More Rigorous LLM Safety Evals? by Stephen Casper
The Nonlinear Library: Alignment Forum

2024-07-30
Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Can Generalized Adversarial Testing Enable More Rigorous LLM Safety Evals?, published by Stephen Casper on July 30, 2024 on The AI Alignment Forum.

Thanks to Zora Che, Michael Chen, Andi Peng, Lev McKinney, Bilal Chughtai, Shashwat Goel, Domenic Rosati, and Rohit Gandikota.

TL;DR: In contrast to evaluating AI systems under...
