“AI companies aren’t really using external evaluators” by Zach Stein-Perlman
LessWrong (Curated & Popular)

2024-05-24
New blog: AI Lab Watch. Subscribe on Substack.

Many AI safety folks think that METR is close to the labs, with ongoing relationships that grant it access to models before they are deployed. This is incorrect. METR (then called ARC Evals) did pre-deployment evaluation for GPT-4 and Claude 2 in the first half of 2023, but it seems to have had no special access since then.[1] Other model evaluators also seem to have little access before deployment.

Frontier AI labs' pre-deployment risk assessment should involve external model evals for...