LessWrong (Curated & Popular)

“AI companies aren’t really using external evaluators” by Zach Stein-Perlman

2024-05-24

New blog: AI Lab Watch. Subscribe on Substack.

Many AI safety folks think that METR is close to the labs, with ongoing relationships that grant it access to models before they are deployed. This is incorrect. METR (then called ARC Evals) did pre-deployment evaluation for GPT-4 and Claude 2 in the first half of 2023, but it seems to have had no special access since then.[1] Other model evaluators also seem to have little access before deployment.

Frontier AI labs' pre-deployment risk assessment should involve external model evals for dangerous capabilities.[2] External evals can improve a lab's risk assessment and—if the evaluator can publish its results—provide public accountability.

The evaluator should get deeper access than users will get.

To evaluate threats from a particular deployment protocol, the evaluator should get somewhat deeper access than users will — then the evaluator's failure to elicit dangerous capabilities is stronger evidence [...]

The original text contained 5 footnotes which were omitted from this narration.

---

First published: May 24th, 2024

Source: https://www.lesswrong.com/posts/WjtnvndbsHxCnFNyc/ai-companies-aren-t-really-using-external-evaluators

---

Narrated by TYPE III AUDIO.
