
Meta Tech Podcast

84: Trust But Canary: Configuration Safety at Scale

2026-04-02



Have you ever wondered how Meta makes config rollouts safe at scale? In this episode, Pascal sits down with Ishwari and Joe to discuss Meta's approach to propagating changes across services in seconds, and why that speed increases the need for strong safeguards. Tune in to learn about canarying and progressive rollouts, the health checks and monitoring signals used to catch regressions early, and how incident reviews focus on improving systems rather than blaming people. We also hear how data and early AI/ML are cutting alert noise and speeding up bisecting when something goes wrong.

Got feedback? Send it to us on Threads (https://threads.net/@metatechpod), Instagram (https://instagram.com/metatechpod) and don't forget to follow our host Pascal (https://mastodon.social/@passy, https://threads.net/@passy_). Fancy working with us? Check out https://www.metacareers.com/.

Links

FFmpeg at Meta: Media Processing at Scale - https://engineering.fb.com/2026/03/02/video-engineering/ffmpeg-at-meta-media-processing-at-scale/
Reliably Changing Configuration @ Scale - https://atscaleconference.com/reliably-changing-configuration-scale/

Timestamps

Intro 0:06
Introduction and Overview of Configuration Changes 2:31
Understanding Configurations in Distributed Systems 4:02
Meta's Configuration Management Systems 6:43
Safeguards and Incident Prevention 9:22
Deployment Mechanisms: Canary and Progressive Rollouts 12:06
Challenges in Configuration Consumption 14:39
Health Checks and Incident Response 17:13
Mitigation Strategies for Configuration Issues 19:18
Balancing Developer Velocity and Configuration Safety 21:09
Data-Driven Improvements in Incident Management 22:12
Leveraging AI for Change Detection 26:05
Challenges in Deployment and Testing 28:21
Reinventing Change Safety Strategies 30:24
War Stories: Learning from Past Incidents 32:59
Outro 36:10
