A common refrain you’ll hear these days is that servers should be scaled out, easy to replace, and interchangeable—cattle, not pets. But for the ops folks who run those servers, the opposite is true. You can’t just throw any of them into an incident where they may not know the stack or system and expect everything to work out. Every operator has a set of skills built up through research and experience, and teams should value them as such. They’re people, not pets, and certainly not cattle—you can’t just get a new one when you burn out the ones you have.
On this episode of the podcast—sponsored by Chronosphere—we talk with Paige Cruz, Senior Developer Advocate at Chronosphere, about how teams can reduce the cognitive load on ops, the best ways to prepare for inevitable failures, and where the worst place to page Paige is.
Episode notes:
Chronosphere provides an observability platform for ops people, so naturally, the company has an interest in the happiness of those people.
If you’re interested in the history of the pets vs. cattle concept, this covers it pretty well.
Previously, we spoke with the CEO of Chronosphere about making incidents easier to manage.
We’ve covered this topic on the blog before, and two articles came up during our conversation with Paige.
You can connect with Paige on Twitter, where she has a pretty apropos handle.
Congrats to Stellar Question badge winner Bruno Rocha for asking How can I read large text files line by line, without loading them into memory?, which at least 100 users liked enough to bookmark.