In this episode, we explore how the New York Times engineering team used AI agents to scale unit test coverage across their News site. They accomplished this by building a custom coverage measurement tool, designing a two-loop human–AI workflow, and investing heavily in prompt engineering, including strict guardrails to prevent the agent from cheating or drifting. The key takeaway is that AI works best when it is tightly constrained, carefully monitored, and used to amplify human judgment.
For more details, see their tech blog post: https://open.nytimes.com/how-the-new-york-times-is-scaling-unit-test-coverage-using-ai-tools-fa796bf9b8d2