How and why to dodge vanity metrics and measure the right service level indicator.| blog.alexewerlof.com
In this guide, we’ll show technologies and examples of full stack observability for an application running on Kubernetes, OpenTelemetry and AWS.| Logz.io
Intricacies of on-call rotations at Google, including strategies for optimizing pager load, psychological safety, and fostering effective teams.| sre.google
I have, as they say, some personal news to share. On Monday I (along with some very talented teammates, see below if you’re hiring) was laid off from Microsoft as part of a reorganization. Like my Moving to Microsoft post, I wanted to jot down some of the things I got to work on. For those of you wondering, the Planetary Computer project does continue, just without me. Reflections It should go without saying that all of this was a team effort. I’ve been incredibly fortunate to have great ...| tomaugspurger.net
White box testing examines the software’s internal structures, code, and logic to ensure the application is working properly.| QA Touch
Google's SRE team uses time-series data and alerting systems to monitor large-scale services. Collecting, storing, and querying time-series data.| sre.google
How serial and parallel dependencies affect the total SLA| Medium
Gain visibility into your systems with a monitoring system. Monitor metrics, text logs, structured event logging, and event introspection.| sre.google
Don't start your first shift unprepared!| sheepcode.substack.com