Site Reliability Engineering Book Trio
3 posts
What works for Google probably doesn't apply to the rest of us. Three SRE books — the SRE Book, the Workbook, and Seeking SRE — worth reading with that caveat firmly in mind.
An alert nobody acts on is just noise with a timestamp. Five properties of a worthy alert — actionable, owned, contextualized, tunable, state-change aware — and why each one matters.
The most interesting Prometheus work in 2018 wasn't inside Prometheus. PromCon highlights: Grafana's query explorer, Thanos, OpenMetrics, model builder, and automated benchmarking from the ecosystem around it.