My notes from the DevOps Handbook

by Gene Kim, Jez Humble, Patrick Debois, John Willis

Create telemetry to enable seeing and solving problems

To enable disciplined problem-solving behavior, we need to design our systems so that they are continually creating telemetry, broadly defined as an automated communications process by which measurements and other data are collected at remote points and subsequently transmitted to receiving equipment for monitoring.

Our goal is to create telemetry within our applications and environments: in production, in preproduction, and in our deployment pipeline.

Centralized telemetry infrastructure

To see all problems as they occur, we must design and develop our applications and environments so that they generate sufficient telemetry to show how our system is behaving as a whole. When every level of our application stack has monitoring and logging, we enable other important capabilities, such as graphing and visualizing our metrics, anomaly detection, and proactive alerting and escalation.
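
To make "generating telemetry" concrete, here is a minimal sketch of an application emitting its own events. It is not from the book: the `Telemetry` class, the `checkout` service name, and the event names are hypothetical, and it only uses the Python standard library. Each event is counted (a metric) and written as one JSON line (a log) that a central collector could pick up.

```python
import json
import logging
import time
from collections import Counter


class Telemetry:
    """Hypothetical in-process helper: counts events and logs them as JSON lines."""

    def __init__(self, service, logger=None):
        self.service = service
        self.counters = Counter()  # event name -> number of occurrences
        self.logger = logger or logging.getLogger(service)

    def event(self, name, level=logging.INFO, **fields):
        """Record one event: bump its counter and emit a structured log line."""
        self.counters[name] += 1
        record = {
            "ts": time.time(),
            "service": self.service,
            "event": name,
            **fields,
        }
        # One JSON object per line is easy for a log shipper to parse.
        self.logger.log(level, json.dumps(record, sort_keys=True))
        return record


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
    telemetry = Telemetry("checkout")  # assumed service name
    telemetry.event("user.login", user_id=42)
    telemetry.event("payment.failed", level=logging.ERROR, order_id="A-17")
    print(dict(telemetry.counters))
```

In a real setup the counters would be shipped to a metrics system and the log lines to a log aggregator. Keeping both behind a single `event` call is just one way to make instrumentation cheap enough that developers add it everywhere.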

Components of such an architecture