My notes from the DevOps Handbook

by Gene Kim, Jez Humble, Patrick Debois, John Willis

Have developers initially self-manage their service

In our group, most system administrators lasted only six months. Things were always breaking in production, the hours were insane, and deployments were painful beyond belief.

Have dev groups self-manage their services in production before those services become eligible for a centralized ops group to manage.

Define launch requirements that must be met before a service can interact with real customers and be exposed to real production traffic. Ops engineers should act as consultants.

Launch guidance requirements will likely include the following:

Defect counts and severity: does the application actually perform as designed?
Type/frequency of pager alerts: is the application generating an unsupportable number of alerts in production?
Monitoring coverage: is the coverage of monitoring sufficient to restore service when things go wrong?
System architecture: is the service loosely coupled enough to support a high rate of changes and deployments in production?
Deployment process: is there a predictable, deterministic, and sufficiently automated process to deploy code into production?
Production hygiene: are there enough good production habits in place that support could be managed by anyone else?
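
The checklist above can be recorded as data and evaluated mechanically. The following is a minimal sketch, not something from the book: the names `LaunchCheck` and `review_launch` are hypothetical, and it assumes the simplest possible rule, that a service is ready only when every item passes.

```python
from dataclasses import dataclass


@dataclass
class LaunchCheck:
    name: str       # checklist item from the launch guidance
    passed: bool    # outcome recorded during the review
    note: str = ""  # free-form reviewer comment


def review_launch(checks):
    """Return (ready, failed_names).

    Assumption: the service is ready only if every check passed.
    """
    failed = [c.name for c in checks if not c.passed]
    return (not failed, failed)


# Example review with made-up outcomes.
checks = [
    LaunchCheck("defect counts and severity", True),
    LaunchCheck("pager alert type/frequency", False, "too many alerts"),
    LaunchCheck("monitoring coverage", True),
    LaunchCheck("system architecture", True),
    LaunchCheck("deployment process", True),
    LaunchCheck("production hygiene", False, "no runbook"),
]

ready, failed = review_launch(checks)
print(ready, failed)
```

The list of failed items is what the assigned ops engineer would work through with the feature team.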

If any deficiencies are found during the review, the assigned ops engineer should help the feature team resolve the issues or even help re-engineer the service if necessary, so that it can be easily deployed and managed in production.

Is the service subject to regulatory compliance requirements?

Does the service generate a significant amount of revenue?
Does it have high user traffic or high outage costs?
Does it store payment information or other personal details?
Does it have any other regulatory compliance requirements, such as PCI-DSS?

For services already in production, we need a different mechanism to ensure that ops is never stuck with an unsupportable service.

Service handback mechanism: when a production service becomes too fragile, ops can hand it back to the developers to manage. At this stage, ops act as consultants who help make the service production-ready.
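
One way to picture the handback decision is as a rule over a few fragility signals. This is only a sketch under stated assumptions: the signals (`incidents_last_30_days`, `pages_per_week`), the thresholds, and the function `owner_after_review` are all hypothetical, since the notes do not say how "too fragile" is measured.

```python
from dataclasses import dataclass


@dataclass
class ServiceHealth:
    name: str
    incidents_last_30_days: int  # hypothetical fragility signal
    pages_per_week: float        # hypothetical fragility signal


# Hypothetical thresholds; a real team would agree on its own.
MAX_INCIDENTS = 5
MAX_PAGES_PER_WEEK = 10.0


def owner_after_review(health):
    """Return "dev" when the service looks too fragile for ops, else "ops"."""
    too_fragile = (
        health.incidents_last_30_days > MAX_INCIDENTS
        or health.pages_per_week > MAX_PAGES_PER_WEEK
    )
    return "dev" if too_fragile else "ops"
```

For example, `owner_after_review(ServiceHealth("billing", 8, 3.0))` returns `"dev"` because the incident count exceeds the assumed threshold.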

Never put ops in a situation where they have to manage a fragile service while ever-increasing technical debt buries them, turning a local problem into a global one. Ensure that ops have enough capacity to work on improvements and preventive projects.