
sre-monitoring-as-code

SRE Monitoring-as-Code (MaC) is a Jsonnet Mixin implementation of SLIs, SLOs and Error Budgets using the open-source monitoring and alerting ecosystem of Prometheus and Grafana. Our documentation is available to view online.

About the framework

Monitoring Mixins bundle SLI configuration, alerting, Grafana dashboards and runbooks into a single package. Engineers commit a monitoring definition file, which triggers the packaging of Prometheus rules and Grafana dashboards and injects them into the monitoring tools. This eases the burden on engineers of writing alerting rules, manually building Grafana dashboards and writing runbooks.
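To illustrate the idea of turning a committed definition into Prometheus rules, here is a minimal Python sketch. It is not the framework's implementation (which is written in Jsonnet) and the definition keys (`name`, `slo_target`, `error_metric`, `total_metric`, `window`), the metric names and the `sli:<name>:error_ratio` naming are all hypothetical. Only the output shape (a rule group with `record` and `alert` rules) follows the standard Prometheus rules-file layout.

```python
import json


def build_rule_group(definition):
    """Turn one hypothetical SLI definition into a Prometheus rule group."""
    name = definition["name"]
    target = definition["slo_target"]  # e.g. 0.99 means 99% of requests should succeed
    window = definition["window"]
    errors = definition["error_metric"]
    total = definition["total_metric"]

    # The SLI expressed as an error ratio: failed requests / all requests.
    expr = f"sum(rate({errors}[{window}])) / sum(rate({total}[{window}]))"
    record = f"sli:{name}:error_ratio"

    return {
        "name": f"{name}_slo",
        "rules": [
            # Recording rule: precompute the error ratio under a stable name.
            {"record": record, "expr": expr},
            # Alerting rule: fire when the ratio exceeds what the SLO allows.
            {
                "alert": f"{name}_slo_breach",
                "expr": f"{record} > {1 - target:.4f}",
                "for": "5m",
                "labels": {"severity": "warning"},
                "annotations": {"summary": f"{name} error ratio is above its SLO threshold"},
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical definition; metric names are placeholders.
    definition = {
        "name": "checkout",
        "slo_target": 0.99,
        "window": "5m",
        "error_metric": 'http_requests_total{status=~"5.."}',
        "total_metric": "http_requests_total",
    }
    # JSON is valid YAML, so this output can be read as a Prometheus rules file.
    print(json.dumps({"groups": [build_rule_group(definition)]}, indent=2))
```

The point of the sketch is the shape of the workflow: a small declarative definition goes in, and generated rules come out, so engineers do not write the PromQL by hand.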

  • Monitoring Mixins [1] are a lightweight, flexible configuration that doesn’t mandate specific labels or expressions. You can configure and override everything.
  • Mixins use a data templating language called Jsonnet, which is the only templating language with fully supported libraries for Grafana and Prometheus.
  • jsonnet-bundler is used for package management. Once you have a Monitoring Mixin package, you need to install it, keep track of its versions and update it.
  • SRE MaC will be open-sourced and live on the UKHomeOffice GitHub, and can be integrated with any platform that supports pulling containers from GitHub.
  • SLI/SLO/Error Budget configurations match Google SRE [2] industry patterns.
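As a rough illustration of how an SLO target relates to an error budget, the following Python sketch works through the arithmetic. The function name, its parameters and the example figures are assumptions made for illustration and are not taken from the MaC codebase.

```python
def error_budget(slo_target, total_events, bad_events):
    """Compute error budget figures for one SLO period.

    slo_target   -- fraction of events that must be good, e.g. 0.999
    total_events -- number of events observed in the period
    bad_events   -- number of those events that failed the SLI
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")

    # The budget is the number of bad events the SLO tolerates in the period.
    allowed_bad = (1 - slo_target) * total_events
    return {
        "allowed_bad": allowed_bad,
        "consumed": bad_events / allowed_bad,   # fraction of the budget used
        "remaining": allowed_bad - bad_events,  # negative means the SLO is breached
    }


if __name__ == "__main__":
    budget = error_budget(slo_target=0.999, total_events=1_000_000, bad_events=250)
    for key, value in budget.items():
        print(f"{key}: {value:.2f}")
```

With a 99.9% target over one million events, roughly 1,000 failures are tolerated, so 250 failures consume about a quarter of the budget.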

Repository structure

| Directory | Description |
| --- | --- |
| .githooks/ | Contains the client-side pre-commit and pre-push git hooks which form part of our engineering workflow. |
| .github/ | Contains the GitHub Action workflows and associated config. |
| docs/ | Contains the technical documentation for Monitoring-as-Code using Tech Docs Template and Middleman. |
| example-apps/ | Contains example apps to showcase how custom metrics can be shown within the MaC framework. |
| local/ | Contains a docker-compose implementation of Prometheus, Thanos, Grafana and Alertmanager. The purpose of this project is to test Monitoring-as-Code locally with your application. |
| monitoring-as-code/ | Contains the Jsonnet mixin implementation of SLIs/SLO/Error Budgets for Prometheus and Grafana. |
| security/ | Contains the GitLeaks secret scan configuration. |

Installation and usage information is provided in a README within each of the directories.

Resources

  1. Prometheus - Monitoring Mixins
  2. Google SRE - Implementing SLOs
  3. Google SRE - Setting SLOs: a step-by-step guide
  4. Liz Fong-Jones - Adopting SRE and Error Budgets
  5. GDS - Run a Service Level Indicator workshop

Licence

Unless stated otherwise, the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation.

The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.