Monitoring And Alerting Essentials

Purpose

This guide explains how to improve system reliability through logging, monitoring, alerting, and a process of continuous improvement.

If we want to consider ourselves 'engineers', our systems need to work reliably. When they do not, we need to know when they are failing and why.

Content

There are several parts to this documentation.

Process of Making Unreliable Systems Reliable - A process to make incremental improvements to existing systems
Slides for 'Shedding Light on Black Box Services' talk - Broad overview of why we like reliable systems and the tools we can use to get them
Logging Fundamentals - Covers the basics of logging
Logging Architecture - Things to think about when choosing a logging framework
Site Reliability Engineering Book - Summary of the more valuable parts of the book
When to Conduct a Postmortem - A critical part of the continuous process

Scope

This documentation is primarily limited to application logging. OS, web service, and other types of logging will not be covered.

References

Most of the ideas in this repo come from the sources where I learned them:

Google Site Reliability Engineering book
Prometheus documentation
- There is a great ChangeLog podcast where the creator of Prometheus discusses why the tool was built

gregberns/MonitoringAndAlertingEssentials
