This guide documents some common rituals we encourage in our work.
These are not ironclad processes we should always follow. More like a helpful jumping-off point if you're on a project where they would be useful.
“A postmortem is a written record of an incident, its impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions to prevent the incident from recurring.” — Google's Site Reliability Engineering
This ritual is held when a serious incident happens, such as a site outage, and the team needs to understand what happened and how to avoid it in the future.