Observability

What's Observability?

In distributed systems, observability is the ability to collect data about programs' execution, modules' internal states, and the communication among components.
To improve observability, software engineers use a wide range of logging and tracing techniques to gather telemetry information, and tools to analyze and use it.
Observability is foundational to site reliability engineering, as it is the first step in triaging a service outage.^[1]

Monitoring

What's monitoring? How is it related to Observability?

Google defines it this way: "Monitoring is one of the primary means by which service owners keep track of a system’s health and availability."

What types of monitoring outputs are you familiar with and/or have you used in the past?

Alerts
Tickets
Logging

Data

Can you name the kinds of things that are often monitored in the IT industry?

Hardware (CPU, RAM, ...)

Infrastructure (Disk capacity, Network latency, ...)

App (Status code, Errors in logs, ...)

Explain "Time Series" data

Time series data is a sequence of measurements of a given parameter, ordered by the time at which each measurement was taken.

An example would be CPU utilization every hour:

08:00 17
09:00 22
10:00 91
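
The hourly CPU readings above can be sketched in Python as a list of (timestamp, value) pairs kept in time order. This is only an illustration: the date, the `samples` list, and the `to_time_series` helper are hypothetical and not part of any monitoring tool.

```python
from datetime import datetime

# Hypothetical samples: (timestamp, CPU utilization) pairs.
# The date is arbitrary; only the hours and values come from the example above.
samples = [
    (datetime(2024, 1, 1, 10, 0), 91),
    (datetime(2024, 1, 1, 8, 0), 17),
    (datetime(2024, 1, 1, 9, 0), 22),
]


def to_time_series(points):
    """Return the points ordered by timestamp, which is what makes them a time series."""
    return sorted(points, key=lambda point: point[0])


series = to_time_series(samples)

# Print one "HH:MM value" line per sample, oldest first.
for timestamp, value in series:
    print(timestamp.strftime("%H:%M"), value)
```

Sorting by timestamp matters because samples may arrive out of order, while most time series operations (rates, windows, trends) assume chronological order.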

Explain data aggregation

In monitoring, aggregating data means combining a collection of values into a smaller summary. Common aggregations include the average of the values, their sum, and a count of how many times each value appears in the collection. Which aggregation makes sense depends mainly on the type of the collection (time series data being one such type).
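
As a minimal sketch of the aggregations mentioned above (average, sum, and count), the following uses only the Python standard library. The `cpu_values` and `status_codes` lists and the `aggregate` helper are made up for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical inputs: CPU readings and HTTP status codes seen in a log.
cpu_values = [17, 22, 91]
status_codes = [200, 200, 500, 404, 200]


def aggregate(values):
    """Combine a collection of numeric values into a few summary numbers."""
    return {
        "avg": mean(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }


cpu_summary = aggregate(cpu_values)

# Counting occurrences is another form of aggregation, suited to
# categorical values such as status codes.
status_counts = Counter(status_codes)

print(cpu_summary)
print(status_counts)
```

The average hides the spike to 91 that `max` reveals, which is one reason monitoring systems usually offer several aggregations rather than a single one.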

Application Performance Management

What is Application Performance Management?

IT metrics translated into business insights

Practices for monitoring application insights so that we can improve performance, reduce issues, and improve the overall user experience

Name three aspects of a project you can monitor with APM (e.g. backend)

Frontend

Backend

Infra

...

What can be collected/monitored to perform APM?

Metrics

Logs

Events

Traces
