Goodmetrics

Light, fast, unlimited cardinality, service-focused time series metrics.

Overview

About

Goodmetrics is for monitoring web service workflows. It records contextual observations from workflows rather than the contextless numbers of other collection systems. (Think "single-level trace" and you're close.)

Leveraging PostgreSQL and the TimescaleDB extension, Goodmetrics creates a simple, familiar wide schema for your application workflow: a column per dimension and a column per measurement.

Getting started

TimescaleDB

https://docs.timescale.com/install/latest/self-hosted/installation-debian/
Timescale docker
Timescale Cloud

Set up timescaledb if you are self-hosting

create database metrics;
\c metrics
create extension timescaledb;
create extension timescaledb_toolkit;
create role metrics_write;
grant create on database metrics to metrics_write;
create user metrics in group metrics_write;
\password metrics

create role metrics_read;
grant pg_read_all_data to metrics_read;
create user grafana in group metrics_read;
\password grafana

Run the server

You can use the latest release's goodmetricsd, or you can run it with docker:

# Or instead of -p you can --network host
docker run --name goodmetrics -p 9573:9573 --detach kvc0/goodmetrics -- \
  --connection-string 'host=postgres_server_ip_address port=2345 user=metrics password=metrics'

Send metrics

Use an SDK or just invoke the latest release's goodmetrics cli utility. Here's an example sending 2 observations of the same metric with a few dimensions and a few different measurement types:

goodmetrics send '
{
  "metric":"test_api",
  "unix_nanos":'`date +%s`'000000000,
  "dimensions":{
    "a_string_dimension":{"value":{"String":"asdf"}},
    "an_integer_dimension":{"value":{"Number":16}},
    "a_boolean_dimension":{"value":{"Boolean":true}}
  },
  "measurements":{
    "an_int_measurement":{"value":{"I32":42}},
    "a_long_measurement":{"value":{"I64":42}},
    "a_float_measurement":{"value":{"F32":42.42}},
    "a_double_measurement":{"value":{"F64":42.42}},
    "a_statistic_set":{"value":{"StatisticSet": {"minimum":1, "maximum":2, "samplesum":8, "samplecount":6}}},
    "a_histogram":{"value":{"Histogram":{"buckets":{"1":2, "3":4, "5":6}}}},
    "a_tdigest":{"value":{"Tdigest":{"sum":42.3,"count":42,"min":1,"max":1.3,"centroids":[{"mean":1,"weight":41},{"mean":1.3,"weight":1}]}}}
  }
} ' '{
  "metric":"test_api",
  "unix_nanos":'`date +%s`'000000000,
  "dimensions":{
    "a_string_dimension":{"value":{"String":"asdf"}},
    "an_integer_dimension":{"value":{"Number":16}},
    "a_boolean_dimension":{"value":{"Boolean":true}}
  },
  "measurements":{
    "an_int_measurement":{"value":{"I32":42}},
    "a_long_measurement":{"value":{"I64":42}},
    "a_float_measurement":{"value":{"F32":42.42}},
    "a_double_measurement":{"value":{"F64":42.42}},
    "a_statistic_set":{"value":{"StatisticSet": {"minimum":1, "maximum":2, "samplesum":8, "samplecount":6}}},
    "a_histogram":{"value":{"Histogram":{"buckets":{"1":2, "3":4, "5":6}}}},
    "a_tdigest":{"value":{"Tdigest":{"sum":42.3,"count":42,"min":1,"max":1.3,"centroids":[{"mean":1,"weight":41},{"mean":1.3,"weight":1}]}}}
  }
}'
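
The JSON above is repetitive to write by hand. As a minimal sketch (not part of Goodmetrics itself), the following Python builds the same Datum shape from plain values. The helper names `dimension`, `measurement` and `make_datum` are hypothetical; the type tags (`String`, `Number`, `Boolean`, `I64`, `F64`) are copied from the example above. Only the scalar measurement types are covered.

```python
import json
import time


def dimension(value):
    """Wrap a plain Python value in the dimension envelope used in the example above."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"value": {"Boolean": value}}
    if isinstance(value, int):
        return {"value": {"Number": value}}
    return {"value": {"String": str(value)}}


def measurement(value):
    """Wrap a number as an I64 (for ints) or F64 (for everything else) measurement."""
    if isinstance(value, int) and not isinstance(value, bool):
        return {"value": {"I64": value}}
    return {"value": {"F64": float(value)}}


def make_datum(metric, dimensions, measurements, unix_nanos=None):
    """Build one Datum as a JSON string, defaulting the timestamp to the current time."""
    return json.dumps({
        "metric": metric,
        "unix_nanos": time.time_ns() if unix_nanos is None else unix_nanos,
        "dimensions": {name: dimension(v) for name, v in dimensions.items()},
        "measurements": {name: measurement(v) for name, v in measurements.items()},
    })


if __name__ == "__main__":
    print(make_datum("test_api", {"a_string_dimension": "asdf"}, {"a_long_measurement": 42}))
```

Each string returned by `make_datum` can be passed as one argument to `goodmetrics send`.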

Configurations

Upstreams

Goodmetrics SDKs. If you're a service developer this is where to look.
goodmetrics cli. If you're scripting some bash this might be your ticket.
Prometheus. If you're stuck with this then okay. You can use goodmetrics to adapt it.

Downstreams

TimescaleDB. The good way; with simple, rich and easy to graph wide tables.
OpenTelemetry OTLP. Strips your measurements' relationships to express them as OTel types. This is for compatibility. Most OTLP metrics stores will struggle with Goodmetrics cardinality.

On healing

Goodmetrics self-heals its schema and treats data from now as the most important.

When you have bad data, run drop table problematic_table cascade and you're good. If a column's reported data type changed while its name stayed the same (which is illegal), just run alter table problematic_table drop column problematic_column. Goodmetrics will recreate that column with the currently reported type.

When there's a problem with data or connections, data gets dropped. Goodmetrics doesn't queue for very long, favoring your service's time to recovery and the now over the nice-to-have of data from time gone by.
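
To illustrate the trade-off described above, here is a minimal sketch of a bounded buffer that discards its oldest entries when full. It shows the policy only and is not Goodmetrics' actual implementation; the class name `RecentOnlyBuffer` and the capacity used are hypothetical.

```python
from collections import deque


class RecentOnlyBuffer:
    """A bounded buffer that prefers recent data: when full, the oldest item is discarded."""

    def __init__(self, capacity):
        # A deque with maxlen evicts from the left when appending to a full deque.
        self._items = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, item):
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the oldest item is about to be evicted
        self._items.append(item)

    def drain(self):
        """Return everything still buffered, oldest first, and clear the buffer."""
        items = list(self._items)
        self._items.clear()
        return items


if __name__ == "__main__":
    buffer = RecentOnlyBuffer(capacity=3)
    for observation in range(5):
        buffer.push(observation)
    print(buffer.drain(), buffer.dropped)
```

A short buffer like this keeps memory bounded during an outage and means the first data delivered after recovery is the most recent.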

Data model

TimescaleDB Direct

Goodmetrics type	Timescale type	about
`time`	timestamptz	The 1 required column, used as the time column for hypertables. It is provided by goodmetrics
int_dimension	int8/bigint	A 64 bit integer
str_dimension	text	A label
bool_dimension	boolean	A flag
i64	int8/bigint	A 64 bit integer
i32	int4/int	A 32 bit integer
f64	float8	A 64 bit floating point number
f32	float4	A 32 bit floating point number
statistic_set	statistic_set	A preaggregated {min,max,sum,count} rollup of some value. Has convenience functions for graphing and rollups.
histogram	histogram	Implemented as jsonb. Has convenience functions for graphing and rollups.
t_digest [beta]	tdigest	Fancy space-constrained and high-speed histogram sketch. Uses timescaledb_toolkit functions for graphing.
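
Because a statistic_set keeps min, max, sum and count, two of them can be combined without access to the raw samples. The following is a minimal sketch of that rollup in Python, using the field names from the JSON examples in this document (`minimum`, `maximum`, `samplesum`, `samplecount`). The class and the `merge` function are hypothetical and say nothing about the signatures of the SQL convenience functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatisticSet:
    minimum: float
    maximum: float
    samplesum: float
    samplecount: int

    def mean(self):
        """Average of the underlying samples, recoverable from sum and count."""
        return self.samplesum / self.samplecount


def merge(a, b):
    """Roll two statistic sets up into one that describes the union of their samples."""
    return StatisticSet(
        minimum=min(a.minimum, b.minimum),
        maximum=max(a.maximum, b.maximum),
        samplesum=a.samplesum + b.samplesum,
        samplecount=a.samplecount + b.samplecount,
    )


if __name__ == "__main__":
    first = StatisticSet(minimum=1, maximum=2, samplesum=8, samplecount=6)
    second = StatisticSet(minimum=0.5, maximum=3, samplesum=4, samplecount=2)
    rolled_up = merge(first, second)
    print(rolled_up, rolled_up.mean())
```

The merge is associative and commutative, so rows can be rolled up in any order, for example across hosts or across time buckets.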

OpenTelemetry (compatibility)

Goodmetrics type	OpenTelemetry Metrics type	about
`time`	time_unix_nano	Goodmetrics clients timestamp at the start of their workflows by default
int_dimension	Int attribute	A 64 bit integer
str_dimension	String attribute	A label
bool_dimension	Bool attribute	A flag
i64	Number data point (i64)	A 64 bit integer
i32	Number data point (i64)	OpenTelemetry only represents 64 bit long integers - no 32 bit ints
f64	Number data point (f64)	A 64 bit floating point number
f32	Number data point (f64)	OpenTelemetry only represents 64 bit double precision - no single precision floats.
statistic_set_measurement	Summary data point	Quantiles 0.0 and 1.0 are populated for min and max. Sum is approximate (over-shoots, computed from buckets). Count is exact.
histogram_measurement	Histogram data point	Delta temporality only. There is no sense in anything else for services.
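
One consequence of the f32 row above: widening a single-precision value to double precision is lossless, but it exposes the rounding that already happened when the value was stored as f32. A value recorded as F32 42.42 therefore does not read back as exactly 42.42 once it is an f64. A minimal sketch using only the Python standard library; `widen_f32` is a hypothetical helper, not a Goodmetrics function.

```python
import struct


def widen_f32(value):
    """Round value to the nearest IEEE 754 single, then return it as a Python float (a double)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


if __name__ == "__main__":
    widened = widen_f32(42.42)
    # The widened value is close to, but not equal to, the double 42.42.
    print(widened, widened == 42.42)
```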

Clients

Rust
Kotlin
JSON CLI: goodmetrics included in this release. send-metrics: receives json strings.
Prometheus: goodmetrics included in this release. poll-prometheus: avoid using prometheus when you have other choices.

JSON CLI

You can shove json into the goodmetrics application. You can pass repeated Datum blobs. For example:

goodmetrics send '{
  "metric":"mm",
  "unix_nanos":1642367609000000000,
  "dimensions":{
    "a_string_dimension":{"value":{"String":"asdf"}},
    "an_integer_dimension":{"value":{"Number":16}},
    "a_boolean_dimension":{"value":{"Boolean":true}}
  },
  "measurements":{
    "an_int_measurement":{"value":{"I32":42}},
    "a_long_measurement":{"value":{"I64":42}},
    "a_float_measurement":{"value":{"F32":42.42}},
    "a_double_measurement":{"value":{"F64":42.42}},
    "a_statistic_set":{"value":{"StatisticSet": {"minimum":1, "maximum":2, "samplesum":8, "samplecount":6}}},
    "a_histogram":{"value":{"Histogram":{"buckets":{"1":2, "3":4, "5":6}}}},
    "a_tdigest":{"value":{"Tdigest":{"sum":42.3,"count":42,"min":1,"max":1.3,"centroids":[{"mean":1,"weight":41},{"mean":1.3,"weight":1}]}}}
  }
}' '{
  "metric":"mm",
  "unix_nanos":1642367704000000000,
  "dimensions":{
    "a_string_dimension":{"value":{"String":"asdf"}},
    "an_integer_dimension":{"value":{"Number":16}},
    "a_boolean_dimension":{"value":{"Boolean":true}}
  },
  "measurements":{
    "an_int_measurement":{"value":{"I32":42}},
    "a_long_measurement":{"value":{"I64":42}},
    "a_float_measurement":{"value":{"F32":42.42}},
    "a_double_measurement":{"value":{"F64":42.42}},
    "a_statistic_set":{"value":{"StatisticSet": {"minimum":1, "maximum":2, "samplesum":8, "samplecount":6}}},
    "a_histogram":{"value":{"Histogram":{"buckets":{"1":2, "3":4, "5":6}}}},
    "a_tdigest":{"value":{"Tdigest":{"sum":42.3,"count":42,"min":1,"max":1.3,"centroids":[{"mean":1,"weight":41},{"mean":1.3,"weight":1}]}}}
  }
}'

Streaming from stdin will happen sooner or later.

Prometheus

If you have some legacy Prometheus component, you can poll it and send data to Timescale. For example, you might emit metrics from a node_exporter & poll it via goodmetrics poll-prometheus node. You'd then have tables in Timescale per metric with 1 row per host per metric & dimension position per poll interval, and a column per dimension/tag.
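
As a sketch of how such a command can be put together from a script, the following Python assembles an argument list for `goodmetrics poll-prometheus`. The flag names, defaults and the `--bonus-dimensions` JSON shape are taken from the `--help` output shown in this section; the helper name `poll_command` and the example values (`hostname`, `web-1`) are hypothetical. It only builds the command; pass the list to `subprocess.run` to execute it.

```python
import json


def poll_command(prefix, endpoint="http://127.0.0.1:9100/metrics",
                 interval_seconds=10, bonus_dimensions=None):
    """Build the argv list for polling one Prometheus endpoint."""
    # Bonus dimensions use the same {"value": {"String": ...}} envelope as other dimensions.
    bonus = {
        name: {"value": {"String": value}}
        for name, value in (bonus_dimensions or {}).items()
    }
    return [
        "goodmetrics", "poll-prometheus",
        "--bonus-dimensions", json.dumps(bonus),
        "--interval-seconds", str(interval_seconds),
        prefix, endpoint,
    ]


if __name__ == "__main__":
    print(poll_command("node", bonus_dimensions={"hostname": "web-1"}))
```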

goodmetrics poll-prometheus --help
Poll prometheus metrics

USAGE:
    goodmetrics poll-prometheus [OPTIONS] <prefix> [poll-endpoint]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --bonus-dimensions <bonus-dimensions>     [default: {}]
            ex: '{"a_dimension_name": {"value": {"String": "a string dimension value"}} }'
        --interval-seconds <interval-seconds>     [default: 10]

ARGS:
    <prefix>
    <poll-endpoint>     [default: http://127.0.0.1:9100/metrics]

Prometheus -> Goodmetrics type mapping

Prometheus type	Goodmetrics type	about
counter	f64	All counters are treated as f64
gauge	f64	Gauges are just f64
untyped	f64	We just treat untyped like gauge
histogram	histogram	These are translated to sparse histograms from the bonkers prometheus histograms
summary	f64	Treated like gauges. These are awful and you should never use them if you can possibly use histograms instead
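
To make the histogram row above concrete: Prometheus histogram buckets are cumulative, so each `le` bucket counts every sample at or below its bound. A sparse histogram instead stores only the count that landed in each bucket and omits empty buckets. The sketch below shows one way to do that conversion; it is an illustration under that assumption, not necessarily how Goodmetrics implements it, and the function name is hypothetical.

```python
import math


def cumulative_to_sparse(cumulative_buckets):
    """Convert cumulative bucket counts into per-bucket counts.

    cumulative_buckets maps each upper bound (the `le` label) to the number of
    samples at or below that bound. Buckets that received no samples are
    omitted, which is what makes the result sparse.
    """
    sparse = {}
    previous_count = 0
    for upper_bound, count in sorted(cumulative_buckets.items()):
        in_this_bucket = count - previous_count
        if in_this_bucket > 0:
            sparse[upper_bound] = in_this_bucket
        previous_count = count
    return sparse


if __name__ == "__main__":
    scraped = {1.0: 2, 2.0: 2, 3.0: 6, 5.0: 12, math.inf: 12}
    print(cumulative_to_sparse(scraped))
```

With this input the result has the same shape as the `{"1":2, "3":4, "5":6}` buckets used in the JSON examples: the empty `2.0` and `+Inf` buckets disappear.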

Example grafana query:

with top_10_by_net_out_bytes as (
  select hostname from node_net_out_bytes where time > now() - interval '3 hours'
  order by value desc
  limit 10
)
select time, hostname, value as cpu_utilization
from node_cpu_utilization
where
  time > now() - interval '3 hours'
  and hostname in (select hostname from top_10_by_net_out_bytes)
order by time

Pictures

Development

Both rustfmt and clippy are checked on PR. This repo currently treats all clippy lint violations as errors.

Add pre-commit hook:

Runs linters on commit to help you check in code that passes PR checks.

ln -s ../../git_hooks/pre-commit .git/hooks/pre-commit
