Skip to content
This repository has been archived by the owner on Dec 29, 2020. It is now read-only.

Prometheus remote write adapter for PostgreSQL

License

Notifications You must be signed in to change notification settings

ssube/prometheus-sql-adapter

Repository files navigation

NOTICE: THIS PROJECT HAS BEEN SUPERSEDED BY PROMSCALE

This project has been superseded by Promscale. Like this project, Promscale allows easy storage of Prometheus metrics in TimescaleDB + Postgres, but also offers: automatic partitioning, native compression (typically 95% storage savings), native data retention policies, full SQL and PromQL, and more.

You can find the new project at https://github.com/timescale/promscale and more details can be found in the design document.

This project will continue only in maintenance mode.

Prometheus SQL Adapter

Adapter to connect Prometheus' remote write endpoint to a PostgreSQL server, preferably running TimescaleDB. Caches labels for each timeseries to reduce writes, linking them to samples by metric fingerprint.

This adapter was inspired by the Timescale PostgreSQL adapter and maintains a compatible schema, so queries may be used with either, but this adapter does not require the pg_prometheus extension, making it compatible with Aurora PostgreSQL, Azure Database for PostgreSQL, and other managed PostgreSQL services.

While it is possible to use this adapter and most of the schema without TimescaleDB, it will become difficult to prune older data, compression will not be available, and queries will be slower. If you can use TimescaleDB, please do.

Features

  • batteries included
    • example queries for alerting, reports, and schema metadata
    • Grafana dashboards for Kubernetes workloads, hardware metrics, and schema metadata
    • Jupyter notebooks for long-term reporting
    • PostgreSQL server image with schema and grant setup scripts
    • Prometheus rules for derived metrics
  • compatible schema
    • query compatible with Timescale's official pg_prometheus schema
    • does not require pg_prometheus extension
    • does not require superuser or extension privileges
  • efficient storage
    • hashed & cached label IDs
    • samples in compressed hypertable
    • uses Go SQL and bulk copy for samples

Getting Started

  • create the schema:
    • deploy kubernetes/server.yml
    • or docker run --rm -p 5432:5432 ssube/prometheus-sql-adapter:master-postgres-11 -c 'shared_preload_libraries=timescaledb'
    • or run ./scripts/schema-create [license-level] [retain-live] [retain-total] against an existing database
  • configure adapters:
    • create a role for each set of adapters to write
    • run ./scripts/schema-grant.sh [role-name] adapter
    • deploy kubernetes/adapter.yml
  • configure Grafana:
    • create a role for each Grafana instance to read
    • run ./scripts/schema-grant.sh [role-name] grafana
    • add a Postgres data source
    • import dashboards from grafana/
  • configure humans:
    • create a role for each developer to read
    • run ./scripts/schema-grant.sh [role-name] human
    • show off your sweet new graphs

The schema scripts are idempotent and safe to run repeatedly, including schema-create.sh.

Non-breaking upgrades can be performed by running the schema scripts again, in the same order.

Contents

Status

Pipeline status Test coverage MIT license

Open bug count Open issue count Closed issue count

Renovate badge Dependency status Dev dependency status Known vulnerabilities

Maintainability Technical debt ratio Quality issues

Releases

github release link github release version github commits since release

docker image link docker image size

Schema

This schema is compatible with the Timescale pg_prometheus adapter but does not require the pg_prometheus extension or SUPERUSER permissions.

Captured labels and samples are split into two tables, with labels stored uniquely and identified by their FNV-1a hash, to which samples are tied.

Labels are stored once for each timeseries using the metric's hashed fingerprint, or label ID (lid). This is provided by the Prometheus SDK and uses the 64-bit FNV-1a hash, which is then stored as a UUID column. The remote write adapters each maintain an LRU cache of label IDs, and can be configured not to rewrite labels.

Using the metric's fingerprint provides a short, deterministic identifier for each label set, or timeseries. The adapters do not need to coordinate and can safely write in parallel, using an ON CONFLICT clause to skip or update existing label sets. While a numeric counter might be shorter than the current hash-as-UUID, it would require coordination between the adapters or within the database. The hashed lid avoids lookups when writing to an existing timeseries.

Maximum time ranges may be enforced by the metrics view to limit the amount of raw data that can be fetched at once, but deduplication and aggregation typically need more context to determine the correct operators, and must happen later. The maximum range allowed by metrics is used for the compression policy as well, compacting data when it can no longer be queried directly. Finally, the view makes this schema compatible with the original pg_prometheus schema and the v0.1 schema (which featured a single metrics table with both value and labels).

Views

Views within the schema are split into three groups: aggregated samples, materialized catalogs of the schema, and compatibility with the pg_prometheus schema.

Aggregate views are prefixed with agg_ and use TimescaleDB's continuous aggregates to occasionally refresh a materialized view and aggregate samples into larger time buckets.

Catalog views are prefixed with cat_ and materialize expensive views of the metric labels, enriching them with collected metadata.

Compatibility views ensure the schema is fully compatible with the pg_prometheus extensions' schema, despite slightly different underlying storage.