admin: Replace 'metrics' with 'prometheus-client' (#211)
* admin: Replace 'metrics' with 'prometheus-client'

This change replaces the `metrics` feature with a `prometheus-client`
feature. prometheus-client is the official Prometheus-org
OpenMetrics implementation.

The watch-pods example has been updated to export Prometheus metrics:

    :; curl localhost:8080/metrics
    # HELP watch_pods_events Number of events observed.
    # TYPE watch_pods_events counter
    watch_pods_events_total{op="apply"} 0
    watch_pods_events_total{op="delete"} 0
    watch_pods_events_total{op="restart"} 1
    # HELP watch_pods_current_pods Number of Pods being observed.
    # TYPE watch_pods_current_pods gauge
    watch_pods_current_pods 10
    # HELP watch_pods_pods Total number of unique pods observed.
    # TYPE watch_pods_pods counter
    watch_pods_pods_total 10
    # HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch).
    # UNIT process_start_time_seconds seconds
    process_start_time_seconds 1702159727.213669
    # HELP process_uptime_seconds Total time since the process started (in seconds)
    # TYPE process_uptime_seconds counter
    # UNIT process_uptime_seconds seconds
    process_uptime_seconds_total 2.712104395
    # HELP process_cpu_seconds Total user and system CPU time spent in seconds
    # TYPE process_cpu_seconds counter
    # UNIT process_cpu_seconds seconds
    process_cpu_seconds_total 0.73
    # HELP process_virtual_memory_bytes Virtual memory size in bytes
    # TYPE process_virtual_memory_bytes gauge
    # UNIT process_virtual_memory_bytes bytes
    process_virtual_memory_bytes 1208897536
    # HELP process_resident_memory_bytes Resident memory size in bytes
    # TYPE process_resident_memory_bytes gauge
    # UNIT process_resident_memory_bytes bytes
    process_resident_memory_bytes 21114880
    # HELP process_open_fds Number of open file descriptors
    # TYPE process_open_fds gauge
    process_open_fds 16
    # HELP process_max_fds Maximum number of open file descriptors
    # TYPE process_max_fds gauge
    process_max_fds 1048576
    # HELP process_threads Number of OS threads in the process.
    # TYPE process_threads gauge
    process_threads 18
    # EOF
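
The counter samples above follow the OpenMetrics text layout: a `# HELP` and a `# TYPE` line per metric family, then one `_total` sample per label value. The following is a minimal std-only sketch of that layout. It is not how prometheus-client encodes metrics internally; `render_counter` and its parameters are hypothetical names used only for illustration, and label values are not escaped.

```rust
/// Renders one counter family in the text layout shown in the curl output.
/// `render_counter` is a hypothetical helper, not a prometheus-client API.
/// Assumption: label values contain no characters that need escaping.
fn render_counter(name: &str, help: &str, label: &str, samples: &[(&str, u64)]) -> String {
    let mut out = String::new();
    out.push_str(&format!("# HELP {name} {help}\n"));
    out.push_str(&format!("# TYPE {name} counter\n"));
    for (value, count) in samples {
        // Counter samples carry a `_total` suffix on the family name.
        out.push_str(&format!("{name}_total{{{label}=\"{value}\"}} {count}\n"));
    }
    out
}
```

Calling it with the family name `watch_pods_events`, the label `op`, and the three `(value, count)` pairs from the output reproduces the first five lines of the example.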

Furthermore, `admin::Builder` APIs are updated to consume the builder
and return it by-value (instead of handling references, which does not
work well when chaining configuration). This is based on the experience
of integrating prometheus into the example.
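
To illustrate why by-value builder methods chain more easily, here is a small std-only sketch. The `Builder` type, its fields, and `with_metrics_path` are hypothetical stand-ins, not kubert's actual `admin::Builder` API.

```rust
use std::net::SocketAddr;

/// Hypothetical builder illustrating the by-value style.
/// This is not kubert's `admin::Builder`.
#[derive(Debug)]
struct Builder {
    addr: SocketAddr,
    metrics_path: Option<String>,
}

impl Builder {
    fn new(addr: SocketAddr) -> Self {
        Self { addr, metrics_path: None }
    }

    /// Consumes the builder and returns it by value, so calls can be chained.
    fn with_metrics_path(mut self, path: impl Into<String>) -> Self {
        self.metrics_path = Some(path.into());
        self
    }

    /// Consumes the builder. A real server would bind to `addr` here;
    /// this sketch only returns the collected configuration.
    fn build(self) -> (SocketAddr, Option<String>) {
        (self.addr, self.metrics_path)
    }
}
```

With this shape, `Builder::new(addr).with_metrics_path("/metrics").build()` is a single expression. If the setter instead took `&mut self` and returned `&mut Self`, a consuming `build(self)` could not be called at the end of the chain, and the builder would have to be bound to a variable first.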

Additionally, the admin server now uses `spawn_blocking` for non-probe
handlers.

To support the process metrics shown above, a new
kubert-prometheus-process crate has been added, including only the
procfs bindings needed for those metrics. This crate is completely
decoupled from kubert. It is based on the process metrics
implementation in the linkerd2-proxy repo.
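
As a rough illustration of what a procfs binding for `process_cpu_seconds_total` involves, the sketch below parses a Linux `/proc/<pid>/stat` line. It is an assumption-laden example, not the crate's implementation: `cpu_seconds` is a hypothetical helper, and the clock tick rate is passed in by the caller (it is what `sysconf(_SC_CLK_TCK)` reports, commonly 100).

```rust
/// Computes total CPU seconds (user + system) from a `/proc/<pid>/stat` line.
/// Hypothetical helper for illustration; not part of kubert-prometheus-process.
fn cpu_seconds(stat: &str, ticks_per_sec: u64) -> Option<f64> {
    // The command name (field 2) is parenthesized and may contain spaces,
    // so resume parsing after the last ')'.
    let start = stat.rfind(')')? + 1;
    let mut fields = stat[start..].split_whitespace();
    // Counting from the field after the command name (state, field 3),
    // `utime` (field 14) is at offset 11 and `stime` (field 15) follows it.
    let utime: u64 = fields.nth(11)?.parse().ok()?;
    let stime: u64 = fields.next()?.parse().ok()?;
    Some((utime + stime) as f64 / ticks_per_sec as f64)
}
```

In practice the input would come from `std::fs::read_to_string("/proc/self/stat")`; taking the line as a string keeps the parsing logic independent of the platform.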

* fixup ci

* Improve docs

* fixup! Improve docs
olix0r authored Dec 9, 2023
1 parent 14a20aa commit a487ccd
Showing 16 changed files with 657 additions and 159 deletions.
10 changes: 10 additions & 0 deletions .github/workflows/client.yml
@@ -19,6 +19,16 @@ env:
   KUBERT_TEST_NS: kubert-test

 jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
   local:
     strategy:
       matrix:
14 changes: 12 additions & 2 deletions .github/workflows/features.yml
@@ -17,6 +17,16 @@ env:
   RUSTUP_MAX_RETRIES: 10

 jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
   all-check:
     strategy:
       matrix:
@@ -40,7 +50,7 @@ jobs:
       matrix:
         feature:
           - admin
-          - admin,metrics
+          - admin,prometheus-client
           - client
           - "client rustls-tls"
           - "client openssl-tls"
@@ -50,7 +60,7 @@ jobs:
           - initialized
           - lease
           - log
-          - metrics
+          - prometheus-client
           - requeue
           - runtime
           - server
10 changes: 10 additions & 0 deletions .github/workflows/lint.yml
@@ -16,6 +16,16 @@ env:
   RUSTUP_MAX_RETRIES: 10

 jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
   fmt:
     timeout-minutes: 5
     runs-on: ubuntu-latest
83 changes: 83 additions & 0 deletions .github/workflows/release-prometheus-process.yml
@@ -0,0 +1,83 @@
+name: Release kubernetes-prometheus-process
+
+on:
+  pull_request:
+    paths:
+      - .github/workflows/release-prometheus-process.yml
+  push:
+    tags:
+      - 'kubert-prometheus-process/*'
+
+env:
+  CARGO_INCREMENTAL: 0
+  CARGO_NET_RETRY: 10
+  RUSTUP_MAX_RETRIES: 10
+
+permissions:
+  contents: read
+
+jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
+  meta:
+    timeout-minutes: 5
+    runs-on: ubuntu-latest
+    container: ghcr.io/linkerd/dev:v42-rust
+    steps:
+      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
+      - id: meta
+        shell: bash
+        run: |
+          ref="${{ github.ref }}"
+          if [[ "$ref" == refs/tags/kubert-prometheus-process/* ]]; then
+            version="${ref##refs/tags/kubert-prometheus-process/}"
+            crate=$(just-cargo crate-version kubert-prometheus-process)
+            if [[ "$crate" != "$version" ]]; then
+              echo "::error ::Crate version $crate does not match tag $version" >&2
+              exit 1
+            fi
+            ( echo version="$version"
+              echo mode=release
+            ) >> "$GITHUB_OUTPUT"
+          else
+            sha="${{ github.sha }}"
+            ( echo version="$(just-cargo crate-version kubert-prometheus-process)-git-${sha:0:7}"
+              echo mode=test
+            ) >> "$GITHUB_OUTPUT"
+          fi
+    outputs:
+      mode: ${{ steps.meta.outputs.mode }}
+      version: ${{ steps.meta.outputs.version }}
+
+  release:
+    needs: [meta]
+    permissions:
+      contents: write
+    timeout-minutes: 5
+    runs-on: ubuntu-latest
+    steps:
+      - if: needs.meta.outputs.mode == 'release'
+        uses: softprops/action-gh-release@de2c0eb89ae2a093876385947365aca7b0e5f844
+        with:
+          name: ${{ needs.meta.outputs.version }}
+          generate_release_notes: true
+
+  crate:
+    # Only publish the crate after the rest of the release succeeds.
+    needs: [meta, release]
+    timeout-minutes: 10
+    runs-on: ubuntu-latest
+    container: ghcr.io/linkerd/dev:v42-rust
+    steps:
+      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11
+      - run: cargo publish --package=kubert-prometheus-process --dry-run
+      - if: needs.meta.outputs.mode == 'release'
+        run: cargo publish --package=kubert-prometheus-process --token=${{ secrets.CRATESIO_TOKEN }}
13 changes: 11 additions & 2 deletions .github/workflows/release.yml
@@ -1,4 +1,4 @@
-name: Release
+name: Release kubert

 on:
   pull_request:
@@ -17,6 +17,16 @@ permissions:
   contents: read

 jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
   meta:
     timeout-minutes: 5
     runs-on: ubuntu-latest
@@ -47,7 +57,6 @@ jobs:
       mode: ${{ steps.meta.outputs.mode }}
       version: ${{ steps.meta.outputs.version }}

-  # Publish a GitHub release with platform-specific static binaries.
   release:
     needs: [meta]
     permissions:
10 changes: 10 additions & 0 deletions .github/workflows/test.yml
@@ -17,6 +17,16 @@ env:
   RUSTUP_MAX_RETRIES: 10

 jobs:
+  cleanup:
+    runs-on: ubuntu-latest
+    permissions:
+      actions: write
+    steps:
+      - uses: styfle/cancel-workflow-action@01ce38bf961b4e243a6342cbade0dbc8ba3f0432
+        with:
+          all_but_latest: true
+          access_token: ${{ github.token }}
+
   test:
     timeout-minutes: 10
     runs-on: ubuntu-latest
3 changes: 2 additions & 1 deletion Cargo.toml
@@ -1,7 +1,8 @@
 [workspace]
 resolver = "2"
-default-members = ["kubert"]
+default-members = ["kubert", "kubert-prometheus-process"]
 members = [
     "kubert",
+    "kubert-prometheus-process",
     "examples",
 ]
11 changes: 9 additions & 2 deletions README.md
@@ -17,7 +17,7 @@ Rust Kubernetes runtime helpers. Based on [`kube-rs`][krs].

 * [`clap`](https://docs.rs/clap) command-line interface support;
 * A basic admin server with `/ready` and `/live` probe endpoints;
-* Optional Prometheus [`metrics`][mt] integration;
+* Optional [`prometheus-client`][pc] integration;
 * A default Kubernetes client;
 * Graceful shutdown on `SIGTERM` or `SIGINT` signals;
 * An HTTPS server (for admission controllers and API extensions) with
@@ -46,11 +46,18 @@ Other examples include:

 * [Linkerd2 policy controller](https://github.com/linkerd/linkerd2/blob/d4543cd86e427b241ce961b50dd83b1738c0b069/policy-controller/src/main.rs)

+## kubert-prometheus-process
+
+The `kubert-prometheus-process` crate provides [process metrics][pm] for
+prometheus-client. It has no dependencies on kubert, and can be used
+independently.
+
 ## Status

 This crate is still fairly experimental, though it's based on production code
 from Linkerd; and we plan to use it in Linkerd moving forward.

 [krs]: https://docs.rs/kube
-[mt]: https://docs.rs/metrics
+[pc]: https://docs.rs/prometheus-client
+[pm]: https://prometheus.io/docs/instrumenting/writing_clientlibs/#process-metrics
 [rt]: https://docs.rs/kubert/latest/kubert/runtime/struct.Runtime.html
17 changes: 6 additions & 11 deletions deny.toml
@@ -32,7 +32,9 @@ exceptions = [
         "Apache-2.0",
         "Unicode-DFS-2016",
     ], name = "unicode-ident" },
-    { allow = ["Zlib"], name = "adler32" },
+    { allow = [
+        "Zlib",
+    ], name = "adler32" },
 ]

 [[licenses.clarify]]
@@ -52,7 +54,7 @@ skip = [
     { name = "base64" },
     # the current versions of `hyper` and `tokio` depend on semver-incompatible
     # versions of `socket2` (0.4 and 0.5, respectively).
-    { name = "socket2" }
+    { name = "socket2" },
 ]
 skip-tree = [
     { name = "windows-sys" },
@@ -61,15 +63,8 @@ skip-tree = [
     { name = "windows_i686_msvc" },
     { name = "windows_x86_64_gnu" },
     { name = "windows_x86_64_msvc" },
-    # `metrics` and `kube-runtime` have conflicting versions of `ahash` and
-    # `wasi`.
-    # TODO(eliza): remove this skip when the conflicts are resolved.
-    { name = "metrics" },
-    # `metrics-process` has transitive deps on `hermit-abi` and `bitflags` that
-    # are incompatible with the transitive deps other crates have on those
-    # libraries.
-    # TODO(eliza): remove this skip when the conflicts are resolved.
-    { name = "metrics-process" },
+    # tracing-subscriber needs an older regex-automata.
+    { name = "regex-automata" },
     # `serde_json` and `serde_yaml` depend on incompatible versions of indexmap
     { name = "indexmap" },
     # the proc-macro ecosystem is still in the process of migrating from `syn`
3 changes: 2 additions & 1 deletion examples/Cargo.toml
@@ -17,7 +17,7 @@ openssl-tls = ["kubert/openssl-tls", "openssl"]
 [dependencies.kubert]
 path = "../kubert"
 default-features = false
-features = ["clap", "lease", "runtime"]
+features = ["clap", "lease", "prometheus-client", "runtime"]

 [dependencies.openssl]
 version = "0.10.57"
@@ -29,6 +29,7 @@ anyhow = "1"
 chrono = { version = "0.4", default-features = false }
 futures = { version = "0.3", default-features = false }
 maplit = "1"
+prometheus-client = "0.22"
 rand = "0.8"
 regex = "1"
 thiserror = "1"