metrics: switch to using prometheus library #8741

Merged on Nov 16, 2023

@taratorio taratorio commented Nov 16, 2023

Background

Erigon currently uses a combination of Victoria Metrics and Prometheus client for providing metrics.

We want to rationalize this and use only the Prometheus client library, but we want to maintain the simplified Victoria Metrics methods for constructing metrics.

This task is currently partly complete and needs to be finished to a stage where we can remove the Victoria Metrics module from the Erigon code base.

Tests

Functional

  • Make sure that the int->float format change implied by moving from VM to Prometheus does not impact clients (pay particular attention to block numbers)
  • Check that the prometheus/grafana dashboards defined in cmd/prometheus are functional after the change
    (see docker-compose.yml for details and https://github.com/ledgerwatch/erigon/tree/devel/cmd/prometheus#readme)
  • Confirm that the underlying go metrics are still generated
  • Confirm that the following flag settings work with the new code:
    --metrics, --metrics.addr, --metrics.port
  • Confirm that the --metrics and --pprof settings and handler configuration still allow metrics and pprof to share a port
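
The last item, metrics and pprof sharing a port, can be illustrated with a minimal sketch that uses only the Go standard library. The `/debug/metrics` path, the stub metrics handler, and the helper names (`newDebugMux`, `stubMetrics`, `checkSharedPort`) are assumptions made for illustration and are not Erigon's actual wiring; a real setup would presumably pass the Prometheus client library's `promhttp.Handler()` in place of the stub.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux registers a metrics handler and the pprof handlers on one mux,
// so a single listener can serve both. The "/debug/metrics" path is an
// assumption for this sketch.
func newDebugMux(metrics http.Handler) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/debug/metrics", metrics)
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// stubMetrics is a hypothetical stand-in for the real metrics handler.
func stubMetrics() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintln(w, "# stub metrics output")
	})
}

// statusOf performs a GET and returns the HTTP status code, or -1 on error.
func statusOf(url string) int {
	resp, err := http.Get(url)
	if err != nil {
		return -1
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	return resp.StatusCode
}

// checkSharedPort starts one test server and queries both endpoints on it.
func checkSharedPort() (metricsStatus, pprofStatus int) {
	srv := httptest.NewServer(newDebugMux(stubMetrics()))
	defer srv.Close()
	return statusOf(srv.URL + "/debug/metrics"), statusOf(srv.URL + "/debug/pprof/cmdline")
}

func main() {
	m, p := checkSharedPort()
	fmt.Println("metrics status:", m, "pprof status:", p)
}
```

Registering the pprof handlers explicitly on a dedicated mux, rather than relying on `http.DefaultServeMux`, keeps it clear which listener exposes which endpoints.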

Float counters - scientific notation test case

[Screenshots: 2023-11-07 15:57:21 and 2023-11-15 16:26:56]

Float counters - NaN test case

[Screenshots: 2023-11-07 16:04:25 and 2023-11-15 16:28:36]
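
Both float test cases come down to how a float64 sample is rendered as text and read back by a client. The following is a minimal sketch using only the Go standard library. It assumes the shortest `'g'` float format, which is taken here to be representative of the Prometheus text output; `formatSample` and `parseBlockNumber` are hypothetical helpers, not functions from Erigon or the Prometheus client.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// formatSample renders a float64 sample using Go's shortest 'g' format.
// Assumption: this approximates how a float sample appears in scraped output.
func formatSample(v float64) string {
	switch {
	case math.IsNaN(v):
		return "NaN"
	case math.IsInf(v, +1):
		return "+Inf"
	case math.IsInf(v, -1):
		return "-Inf"
	}
	return strconv.FormatFloat(v, 'g', -1, 64)
}

// parseBlockNumber recovers an integer block number from sample text.
// It rejects NaN, infinities, negatives, fractions, and values above 2^53,
// beyond which float64 can no longer represent every integer exactly.
func parseBlockNumber(s string) (uint64, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	if math.IsNaN(f) || math.IsInf(f, 0) || f < 0 || f != math.Trunc(f) || f > 1<<53 {
		return 0, fmt.Errorf("not an exact non-negative integer: %q", s)
	}
	return uint64(f), nil
}

func main() {
	for _, v := range []float64{12345, 18500000, 18500001, math.NaN()} {
		text := formatSample(v)
		n, err := parseBlockNumber(text)
		fmt.Printf("text=%s parsed=%d err=%v\n", text, n, err)
	}
}
```

A client that parses the value as a float and then converts it keeps the exact block number, while a client that expects a plain integer string would fail on the scientific-notation form.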

Performance

  • Check that the performance of the counters and measurements created for RPC calls by rpc/metrics.go is not impacted by the change.

RPC

Tested rpcdaemon and erigon on localhost using eth_blockNumber.
Ran the tests with 100, 1000, and 10000 requests. Response time held steady at 15 ms.
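
A measurement of this kind can be sketched as a loop that times sequential JSON-RPC requests and averages them. This is only an illustration under stated assumptions: the stub server below stands in for rpcdaemon, the result value is arbitrary, the helper names are hypothetical, and the timings it prints are not comparable to the 15 ms reported above.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// blockNumberReq is a standard JSON-RPC 2.0 request for eth_blockNumber.
const blockNumberReq = `{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}`

// meanLatency sends n sequential requests to url and returns the mean
// round-trip time, including reading the full response body.
func meanLatency(url string, n int) (time.Duration, error) {
	if n <= 0 {
		return 0, errors.New("n must be positive")
	}
	var total time.Duration
	for i := 0; i < n; i++ {
		start := time.Now()
		resp, err := http.Post(url, "application/json", bytes.NewBufferString(blockNumberReq))
		if err != nil {
			return 0, err
		}
		_, err = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if err != nil {
			return 0, err
		}
		if resp.StatusCode != http.StatusOK {
			return 0, fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
		total += time.Since(start)
	}
	return total / time.Duration(n), nil
}

// measureAgainstStub runs meanLatency against a local stub that returns a
// fixed, arbitrary block number. The stub is an assumption for this sketch.
func measureAgainstStub(n int) (time.Duration, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.Copy(io.Discard, r.Body)
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"jsonrpc":"2.0","id":1,"result":"0x11a49a0"}`)
	}))
	defer srv.Close()
	return meanLatency(srv.URL, n)
}

func main() {
	for _, n := range []int{100, 1000, 10000} {
		d, err := measureAgainstStub(n)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%d requests: mean %v\n", n, d)
	}
}
```

Pointing `meanLatency` at a real endpoint before and after the change would give a like-for-like comparison, provided both runs use the same machine and request count.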

Memory

[Screenshot: 2023-11-16 09:58:39]

@taratorio taratorio merged commit 27d8865 into devel Nov 16, 2023
7 checks passed
@taratorio taratorio deleted the use-prom-metrics branch November 16, 2023 16:30
bgelb pushed a commit to bgelb/erigon that referenced this pull request Nov 17, 2023
bgelb pushed a commit to bgelb/erigon that referenced this pull request Nov 18, 2023