Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more metrics #2310

Merged
merged 62 commits into from
Oct 14, 2024
Merged

Add more metrics #2310

merged 62 commits into from
Oct 14, 2024

Conversation

rafal-ch
Copy link
Contributor

@rafal-ch rafal-ch commented Oct 7, 2024

Closes #807

Description

This PR adds a couple of additional metrics (block importer and p2p) and also contains slight refactor of how we initialize bucket sized for histogram based metrics.

The metrics command-line parameter has been replaced with disable-metrics. Metrics are now enabled by default, with the option to disable them entirely or on a per-module basis. This change is breaking for all dependencies that use CLI to setup the fuel-core-client

      --disable-metrics <METRICS>
          Disables all metrics, or specify a comma-separated list of modules to disable metrics for specific ones. Available options: importer, p2p, producer, txpool, graphql
          
          [env: METRICS=]
          [default: ]

Startup logs also show the metrics config:

2024-10-14T20:17:37.536840Z  INFO fuel_core_bin::cli::run: 308: Metrics config: Disable modules: txpool

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests

Before requesting review

  • I have reviewed the code myself

@rafal-ch rafal-ch changed the title Add gas_per_block metric Add more metrics Oct 7, 2024
Comment on lines 234 to 261
let mut swarm = if metrics {
// we use the global registry to store the metrics without needing to create a new one
// since libp2p already creates sub-registries
let mut registry = global_registry().registry.lock();

swarm_builder
.with_bandwidth_metrics(&mut registry)
.with_behaviour(|_| behaviour)?
.with_swarm_config(|cfg| {
if let Some(timeout) = config.connection_idle_timeout {
cfg.with_idle_connection_timeout(timeout)
} else {
cfg
}
})
.build()
} else {
swarm_builder
.with_behaviour(|_| behaviour)?
.with_swarm_config(|cfg| {
if let Some(timeout) = config.connection_idle_timeout {
cfg.with_idle_connection_timeout(timeout)
} else {
cfg
}
})
.build()
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libp2p generics prevent me from doing something like this -

let swarm_builder = initial_stuff;
let swarm builder = if metrics {
  let mut registry = ..;
  swarm_builder.with_bandwidth_metrics(&mut registry)
} else {
  swarm_builder
}
.. rest of the stuff with_behavior etc

Copy link
Contributor

@acerone85 acerone85 Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I have tried the same and eventually gave up. There is a function with_no_bandwidth_metrics() that could help here, but it is implemented for a different phase that can be reached only via private functions :(

@rafal-ch rafal-ch closed this Oct 8, 2024
@rafal-ch rafal-ch reopened this Oct 8, 2024
AurelienFT
AurelienFT previously approved these changes Oct 14, 2024
Copy link
Contributor

@AurelienFT AurelienFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM very little nit

crates/metrics/src/config.rs Outdated Show resolved Hide resolved
rymnc
rymnc previously approved these changes Oct 14, 2024
Cargo.lock Show resolved Hide resolved
acerone85
acerone85 previously approved these changes Oct 14, 2024
@xgreenx xgreenx dismissed stale reviews from acerone85 and rymnc via 8c0cccc October 14, 2024 21:15
xgreenx
xgreenx previously approved these changes Oct 14, 2024
rymnc
rymnc previously approved these changes Oct 14, 2024
@xgreenx xgreenx dismissed stale reviews from rymnc and themself via 5aea57a October 14, 2024 22:07
@xgreenx xgreenx enabled auto-merge (squash) October 14, 2024 22:09
@xgreenx xgreenx merged commit ada0e9e into master Oct 14, 2024
35 of 36 checks passed
@xgreenx xgreenx deleted the 807_add_more_metrics branch October 14, 2024 22:24
@xgreenx xgreenx mentioned this pull request Oct 14, 2024
xgreenx added a commit that referenced this pull request Oct 14, 2024
## Version v0.40.0

### Added
- [2347](#2347): Add GraphQL
complexity histogram to metrics.
- [2350](#2350): Added a new
CLI flag `graphql-number-of-threads` to limit the number of threads used
by the GraphQL service. The default value is `2`, `0` enables the old
behavior.
- [2335](#2335): Added CLI
arguments for configuring GraphQL query costs.

### Fixed
- [2345](#2345): In PoA
increase priority of block creation timer trigger compare to txpool
event management

### Changed
- [2334](#2334): Prepare the
GraphQL service for the switching to `async` methods.
- [2310](#2310): New metrics:
"The gas prices used in a block" (`importer_gas_price_for_block`), "The
total gas used in a block" (`importer_gas_per_block`), "The total fee
(gwei) paid by transactions in a block" (`importer_fee_per_block_gwei`),
"The total number of transactions in a block"
(`importer_transactions_per_block`), P2P metrics for swarm and protocol.
- [2340](#2340): Avoid long
heavy tasks in the GraphQL service by splitting work into batches.
- [2341](#2341): Updated all
pagination queries to work with the async stream instead of the sync
iterator.
- [2350](#2350): Limited the
number of threads used by the GraphQL service.

#### Breaking
- [2310](#2310): The `metrics`
command-line parameter has been replaced with `disable-metrics`. Metrics
are now enabled by default, with the option to disable them entirely or
on a per-module basis.
- [2341](#2341): The maximum
number of processed coins from the `coins_to_spend` query is limited to
`max_inputs`.

## What's Changed
* fix(gas_price_service): service name and unused trait impl by @rymnc
in #2317
* Do not require build of docker images to pass CI by @xgreenx in
#2342
* Prepare the GraphQL service for the switching to `async` methods by
@xgreenx in #2334
* Limited the number of threads used by the GraphQL service by @xgreenx
in #2350
* Increase priority of timer over txpool event by @xgreenx in
#2345
* Disable flaky `test_poa_multiple_producers` test by @rafal-ch in
#2353
* feat: CLI arguments for configuring GraphQL query costs. by @netrome
in #2335
* Add graphql query complexity histogram metric by @AurelienFT in
#2349
* Updated all pagination queries to work with the `Stream` instead of
`Iterator` by @xgreenx in
#2341
* Avoid long heavy tasks in the GraphQL service by @xgreenx in
#2340
* Add more metrics by @rafal-ch in
#2310


**Full Changelog**:
v0.39.0...v0.40.0

---------

Co-authored-by: Rafał Chabowski <[email protected]>
Co-authored-by: acerone85 <[email protected]>
Co-authored-by: rymnc <[email protected]>
Co-authored-by: Rafał Chabowski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking A breaking api change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants