feat: Prometheus Block Fullness Metrics #3025
Conversation
src/monitoring/prometheus.rs
Outdated
@@ -91,6 +91,31 @@ lazy_static! {
        "Total number of error logs emitted by node"
    )).unwrap();

    pub static ref LAST_EXECUTION_READ_COUNT: IntGauge = register_int_gauge!(opts!(
        "execution_cost_read_count",
I think these could be slightly more descriptively named for prometheus as last_block_read_count, etc.
In the variable name or the caption part?
In the prometheus metric name, i.e.
pub static ref LAST_EXECUTION_READ_COUNT: IntGauge = register_int_gauge!(opts!(
"last_block_read_count",
Can we also prepend stacks_node_ to each metric name like we do for the other stacks-node metrics? It helps with search and discovery, especially when your Prometheus deployment could have hundreds of metrics from all kinds of services.
Went with a stacks_node_last_block_read_count-style prefix.
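The naming convention agreed on above can be sketched as a small helper. This is only an illustration, not code from the PR: the helper names are hypothetical, and the five dimension names are assumed to match the fields of ExecutionCost.

```rust
/// Prefix shared by the node's metrics, as requested in the review.
const METRIC_PREFIX: &str = "stacks_node_";

/// The five cost dimensions. These names are an assumption for this sketch.
const DIMENSIONS: [&str; 5] = [
    "read_count",
    "write_count",
    "read_length",
    "write_length",
    "runtime",
];

/// Builds a metric name such as `stacks_node_last_block_read_count`.
/// (Hypothetical helper, not part of the PR.)
fn last_block_metric_name(dimension: &str) -> String {
    format!("{}last_block_{}", METRIC_PREFIX, dimension)
}

/// Returns one prefixed metric name per cost dimension.
fn all_last_block_metric_names() -> Vec<String> {
    DIMENSIONS
        .iter()
        .map(|d| last_block_metric_name(d))
        .collect()
}
```

A shared prefix means a query for stacks_node_ in the Prometheus UI lists every metric from the node in one place.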
Codecov Report
@@ Coverage Diff @@
## develop #3025 +/- ##
===========================================
+ Coverage 82.61% 82.64% +0.02%
===========================================
Files 242 242
Lines 195667 195691 +24
===========================================
+ Hits 161642 161720 +78
+ Misses 34025 33971 -54
Continue to review full report at Codecov.
This looks like it is failing to build with the prometheus features enabled.
You can test the build locally by enabling the monitoring_prom feature specifically, or by running:
$ cargo check --all-features
LGTM! Thanks @gregorycoppola!
    execution_cost: &ExecutionCost,
    block_limit: &ExecutionCost,
) {
    #[cfg(feature = "monitoring_prom")]
Is there a way to just group all of these statements inside a single #[cfg(feature = "monitoring_prom")]? Something like:
#[cfg(feature = "monitoring_prom")] {
/* do stuff here */
}
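A minimal sketch of that grouping follows, assuming plain atomics as stand-ins for the Prometheus gauges so that it compiles without external crates. The function and static names are hypothetical.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Stand-ins for the real gauges (an assumption of this sketch).
static LAST_BLOCK_READ_COUNT: AtomicI64 = AtomicI64::new(0);
static LAST_BLOCK_WRITE_COUNT: AtomicI64 = AtomicI64::new(0);

/// Records per-block values, but only when the `monitoring_prom`
/// feature is enabled at compile time.
pub fn update_last_block_gauges(read_count: i64, write_count: i64) {
    // A single attribute on the block covers every statement inside it,
    // so each update does not need its own #[cfg(...)] line.
    #[cfg(feature = "monitoring_prom")]
    {
        LAST_BLOCK_READ_COUNT.store(read_count, Ordering::Relaxed);
        LAST_BLOCK_WRITE_COUNT.store(write_count, Ordering::Relaxed);
    }

    // Without the feature the function is a no-op; this line only
    // silences unused-variable warnings.
    #[cfg(not(feature = "monitoring_prom"))]
    let _ = (read_count, write_count);
}
```

When the feature is off, the whole block is removed before type checking, so nothing inside it has to resolve in a build without Prometheus support.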
did this
LGTM. Just a small style nit.
src/chainstate/stacks/db/blocks.rs
Outdated
@@ -5002,6 +5003,10 @@ impl StacksChainState {
            None,
        )?;

        let block_limit = clarity_tx
            .block_limit()
            .unwrap_or(ExecutionCost::max_value());
Not sure if setting block_limit to ExecutionCost::max_value() in the case of an error makes sense...
Could we at least add a warn if there is an error here?
hmm... what would it mean if block_limit() failed?
It would mean that a transaction was currently being evaluated. I don't think it's possible in this code path, but it'd be good to warn --
-    .unwrap_or(ExecutionCost::max_value());
+    .unwrap_or_else(|| {
+        warn!("Failed to read transaction block limit");
+        ExecutionCost::max_value()
+    });
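The suggested fallback can be tried in isolation with the sketch below. It rests on assumptions: ExecutionCost here is a stand-in struct whose field names are assumed, the helper name is hypothetical, and eprintln! replaces the project's warn! macro so that the example needs only the standard library.

```rust
/// Stand-in for the real ExecutionCost type (field names are assumed).
#[derive(Debug, Clone, PartialEq)]
struct ExecutionCost {
    write_length: u64,
    write_count: u64,
    read_length: u64,
    read_count: u64,
    runtime: u64,
}

impl ExecutionCost {
    /// Largest representable cost in every dimension.
    fn max_value() -> Self {
        ExecutionCost {
            write_length: u64::MAX,
            write_count: u64::MAX,
            read_length: u64::MAX,
            read_count: u64::MAX,
            runtime: u64::MAX,
        }
    }
}

/// Returns the given limit, or warns and falls back to the maximum.
/// (Hypothetical helper; the PR applies this inline in append_block.)
fn block_limit_or_max(limit: Option<ExecutionCost>) -> ExecutionCost {
    limit.unwrap_or_else(|| {
        // The closure runs only on the None path, so the warning is
        // emitted only when the limit is actually missing.
        eprintln!("WARN: Failed to read transaction block limit");
        ExecutionCost::max_value()
    })
}
```

Unlike unwrap_or, which evaluates its argument eagerly, unwrap_or_else defers the fallback to the failure path, which is what makes it possible to log there.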
Thanks for the reviews, everyone.
Motivation
We want to log a bunch of new metrics (#3024). This one adds "block fullness", which we have been talking about lately.
This supports ingestion of this data into prometheus without needing to launch a custom server.
Change
This PR logs the 5-dimensional ExecutionCost for the just-seen block in the append_block function run by the followers and miners.
Testing
No testing yet! Since this is for diagnostics, we can hook it up to the prometheus monitoring system and see if it looks right.
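To make "block fullness" concrete, here is a rough sketch of how each of the five dimensions could be compared against the block limit. The thread does not say whether the PR exports raw values or ratios, so the percentage calculation is an assumption, as are the field names of the stand-in struct and the helper names.

```rust
/// Stand-in for the real ExecutionCost type (field names are assumed).
#[derive(Debug, Clone, PartialEq)]
struct ExecutionCost {
    write_length: u64,
    write_count: u64,
    read_length: u64,
    read_count: u64,
    runtime: u64,
}

/// Integer percentage of `limit` consumed by `used`.
/// A zero limit is reported as 0 to avoid dividing by zero.
fn percent_of_limit(used: u64, limit: u64) -> u64 {
    if limit == 0 {
        return 0;
    }
    // Widen to u128 so that `used * 100` cannot overflow.
    ((used as u128 * 100) / limit as u128) as u64
}

/// Fullness per dimension, in the order: write_length, write_count,
/// read_length, read_count, runtime. (Hypothetical helper.)
fn block_fullness_percent(cost: &ExecutionCost, limit: &ExecutionCost) -> [u64; 5] {
    [
        percent_of_limit(cost.write_length, limit.write_length),
        percent_of_limit(cost.write_count, limit.write_count),
        percent_of_limit(cost.read_length, limit.read_length),
        percent_of_limit(cost.read_count, limit.read_count),
        percent_of_limit(cost.runtime, limit.runtime),
    ]
}
```

If the limit falls back to the maximum value in every dimension, each percentage rounds down to 0, which is one reason a warning on that fallback path is useful when reading the dashboards.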