Telemetry: introduce backlog_* metrics, plus minor fixes
#2451
Comments
About 1., if a packet times out then it stays recorded in You can provoke a timeout with two chains running and this command:
After more investigation, I think that Any other case will lead to inconsistencies in the metrics. For example, if an error occurs and the packet is not relayed (account sequence mismatch), if the ack is never relayed, or if a packet times out, the value is never removed from
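The inconsistency described in this comment can be illustrated with a minimal sketch. This is not Hermes code: the `replay` helper, the event names, and the idea of keying the backlog by packet sequence number are all assumptions made for illustration.

```python
# Hypothetical model of a per-channel backlog; not the Hermes implementation.

def replay(events, clear_on):
    """Replay (kind, sequence) events and return the sequences still pending.

    `clear_on` is the set of event kinds that remove a sequence from the backlog.
    """
    pending = set()
    for kind, seq in events:
        if kind == "send":
            pending.add(seq)
        elif kind in clear_on:
            pending.discard(seq)
    return sorted(pending)


events = [
    ("send", 1), ("send", 2), ("send", 3),
    ("ack", 1),      # packet 1 is relayed and acknowledged
    ("timeout", 2),  # packet 2 times out
    # packet 3 is never relayed (e.g. account sequence mismatch): no further event
]

only_acks = replay(events, clear_on={"ack"})
acks_and_timeouts = replay(events, clear_on={"ack", "timeout"})
print(only_acks, acks_and_timeouts)
```

In this model, clearing entries only on acknowledgement leaves both the timed-out packet and the unrelayed packet in the backlog. Clearing on timeout as well still leaves the packet that never produced any follow-up event.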
This is a follow-up to the work for #2408 and PR #2409.
Suggestions
We discussed this with operators yesterday. Here are some takeaways, also based on my observations so far:
1. Even after a channel is cleared, oldest_* metrics can remain at the same value (i.e., not reset to 0).
2. I think we should clarify what oldest_timestamp is. This field seems to be a timestamp local to the Hermes process, not an on-chain packet timestamp (when the packet was created), which is not clear from the telemetry help message, specifically:

Let's rename the oldest_* metrics to backlog_* and additionally introduce a backlog_size metric to capture the number of pending packets in a channel.
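The suggestions above can be sketched as a small per-channel tracker. This is only an illustration under stated assumptions: the `Backlog` class and its methods are hypothetical, and the metric names `backlog_oldest_sequence` and `backlog_oldest_timestamp` are assumed names that follow the proposed rename. Only `backlog_size` is named in this issue.

```python
import time


class Backlog:
    """Hypothetical per-channel backlog keyed by packet sequence number."""

    def __init__(self):
        # sequence -> local (relayer process) time in seconds when first observed;
        # this is not the on-chain time at which the packet was created
        self._pending = {}

    def insert(self, seq, now=None):
        # Keep the first observation time if the sequence is already tracked.
        self._pending.setdefault(seq, time.time() if now is None else now)

    def remove(self, seq):
        self._pending.pop(seq, None)

    def metrics(self):
        if not self._pending:
            # An empty backlog resets every value to 0.
            return {
                "backlog_oldest_sequence": 0,
                "backlog_oldest_timestamp": 0,
                "backlog_size": 0,
            }
        oldest = min(self._pending)
        return {
            "backlog_oldest_sequence": oldest,
            "backlog_oldest_timestamp": self._pending[oldest],
            "backlog_size": len(self._pending),
        }


backlog = Backlog()
backlog.insert(5, now=100.0)
backlog.insert(7, now=105.0)
before = backlog.metrics()

backlog.remove(5)
backlog.remove(7)
after = backlog.metrics()
print(before, after)
```

Deriving all three values from one map means they cannot drift apart, and a cleared channel reports 0 for each of them rather than keeping a stale value.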