Collect & report metrics #183

ch1bo · 2022-01-30T17:22:46Z

What & Why

To measure the success (or failure) of the Hydra Head project and improve continuously, we need to know how many Hydra Heads are opened, how long they are used, how many UTXOs are moved into / out of a Head, etc. Most of this information is publicly available and can be derived by observing the main chain. The remainder (e.g. transaction sizes & number of UTXOs in a Head) will be collected from within the hydra-node and will be opt-out once we reach mainnet maturity.

TBD

Detail what we want to collect
Scope out reporting infrastructure
Is this also useful for implementing watch-tower functionality, and thus for making custodial Hydra Heads more "trustworthy" when they provide this telemetry to their users (or watchtowers)?

Tasks

Chain observer tracking Head transactions and aggregating Head information
Explorer service using Head information
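
As a rough illustration of the first task, a chain observer could fold the Head transactions it sees into per-Head summaries that an explorer service then reads. The sketch below is only an assumption about the shape of such an aggregation: the `Observation` tuple layout, the `HeadInfo` fields and the `aggregate` function are hypothetical names, and only the transaction kinds (Init, Commit, CollectCom, Close, Fanout) follow the Head lifecycle.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical observation shape: (head_id, tx_kind, slot, utxo_count).
# Only the tx_kind names follow the Head lifecycle; the layout is an assumption.
Observation = Tuple[str, str, int, int]


@dataclass
class HeadInfo:
    initialized_at: Optional[int] = None
    opened_at: Optional[int] = None
    closed_at: Optional[int] = None
    utxo_committed: int = 0
    utxo_fanned_out: int = 0

    def open_duration(self) -> Optional[int]:
        """Slots between opening and closing, if both were observed."""
        if self.opened_at is None or self.closed_at is None:
            return None
        return self.closed_at - self.opened_at


def aggregate(observations: Iterable[Observation]) -> Dict[str, HeadInfo]:
    """Fold observed Head transactions into one summary per Head."""
    heads: Dict[str, HeadInfo] = {}
    for head_id, kind, slot, utxo_count in observations:
        info = heads.setdefault(head_id, HeadInfo())
        if kind == "Init":
            info.initialized_at = slot
        elif kind == "Commit":
            info.utxo_committed += utxo_count
        elif kind == "CollectCom":
            info.opened_at = slot
        elif kind == "Close":
            info.closed_at = slot
        elif kind == "Fanout":
            info.utxo_fanned_out += utxo_count
        # Other kinds (e.g. Contest, Abort) are ignored in this sketch.
    return heads
```

Because the fold only depends on the stream of observations, such an observer could stay stateless apart from the summaries it publishes.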


ghost · 2022-02-25T11:49:02Z

With a stateless "chain observer" available, we could host a simple "Hydra Head Explorer" service online that would show and track the state of heads running on some chain?

ghost · 2022-04-07T12:36:31Z

Couple of basic ideas:

What are interesting metrics to collect off-chain?
We already publish Prometheus metrics inside the hydra-node; we could simply add a sidecar that scrapes them and sends the data to a public Grafana Cloud instance
Other part could be handled by observing the chain
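
A minimal sketch of the scraping half of such a sidecar, assuming the node serves the standard Prometheus text exposition format over HTTP. The functions `parse_metrics` and `scrape` are hypothetical, the URL is whatever the node's metrics endpoint happens to be, and forwarding to a hosted Grafana instance is deliberately left out because it depends on that service's ingestion API.

```python
import urllib.request
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse un-labelled samples from Prometheus text exposition format."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) < 2 or "{" in parts[0]:
            continue  # labelled series are out of scope for this sketch
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # ignore lines whose value is not a number
    return samples


def scrape(url: str, timeout: float = 5.0) -> Dict[str, float]:
    """Fetch the metrics endpoint once and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

A sidecar would call `scrape` on a timer and push the resulting dictionary to the central host; histogram buckets and other labelled series would need a fuller parser.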

ghost · 2022-04-26T07:18:21Z

I have set up and used Jaeger and Zipkin in the past, including inside Haskell apps, and having a way to track the processing of user requests across a distributed system is invaluable for understanding its behaviour.

I am looking at https://github.com/ethercrow/opentelemetry-haskell which provides support for traces. Someone pointed me at https://opentelemetry.io/docs/concepts/data-collection/ which provides a conceptual framework for all kinds of "observability" data collection. In particular, OpenTelemetry (the successor to OpenTracing and OpenCensus) defines standards for interoperability between various kinds of services, making it possible, for example, to collect Prometheus metrics, logs and traces and export them to some other service.

We currently expose the following metrics in the node:

number of events
number of requested txs
number of confirmed txs
tx confirmation time histogram

Handling, and possibly tuning, snapshot size is important for the protocol, so we should add:

number of snapshots
number of txs per snapshot
snapshot confirmation time
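
To make the three snapshot metrics concrete, here is a small in-memory sketch of how they relate to each other. It is not how the hydra-node records metrics; `SnapshotMetrics` and its methods are hypothetical, and timestamps are passed in explicitly as seconds so the bookkeeping stays easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SnapshotMetrics:
    # snapshot number -> time at which the snapshot was requested
    pending: Dict[int, float] = field(default_factory=dict)
    tx_counts: List[int] = field(default_factory=list)
    confirmation_times: List[float] = field(default_factory=list)

    def requested(self, number: int, now: float) -> None:
        """Record that snapshot `number` was requested at time `now`."""
        self.pending[number] = now

    def confirmed(self, number: int, tx_count: int, now: float) -> None:
        """Record a confirmed snapshot and its confirmation latency."""
        started = self.pending.pop(number, None)
        if started is None:
            return  # confirmation without a recorded request: ignored here
        self.tx_counts.append(tx_count)
        self.confirmation_times.append(now - started)

    @property
    def snapshots(self) -> int:
        """Number of confirmed snapshots."""
        return len(self.tx_counts)

    def mean_tx_per_snapshot(self) -> float:
        """Average number of transactions per confirmed snapshot."""
        if not self.tx_counts:
            return 0.0
        return sum(self.tx_counts) / len(self.tx_counts)
```

In a real exporter the two lists would more likely be histograms, matching the existing tx confirmation time histogram.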

Also:

event queue length, to track possible congestion/loopholes
system-level resources (CPU, RAM, network traffic)
number of UTxOs in the internal ledger

Traces could be an interesting addition to analyse the trace generated by a NewTx coming from a client and how it spreads across the network until the transaction becomes confirmed. This would be helpful in particular to understand the behaviour of the network if/when we move away from a fully connected network to something more dynamic or less densely connected, with routing between the nodes. Not sure if it's worthwhile to do it now though.
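
The core idea behind such traces is that every hop reuses the trace identifier created when the client request first arrives, and records which span caused it. The sketch below shows only that propagation rule. It does not use the OpenTelemetry API; `Span`, `Tracer` and the node names are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one client request
    span_id: str              # unique per unit of work
    parent_id: Optional[str]  # span that caused this one, None for the root
    node: str
    name: str


@dataclass
class Tracer:
    spans: List[Span] = field(default_factory=list)

    def start(self, node: str, name: str, parent: Optional[Span] = None) -> Span:
        """Start a span, inheriting the trace id from `parent` if given."""
        span = Span(
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            node=node,
            name=name,
        )
        self.spans.append(span)
        return span
```

For example, a root span started for a NewTx on one node would be passed as `parent` when a peer handles the resulting network message, so both spans end up in the same trace.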

Tasks for this feature:

set up a central collection host/system with authenticated access
deploy an OpenTelemetry sidecar instead of a Prometheus server within the hydra stack
configure OpenTelemetry to send metrics to the central host (with a certificate)
opting out simply means not deploying the sidecar

ch1bo added the 💬 feature A feature on our roadmap label Jan 30, 2022

ch1bo added this to the Testnet maturity milestone Jan 30, 2022

ch1bo added the green 💚 Low complexity or well understood feature label Feb 3, 2022

ch1bo removed this from the Testnet maturity milestone Mar 8, 2022

ghost mentioned this issue Mar 23, 2022

Add analytics to hydra.family website #285

Closed

ch1bo added this to the 0.5.0 milestone Apr 19, 2022

ch1bo removed this from the 0.5.0 milestone Apr 26, 2022

ch1bo added the help wanted Issues where we could need some help label May 3, 2022

ch1bo added 💭 idea An idea or feature request and removed 💬 feature A feature on our roadmap help wanted Issues where we could need some help green 💚 Low complexity or well understood feature labels Mar 21, 2023

ch1bo mentioned this issue Jun 14, 2024

Report Hydra head transaction history when closing head. #1467

Open
