holepunching: expose metrics #2103

marten-seemann · 2023-02-15T06:02:36Z

We should:

count hole punching attempts and outcomes (by transport)
differentiate between connection reversal and actual hole punches

@dennis-tra Are there any more Prometheus metrics you think would be helpful to have on a dashboard?

dennis-tra · 2023-02-15T08:22:41Z

Just brainstorming, not saying we should track all of the following:

track hole punch outcomes (this includes connection reversal). In our measurement campaign we tracked the following outcomes:
- SUCCESS - any attempt succeeded
- FAILED - all attempts failed
- CONNECTION_REVERSED - dcutr stream wasn't opened but we still ended up with a direct connection
- NO_STREAM - remote peer supported dcutr but didn't open a stream.
I can imagine that the last two are more difficult to track than the first two.
Each outcome could be enriched with the following information:
- number of required attempts
- used transport
- used IP version
- remote peers agent version - to avoid cardinality explosion we could limit it kubo/non-kubo + kubo minor version. This could help in spotting regressions. On the other hand, since libp2p is a generic library I would rather not couple it to Kubo - perhaps make it configurable.
- Information about the local network
  - has active port mapping (yes/no)
  - reachability status (unknown, public, private)
  - is in a VPN (not sure how to track that)
- somehow also keep track of the measured RTTs

I think the following would make a great start:

hole punch outcome SUCCESS, FAILED, CONNECTION_REVERSED - NO_STREAM would only make sense if we also tracked the remote peers agent version
used transport
used IP version (because our measurement suggested that IPv6 doesn't work well)
number of required attempts

sukunrt · 2023-03-20T05:42:42Z

would only make sense if we also tracked the remote peers agent version

@dennis-tra do you think this info is not useful without the peers agent version?

dennis-tra · 2023-03-20T11:11:04Z

@sukunrt I think what I wanted to say is that we wouldn't be able to act on e.g., a degraded success rate because we didn't correlate it with something else. "Something else" could be the agent version (as I mentioned above), but it could also be another property like hasActivePortMapping or isCGNAT'd, etc.

But I would step back from my above statement because it would still be useful to know if the distribution of hole punch outcomes changes over time.

sukunrt · 2023-03-22T06:46:46Z

tracking agent version is a difficult since it's a user controlled field, so an attacker can generate lots of agent versions which would lead to creation of a lot of prometheus time series on our end.
@marten-seemann has some ideas around a useful construct where we allow applications to specify a regex. However we can take that up after we finish the initial set of metrics.

I propose the following set of metrics in the first cut:

Holepunch Attempt by (ipversion, transport)
- We will only count a (ipv, transport) combination if both the sides provide an address with that combination.
Holepunch Outcome by (ipversion, transport)
- Same condition as holepunch attempt for considering an (ipv, transport) combination
Holepunch Outcome by (number of attemps)
- only measured on the dcutr initiator side
Holepunch Errors by reason
- reasons for not trying holepunching. malformed message, no public address etc.
ConnectionReversed:
- Tracks whether attempt to direct dial before holepunching succeeds.

marten-seemann mentioned this issue Feb 15, 2023

Exposing Metrics #1356

Open

16 tasks

p-shahi assigned sukunrt Mar 17, 2023

sukunrt mentioned this issue Mar 27, 2023

holepunch: add metrics #2212

Closed

sukunrt mentioned this issue Apr 6, 2023

holepunch: add metrics #2246

Merged

p-shahi mentioned this issue May 1, 2023

identify metrics: track agent and protocol version #2182

Closed

marten-seemann closed this as completed in #2246 Jun 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

holepunching: expose metrics #2103

holepunching: expose metrics #2103

marten-seemann commented Feb 15, 2023

dennis-tra commented Feb 15, 2023

sukunrt commented Mar 20, 2023 •

edited

Loading

dennis-tra commented Mar 20, 2023

sukunrt commented Mar 22, 2023 •

edited

Loading

holepunching: expose metrics #2103

holepunching: expose metrics #2103

Comments

marten-seemann commented Feb 15, 2023

dennis-tra commented Feb 15, 2023

sukunrt commented Mar 20, 2023 • edited Loading

dennis-tra commented Mar 20, 2023

sukunrt commented Mar 22, 2023 • edited Loading

sukunrt commented Mar 20, 2023 •

edited

Loading

sukunrt commented Mar 22, 2023 •

edited

Loading