Metrics for 0.5 release #7059

Closed · 2 of 7 tasks
momack2 opened this issue Mar 30, 2020 · 7 comments


momack2 commented Mar 30, 2020

We need metrics for the 0.5 release to give current and potential IPFS users an estimate of the performance gains they should see from the various improvements included in this release. These metrics should be as realistic and repeatable as possible, since we are continuing to merge additional patches and improvements over time. This is particularly hard to do because combining features (DHT changes + hydra + "% of network upgraded") produces very different metrics. Therefore, I suggest we mostly narrow in on "fully upgraded" network scenarios to limit the complexity of our testing.

I'd like us to define repeatable tests to validate/benchmark the major performance improvements we see in this release:

  • Adding: Based on the data in these tests, we're seeing about a 2x improvement with badger (which is overall 10-25x faster than using flatfs)
  • Fetching: For the benchmarks in the bitswap blog post, we cut transfer time in half (2x faster)
  • Providing: Early metrics show this will be anywhere from 15% faster to many times faster, but estimates vary and I don't think we have a repeatable testground test benchmarking this yet. (E.g., a 1k-node network with benchmarks for time to first provide, time to provide completion, success ratio, etc. for both our new DHT code (Cypress) and the previous DHT code (Balsa); see the timing sketch after this list.)
  • Finding: Time to find relevant content is most influenced by differences in network formation and the combination of different features (e.g. hydra-boosters and % of network upgraded). While metrics during the transition period will be hard to predict, we should be able to use testground to benchmark the expected performance of a 1K-node network once the majority of the network has upgraded and compare that to Balsa. This won't be fully representative (simulations vs. the live network), but it will be more representative of the upgraded state than testing against a non-upgraded network.
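
Until a dedicated testground plan exists, a small standalone harness can at least make single-node provide/lookup timings repeatable. Below is a minimal Go sketch, not the testground benchmark itself; the import paths, the no-argument `libp2p.New` signature, and the surrounding network setup are assumptions based on current go-libp2p, and swapping in a Balsa vs. Cypress build of go-libp2p-kad-dht gives the two comparison points:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/ipfs/go-cid"
	"github.com/libp2p/go-libp2p"
	dht "github.com/libp2p/go-libp2p-kad-dht"
	"github.com/multiformats/go-multihash"
)

func main() {
	ctx := context.Background()

	// Spin up a bare libp2p host (recent go-libp2p; older releases took a ctx).
	h, err := libp2p.New()
	if err != nil {
		panic(err)
	}

	// Pin the DHT dependency to a Balsa or Cypress build to compare the two.
	kad, err := dht.New(ctx, h)
	if err != nil {
		panic(err)
	}
	// Bootstrap refreshes the routing table; connecting the host to live
	// network peers first is elided here, and the numbers are meaningless
	// without it.
	if err := kad.Bootstrap(ctx); err != nil {
		panic(err)
	}

	// A throwaway CID to provide; any unique bytes will do.
	sum, _ := multihash.Sum([]byte("metrics-for-0.5"), multihash.SHA2_256, -1)
	c := cid.NewCidV1(cid.Raw, sum)

	// "Time to provide completion": one full Provide broadcast.
	start := time.Now()
	if err := kad.Provide(ctx, c, true); err != nil {
		panic(err)
	}
	fmt.Println("time to provide completion:", time.Since(start))

	// "Time to first provider": stop at the first provider record found.
	start = time.Now()
	for range kad.FindProvidersAsync(ctx, c, 1) {
		break
	}
	fmt.Println("time to first provider:", time.Since(start))
}
```

Success ratio and 1k-node behavior would still need testground's coordination; this only yields single-node wall-clock numbers.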

Other useful tests:

  • Time to IPNS resolve (see the CLI timing sketch after this list)
  • Time to IPNS announce
  • Default bandwidth used as a DHT server in Balsa vs Cypress
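
For the IPNS items above, a rough but repeatable approach is to time the CLI against a locally running daemon. A hedged sketch: `ipfs name publish` and `ipfs name resolve` are the real commands, but `<some-cid>` and `<peer-id>` are placeholders to fill in, and the timings include CLI/API overhead on top of the DHT work:

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// timeCmd runs an ipfs subcommand against the local daemon and returns
// how long it took.
func timeCmd(args ...string) (time.Duration, error) {
	start := time.Now()
	out, err := exec.Command("ipfs", args...).CombinedOutput()
	if err != nil {
		return 0, fmt.Errorf("ipfs %v: %v: %s", args, err, out)
	}
	return time.Since(start), nil
}

func main() {
	// "Time to IPNS announce": publish a record pointing at a pinned CID.
	announce, err := timeCmd("name", "publish", "/ipfs/<some-cid>")
	if err != nil {
		panic(err)
	}
	fmt.Println("IPNS announce:", announce)

	// "Time to IPNS resolve": resolve a record published by another node.
	resolve, err := timeCmd("name", "resolve", "/ipns/<peer-id>")
	if err != nil {
		panic(err)
	}
	fmt.Println("IPNS resolve:", resolve)
}
```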

@alanshaw @aschmahmann @Stebalien @petar - are there other major aspects we need to benchmark or include in these tests?

@aschmahmann

> the combination of different features (e.g. hydra-boosters and % of network upgraded).

@momack2 how important is getting metrics/benchmarks that assume the existence of hydra nodes? We can currently set up tests that are mixed-network, Balsa-only, and Cypress-only; however, we have yet to do one that includes any hydra-like behavior.

> include in these tests

It might be useful to compare IPNS over PubSub resolution and update speeds to standard IPNS. However, given that it has already existed in prior releases (albeit less robustly and not independently of the DHT), and assuming it will still be behind an experimental flag (my understanding is that it will), we may not need to collect these metrics now.

Do we need to include metrics for FindPeer DHT requests? While it's not really "core IPFS" functionality (my understanding only, with no bearing on project goals and direction), people do use ipfs swarm connect to try to find other machines on the network. For example, I use this as a Dynamic DNS replacement.
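
For reference, the FindPeer path can be timed directly with a sketch like the one below. It is a rough illustration, assuming current go-libp2p import paths, and `<peer-id>` is a placeholder for the peer being located:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/libp2p/go-libp2p"
	dht "github.com/libp2p/go-libp2p-kad-dht"
	"github.com/libp2p/go-libp2p/core/peer"
)

func main() {
	ctx := context.Background()

	h, err := libp2p.New()
	if err != nil {
		panic(err)
	}
	kad, err := dht.New(ctx, h)
	if err != nil {
		panic(err)
	}
	// Routing table refresh; connecting to live peers first is elided here.
	if err := kad.Bootstrap(ctx); err != nil {
		panic(err)
	}

	// Placeholder: the peer to locate, e.g. the machine behind the
	// "Dynamic DNS replacement" use case above.
	target, err := peer.Decode("<peer-id>")
	if err != nil {
		panic(err)
	}

	start := time.Now()
	info, err := kad.FindPeer(ctx, target)
	if err != nil {
		panic(err)
	}
	fmt.Printf("found %s in %s (%d addrs)\n", info.ID, time.Since(start), len(info.Addrs))
}
```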

@Stebalien

> Early metrics show this will be anywhere from 15% faster to many times faster, but estimates vary and I don't think we have a repeatable testground test benchmarking this yet.

All of our early metrics here are incorrect. They're helping us improve our query logic and our metrics, but I don't expect the current numbers to look anything like the final ones.

> how important is getting metrics/benchmarks that assume the existence of hydra nodes? We can currently set up tests that are mixed-network, Balsa-only, and Cypress-only; however, we have yet to do one that includes any hydra-like behavior.

I believe the comments on hydra are simply calling out the fact that any metrics we get here won't fully reflect reality, but we should do our best.

The core ask is:

> While metrics during the transition period will be hard to predict, we should be able to use testground to benchmark the expected performance of a 1K-node network once the majority of the network has upgraded and compare that to Balsa.

The fact that this test won't include hydra is just a caveat.


momack2 commented Apr 1, 2020

@Stebalien is correct: the core ask is repeatable metrics that don't include Hydra. Separately, Hydra should validate/prove through testing that these boosters improve network performance.

@aschmahmann - I think those metrics would be really nice (so we have a datapoint on the performance gains of the new design), but they're lower priority than testing/releasing the DHT fixes. In other words, let's either defer them until after the RC is cut or delegate them to another owner?

If we expect swarm connect to be measurably different due to our DHT work and to affect a common user flow, we should have a test/benchmark that exhibits the known change (a rough timing sketch follows).
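
Such a benchmark could be as simple as timing `ipfs swarm connect` against a peer ID with no known addresses, which forces a DHT FindPeer lookup first. A minimal sketch; the `/p2p/<peer-id>` target is a placeholder, and the measurement includes CLI and daemon overhead on top of the lookup itself:

```go
package main

import (
	"fmt"
	"os/exec"
	"time"
)

func main() {
	// Placeholder target; an address with no transport part forces a DHT lookup.
	addr := "/p2p/<peer-id>"

	start := time.Now()
	out, err := exec.Command("ipfs", "swarm", "connect", addr).CombinedOutput()
	if err != nil {
		fmt.Printf("connect failed after %s: %v: %s\n", time.Since(start), err, out)
		return
	}
	fmt.Println("swarm connect:", time.Since(start))
}
```

Run it repeatedly, against both a v0.4.23 and a v0.5.0 daemon, to get a distribution rather than a single sample.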


momack2 commented Apr 15, 2020

Hey folks - do we have an update on these metrics? @aschmahmann - is the testground graph you made still accurate, or do we expect those estimates to change significantly?

Would love to have two different graphs for public consumption (e.g. using labels like "0.5" instead of Cypress) that show benchmarks for each of finding and providing for nodes that upgrade (talking to other upgraded nodes vs old nodes), compared with current benchmarks of 0.4.23 and earlier.


aschmahmann commented Apr 17, 2020

@momack2 I thought for simplicity we were only putting together graphs for testground tests of a fully v0.4.23 network and of a fully v0.5.0 network, but not for a mixed network.

We have a testground test ready to go for a mixed network, but @jacobheun thought the nuances and assumptions in those tests might be a bit more complicated to explain. The TL;DR is that performance gets better for everyone once we switch the network over to the new protocol version, but go-ipfs v0.5.0 nodes see a larger performance benefit. Do we want graphs for a mixed network?

@Stebalien

Let's stick with a simple "old DHT" versus "new DHT" graph. Depending on the network load and hydra performance, network performance for "new" nodes may increase or may decrease as the rest of the network upgrades.

Case 1: Increase: Hydra is not powerful enough to easily handle requests on the network. As more nodes join the network, they share the load.

Case 2: Decrease: Hydra is powerful enough to handle all requests on the network. As more nodes join the network, they'll likely perform worse than the hydra nodes, slowing down requests.


momack2 commented Apr 18, 2020

My bad - yes, it's fine to just look at a totally upgraded network (I was staring at your previous graph, which also had metrics for Cypress searching Balsa, when I asked).
