
Internal Graphsync Benchmarking Plan #131

Closed
9 tasks done
hannahhoward opened this issue Dec 11, 2020 · 1 comment
Labels
need/triage Needs initial labeling and prioritization

Comments

hannahhoward commented Dec 11, 2020

This issue documents our initial goals for profiling go-graphsync performance with Testground (https://github.com/testground/testground).

Goals

Assess graphsync performance in a variety of scenarios ON ITS OWN. We want to identify performance issues in the graphsync library itself -- memory leaks, throughput bottlenecks, etc. -- when it's transferring various types of IPLD data under various network conditions.

In scope

We are assessing graphsync's performance with a minimal number of external dependencies: a libp2p stack, and a data store.

Out of scope

We are not attempting to test graphsync's performance alongside other components in the IPFS or Filecoin stack (e.g. Bitswap, the DHT, unusual blockstore setups, etc.).

Prior art

Tasks

  • Move/copy the Filecoin testground test plan to this repo
  • Add parameterization around UnixFS file creation -- see what we used in `func allFilesUniformSize(size uint64, unixfsChunkSize uint64, unixfsLinksPerLevel int, useRawNodes bool) distFunc` -- specifically, include raw vs. non-raw nodes and a configurable chunk size
  • Add an option for a disk-based datastore (memory performance tracking is meaningless otherwise). We currently utilize temp folders via `func newTempDirMaker(b *testing.B) (*tempDirMaker, error)` plus badger with `defopts := badgerds.DefaultOptions`. In a dockerized environment we can probably just write somewhere on the running disk and not worry about temp files, since the volumes get reset on shutdown
  • We can use testground heap profiling, but we may want to trigger heap profiles at specific points. Not sure if testground has a facility for this? We can certainly dump a profile with Go (https://golang.org/pkg/runtime/pprof/#WriteHeapProfile), but I'm not sure whether testground has a facility for moving it out of the docker container and onto the main disk. cc: @nonsense
  • We need to heap profile on the responder side as well as the requestor. This may entail keeping the responder alive -- my read of the current code for Filecoin's test plan is that the routine on the responder side ends early
  • We also probably want to try this dump with and without an explicit GC beforehand (i.e. `runtime.GC()`)
  • For accurate memory statistics, we should modify response consumption on the requestor to consume the response channel as well as the error channel (https://github.com/filecoin-project/lotus/blob/master/testplans/graphsync/main.go#L208; see https://github.com/ipfs/go-graphsync/blob/master/benchmarks/benchmark_test.go#L171 for an example of correctly consuming the response channel)
  • We may want to add testing of disconnects of various kinds
  • We need to integrate with testground as a service in our CI

This issue is an epic tracker. We can submit this over several PRs.

@hannahhoward hannahhoward added the need/triage Needs initial labeling and prioritization label Dec 11, 2020

welcome bot commented Dec 11, 2020

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

marten-seemann pushed a commit that referenced this issue Mar 2, 2023
* feat: add logging to push channel monitor

* feat: add log line to push channel monitor