OpenTelemetry Tracing Request Manager #271

Closed
hannahhoward opened this issue Nov 16, 2021 · 1 comment · Fixed by #283

@hannahhoward
Collaborator

hannahhoward commented Nov 16, 2021

What

We want to start adding basic tracing into graphsync. The request manager is the simpler place to start because each request begins with a context. Our main challenge is setting up tracing across goroutines.

How open telemetry works

OpenTelemetry is a tracing API that can be backed by one or more actual libraries that export data to a tracing service. Fortunately, in Graphsync, we are NOT concerned with configuring data export -- that is left to the consumer of go-graphsync. However, if someone sets up a program with OpenTelemetry tracing configured, we would like graphsync to provide useful trace information within the context of whatever tracing the calling program has set up.

You can find a basic OpenTelemetry example here

We'll tackle the RequestManager first.

Requirements

First, we need to create a tracer at the top level in the main implementation at start time and pass it down to the request manager.

The call we use is documented here -- we can simply name our instrumentation "graphsync".

Now, we want to properly instrument a request.

@hannahhoward hannahhoward added the need/triage Needs initial labeling and prioritization label Nov 16, 2021
@hannahhoward
Collaborator Author

Note: this is a basic setup just to ensure we have two traces, one inside and one outside the goroutine. We will likely add more tracing to the request manager later.

@rvagg rvagg added status/in-progress In progress and removed need/triage Needs initial labeling and prioritization labels Nov 23, 2021
rvagg added a commit that referenced this issue Nov 25, 2021
rvagg added a commit that referenced this issue Nov 26, 2021
rvagg added a commit that referenced this issue Nov 26, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
rvagg added a commit that referenced this issue Nov 30, 2021
@rvagg rvagg closed this as completed in #283 Dec 1, 2021
rvagg added a commit that referenced this issue Dec 1, 2021
hannahhoward pushed a commit that referenced this issue Dec 9, 2021
feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests (#284)

* feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests

* fixup! feat: add WorkerTaskQueue#WaitForNoActiveTasks() for tests

fix(responsemanager): fix flaky tests

fix(responsemanager): make fix more global

feat: add basic OT tracing for incoming requests

Closes: #271

docs(tests): document tracing test helper utilities

fix(test): increase 1s timeouts to 2s for slow CI (#289)

* fix(test): increase 1s timeouts to 2s for slow CI

* fixup! fix(test): increase 1s timeouts to 2s for slow CI

testutil/chaintypes: simplify maintenance of codegen (#294)

"go generate" now updates the generated code for us.

The separate directory for a main package was unnecessary;
a build-tag-ignored file is enough.

Using gofmt on the resulting source is now unnecessary too,
as upstream has been using go/format on its output for some time.

Finally, re-generate the output source code,
as the last time that was done we were on an older ipld-prime.

ipldutil: use chooser APIs from dagpb and basicnode (#292)

Saves us a bit of extra code, since they were added in summer.
Also avoid making defaultVisitor a variable,
which makes it clearer that it's never a nil func.

While here, replace node/basic with node/basicnode,
as the former has been deprecated in favor of the latter.

Co-authored-by: Hannah Howard <[email protected]>

fix: use sync.Cond to handle no-task blocking wait (#299)

Ref: #284

Peer Stats function (#298)

* feat(graphsync): add impl method for peer stats

add method that gets current request states by request ID for a given peer

* fix(requestmanager): fix tested method

Add a bit of logging (#301)

* chore(responsemanager): add a bit of logging

* fix(responsemanager): remove code change

chore: short-circuit unnecessary message processing

Expose task queue diagnostics (#302)

* feat(impl): expose task queue diagnostics

* refactor(peerstate): put peerstate in its own module

* refactor(peerstate): make diagnostics return array