Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: instrument with tracing #15

Merged
merged 3 commits into from
Sep 21, 2020
Merged

feat: instrument with tracing #15

merged 3 commits into from
Sep 21, 2020

Conversation

iand
Copy link
Contributor

@iand iand commented Sep 18, 2020

Adds tracing to primary indexer and processor functions, all lotus api calls and database persistence operations.

@iand iand marked this pull request as ready for review September 18, 2020 13:33
@iand iand requested a review from frrist September 18, 2020 13:33
Copy link
Contributor

@placer14 placer14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit. Looks good otherwise!

}
}
return nil
return bss.PersistWithTx(ctx, tx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup!


volumes:
timescaledb:
grafana-data:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to include the grafana volume here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Fixed

Copy link
Member

@frrist frrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Good stuff. We are in urgent need of telemetry so I'd like to have this in asap since we can starting gathering metrics with it.

I'd also like to have some more discussion around the comments I've left. If the result of that discussion results in things here needing to change I am happy to take that as a follow on.

@@ -23,3 +30,87 @@ type APIWrapper struct {
func (aw *APIWrapper) Store() adt.Store {
return aw.store
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple thoughts on the change: this is definitely an improvement, but I have some concerns --- some of which you already mentioned:

  • Are we going to grow this as the api consumed by Visor grows? Or is there some alternative?
  • Since we are wrapping the API would it be better to have it as a field instead of embedding?
  • Would it be more appropriate to add API tracing to lotus instead? (as a follow on later)
  • Should we include the arguments to these calls as tags on the Span?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to grow this if we want complete tracing coverage (we may not) or we should attempt to get it into lotus instead. I'm not convinced we want every method covered, especially since we are concerned about performance.

Wrapping the api, making it unexported would help enforce some type safety.

We could include the arguments as tags. I don't know useful it would be in general so I avoided paying the marshalling cost of things like cids in this pass.

@@ -6,6 +6,33 @@ A component of [**Sentinel**](https://github.com/filecoin-project/sentinel), a c
A **Visor** process collects _permanent_ Filecoin chain meterics from a [**Lotus**](https://github.com/filecoin-project/lotus/) daemon, and writes them to a [**TimescaleDB**](https://github.com/timescale/timescaledb) time-series and relational datastore.


## Getting Started
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ 🙏

trace "go.opentelemetry.io/otel/api/trace"
"github.com/opentracing/opentracing-go"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ORM we are using supports tracing out-of-the-box: https://pg.uptrace.dev/tracing/. Are there any advantages to using opencensus over opentelemetry?

@lanzafame
Copy link
Contributor

Could we make use of opencensus tracing instead of opentracing? We make use of opencensus for both metrics and tracing in lotus and it would be ideal use just the one tool.

@iand
Copy link
Contributor Author

iand commented Sep 19, 2020

Could we make use of opencensus tracing instead of opentracing? We make use of opencensus for both metrics and tracing in lotus and it would be ideal use just the one tool.

Yep, no problem. I'll rework

@iand
Copy link
Contributor Author

iand commented Sep 19, 2020

OpenTracing and OpenCensus merged to start OpenTelemetry which is what the pg package supports, so I'll rework for that. There are software bridges to OpenCensus if needed.

@iand
Copy link
Contributor Author

iand commented Sep 19, 2020

PTAL

Switched to using opentelemetry. It has a more awkward API (3 imports needed to add one tag to a span) but it operates very closely to the opentracing code so it was mostly mechanical changes.

Bonus screenshot of a trace showing go-pg traces as well as visor ones

Screenshot from 2020-09-19 13-36-36

@iand iand merged commit 10148b0 into master Sep 21, 2020
@iand iand deleted the iand/tracing branch September 21, 2020 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants