New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Clarify package demarcation between sdk/trace and sdk/export/trace #1380

Closed

3 tasks done

johananl opened this issue Nov 30, 2020 · 11 comments · Fixed by #1873

Closed

3 tasks done

Clarify package demarcation between sdk/trace and sdk/export/trace #1380

johananl opened this issue Nov 30, 2020 · 11 comments · Fixed by #1873

Assignees

Labels

area:trace pkg:SDK

Milestone

Contributor

johananl commented Nov 30, 2020 •

edited

Loading

General

In the discussion here we consider doing a few refactoring tasks around sdk/trace and sdk/export/trace (for brevity I'll refer to them as trace and export, respectively):

Move Event from export to trace.
Move SpanSnapshot (currently named SpanData in master) from export to trace.
Modify the ExportSpans method of the SpanExporter interface to accept a ReadOnlySpan (introduced in Add RO/RW span interfaces #1360).

These suggestions make a lot of sense to me. However, while looking into them I realized we need to better clarify the demarcation and relationship of trace and export, and possibly clarify the purpose of export (not to be confused with exporters) better.

The problem

trace currently depends on export in multiple places:

opentelemetry-go/sdk/trace/batch_span_processor.go

Line 25 in 55ff277

export "go.opentelemetry.io/otel/sdk/export/trace"

opentelemetry-go/sdk/trace/provider.go

Line 25 in 55ff277

export "go.opentelemetry.io/otel/sdk/export/trace"

opentelemetry-go/sdk/trace/simple_span_processor.go

Line 21 in 55ff277

export "go.opentelemetry.io/otel/sdk/export/trace"

opentelemetry-go/sdk/trace/span.go

Line 29 in 55ff277

export "go.opentelemetry.io/otel/sdk/export/trace"

opentelemetry-go/sdk/trace/span_processor.go

Line 21 in 55ff277

export "go.opentelemetry.io/otel/sdk/export/trace"

We can't do the refactoring tasks above as doing so creates import cycles. After moving Event and SpanSnapshot to trace and changing ExportSpans to accept a ReadOnlySpan, we end up with the following import cycle:

export depends on trace because ExportSpans accepts a ReadOnlySpan.
trace depends on export in multiple places as shown above.

Possible solutions

AFAICT we can do one of the following to address the problem:

Option 1: Ensure export doesn't depend on trace.
Option 2: Ensure trace doesn't depend on export.

Looks like option 1 isn't feasible if we want to have ExportSpans accept a ReadOnlySpan, assuming we want to keep the SpanExporter interface under export. This seems to leave us with option 1.

Option 2 seems feasible. Roughly speaking, here is what we would have to do to implement this approach:

Move the SpanProcessor implementations under export, while keeping the interface under trace.
Remove the WithSyncer and WithBatcher convenience functions from trace since they accept an export.SpanExporter and we don't want trace to depend on export. Instead, callers use WithSpanProcessor directly.
Refactor all relevant tests to use the new package structure.

Things to consider

Having export depend on trace follows the dependency inversion principle. Following this principle, a "high level" package doesn't depend on a "low level" package and the wiring between the packages is done using interfaces rather than concrete types. I think we already follow this approach with regards to the API/SDK relationship: The SDK (low level) depends on the API (high level) but not the other way around. Following this, perhaps it makes sense to keep using this pattern inside the SDK: trace is arguably a higher level package than export, which means export depends on trace and the SpanProcessor interface is defined in trace and implemented in export and serves as the "contract" between the two packages.

Other libraries (Java, Python) seem to be following option 2 more or less: The equivalent of export depends on the equivalent of trace but not vice versa. In the Java library, export is even a child package of trace.

We might want to remain consistent with the metrics-related packages in the Go library and use a similar pattern/hierarchy for both metrics and tracing. Unfortunately, right now sdk/metric depends on sdk/export/metric 😢

Lastly, we might want to think about whether #536 is related to the problem discussed here.

TODO

Once we've settled on an approach, we should open follow up issues for the following:

Move Event from export to trace.
Move SpanSnapshot (currently named SpanData in master) from export to trace.
Modify the ExportSpans method of the SpanExporter interface to accept a ReadOnlySpan (introduced in Add RO/RW span interfaces #1360).

The text was updated successfully, but these errors were encountered:

Member

Aneurysm9 commented Nov 30, 2020

Option 2 seems feasible. Roughly speaking, here is what we would have to do to implement this approach:

Move the SpanProcessor implementations under export, while keeping the interface under trace.

Remove the WithSyncer and WithBatcher convenience functions from trace since they accept an export.SpanExporter and we don't want trace to depend on export. Instead, callers use WithSpanProcessor directly.

Refactor all relevant tests to use the new package structure.

This makes sense to me. I wondered for a bit whether the SpanProcessor implementations should live somewhere other than sdk/export/trace since processors are definitely useful for more than just exporting trace data, but these two implementations are trace export-specific.

Contributor

MrAlias commented Nov 30, 2020

I would propose we just get rid of the sdk/export/trace package and move the SpanExporter (and other types) into the sdk/trace package. This dependency cycle seems like a strong indicator of the fact that these packages share the same functional responsibilities. Additionally, the fact that the export package contains (currently) only three types and no function code implies that this package is superfluous.

The only benefit that I can see for having this in a separate package is that it makes it clear to the user who is going to build a SpanExporter where to find the interface. However, this does not seem like a good justification to keep this package, and it is something we can address with good documentation.

This approach would imply that we would also do this for the sdk/export/metric package as well. I think this would be appropriate for the same reason as the trace package, those packages share functional responsibilities.

This approach would also have the benefit of addressing #1014

@Aneurysm9 @jmacd and all other @open-telemetry/go-approvers I'm interested in your opinions on this.

johananl mentioned this issue

Add RO/RW span interfaces #1360

Merged

Member

krnowak commented Nov 30, 2020

I had the same thoughts about merging the packages, but I'm bit uneasy about doing the same for export/metric - it's considerably more code compared to just 3 types in in export/trace. As such, we need to put more thought into how the merging should be done. Likely blindly putting all the types into sdk/metric will result in an unwieldy package, that will require another splitting, just like with the toplevel package before.

Contributor Author

johananl commented Dec 1, 2020

Getting rid of export/trace is easy enough. I've pushed a WIP commit which does this refactoring here. This branch is build on top of #1360.

I wonder if this is the right approach though. To me it makes sense that the package structure for traces and metrics be similar.

Contributor

MrAlias commented Dec 1, 2020

As such, we need to put more thought into how the merging should be done.

Agreed. Looking closer at the export/metric package I think it suffers from the similar tight coupling of the trace package. These are the exported types from the package (not including the sub-package):

Accumulation
Aggregator
AggregatorSelector
CheckpointSet
Checkpointer
ExportKind
ExportKindSelector
Exporter
Metadata (part of the Accumulation and Record)
Processor
Record
Subtractor

The metric SDK itself depends on the:

Processor
Aggregator
Accumulation

The sdk/metric/aggregator package depends on the:

Aggregator

The sdk/metric/controller package depends on the:

Checkpointer
CheckpointSet
ExportKindSelector
Exporter
Record

The sdk/metric/processor package depends on the:

ExportKindSelector
AggregationSelector
Aggregator
Accumulation
Subtractor
Record
Processor
Checkpointer
CheckpointSet

This all points to the fact that the metrics export package is likely in just as much of a need to be integrated with the sdk metrics package as the trace package. I think it would be worth going into detail about what would be the best way to integrate the package in a separate issue. I can create one if there is agreement that this integration is a direction we want to go in.

Contributor

MrAlias commented Dec 1, 2020

Getting rid of export/trace is easy enough. I've pushed a WIP commit which does this refactoring here.

This looks like a good simplification to me. I wouldn't export the SpanSnapshot and instead return a ReadOnlySpan interface from the Snapshot method, but we can talk more about that if that evolves into a full PR.

Contributor Author

johananl commented Dec 2, 2020

I wouldn't export the SpanSnapshot and instead return a ReadOnlySpan interface from the Snapshot method

I've pushed another commit to my WIP branch in which I've started to work in this direction. This isn't done though as we've relied heavily on the old SpanSnapshot data structure in tests, all of which likely need to be modified now.

Here are the changes I've started implementing:

SpanSnapshot becomes spanSnapshot. This is now an unexported type which implements the ReadOnlySpan interface without locking, which means this type isn't thread safe but should have better read performance compared to span.
ReadOnlySpan is used throughout the entire span processing pipeline and also in exporter implementations. The point where we switch from a "live" span to a snapshot is inside span.End(): We call snapshot() which returns a ReadOnlySpan which has a spanSnapshot data structure underneath.

We should check if we still need to maintain "readability" before we snapshot a span. If we don't, maybe span doesn't even need to implement ReadOnlySpan.

Contributor

MrAlias commented Dec 2, 2020

SpanSnapshot becomes spanSnapshot. This is now an unexported type which implements the ReadOnlySpan interface without locking, which means this type isn't thread safe but should have better read performance compared to span.

ReadOnlySpan is used throughout the entire span processing pipeline and also in exporter implementations. The point where we switch from a "live" span to a snapshot is inside span.End(): We call snapshot() which returns a ReadOnlySpan which has a spanSnapshot data structure underneath.

👍

We should check if we still need to maintain "readability" before we snapshot a span. If we don't, maybe span doesn't even need to implement ReadOnlySpan.

I think we need to maintain this if it is going to be passed to SpanProcessors as a ReadWriteSpan (which embeds the ReadOnlySpan interface), right?

Contributor Author

johananl commented Dec 3, 2020 •

edited

Loading

Latest status on my WIP:

A lot of tests are currently failing and have to be refactored following the changes to spanSnapshot. The old SpanData type has been used extensively throughout the codebase for testing purposes. These usages need to be replaced with ReadOnlySpan or something else now.

One tricky bit to figure out is how to deal with auto-generated things like span IDs when running deep comparisons in tests: Until now we've used SpanData which was conveniently exported and allowed direct field access. Now that we moved to ReadOnlySpan, by definition we can no longer do things like "clearing" the span ID before comparing as was done here.

NOTE: My WIP branch is a follow up on #1360 and uses that PR as a base, not master.

Things left to do (possibly not exhaustive):

Scan the codebase and verify old references (in docs etc.) to SpanSnapshot or SpanData are replaced.
Ensure we have clear documentation for the SDK's "core" concepts, especially around the span and spanSnapshot types and the ReadOnlySpan interface. It should be clear for contributors what should and shouldn't be done with the various types and interfaces.
Add tests for ensuring correct behavior of ReadOnlySpan and spanSnapshot`.
Rebase the feature branch over master once Add RO/RW span interfaces #1360 is merged.

MrAlias added area:trace pkg:SDK priority:p1 labels

MrAlias added this to the RC1 milestone

MrAlias self-assigned this

MrAlias mentioned this issue

Move Span Event to API #1452

Merged

MrAlias mentioned this issue

Too many "trace" packages #1014

Closed

This was referenced Mar 26, 2021

Rename leftover references to SpanData #1736

Closed

Update SpanSnapshot to use parent SpanContext #1748

Merged

This was referenced Apr 6, 2021

Merge sdk/export/trace into sdk/trace #1778

Merged

Convenience functions NewExportPipeline and InstallNewPipeline can not be flushed. #1725

Closed

Contributor

MrAlias commented Apr 12, 2021

Need to resolve #1799 prior to moving to new ExportSpans interface that passes a ReadOnlySpan interface (the bundler cannot initialize a slice of an interface as it needs an actual value to initialize from).

This also makes me think we need to take a second look at if we want the ReadOnlySpan to be an interface, or if we can get away with it being an actual struct.

Contributor

seh commented Apr 12, 2021 •

edited

Loading

It doesn't look like the question came up for review in #1360. Was it an attempt to share storage, where the interface is just one view or lens over a struct (trace/span) that's used for the mutable view as well?

This was referenced Apr 26, 2021

Move the Event type from the API to the SDK #1846

Merged

Redefine ExportSpans of SpanExporter with ReadOnlySpan #1873

Merged

Add Status type to SDK #1874

Merged

MrAlias closed this as completed in #1873

jmacd mentioned this issue

Remove the sdk/export/metric package #2273

Closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment