feat(pubsub): add open telemetry trace support #5034

Closed
wants to merge 22 commits into from

hongalex
Member

This PR adds the foundation for tracing of a publisher and subscriber for Pub/Sub. This PR attempts to mimic existing messaging system tracing and follows the semantic conventions defined here. This is a draft and is subject to backwards incompatible changes.

Here's a sample that demonstrates using this OTel-enabled library with Google Cloud Trace as the exporter. The output looks something like the following:
[image: screenshot of the resulting trace]

@hongalex hongalex requested review from a team as code owners October 26, 2021 23:49
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the Pub/Sub API. label Oct 26, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Oct 26, 2021
pubsub/topic.go (outdated)
// Calculate the size of the encoded proto message by accounting
// for the length of an individual PubSubMessage and Data/Attributes field.
msgSize := proto.Size(&pb.PubsubMessage{
	Data:        msg.Data,
	Attributes:  msg.Attributes,
	OrderingKey: msg.OrderingKey,
})
span.SetAttributes(semconv.MessagingMessagePayloadSizeBytesKey.Int(msgSize))
Member

Is this already set in getSpanAttributes on line 532, above?

Member

I think I understand: After injecting the span into the message attributes, the size of the message needs to be updated.
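
To illustrate why the size has to be measured again after injection, here is a minimal stdlib-only sketch. The `message` type, `approxSize`, and `injectTraceContext` are hypothetical stand-ins, not the PR's code: `approxSize` only sums raw byte lengths and ignores protobuf framing, so it is not equivalent to `proto.Size`.

```go
package main

import "fmt"

// message is a hypothetical stand-in for pubsub.Message with only the
// fields that matter for this illustration.
type message struct {
	Data       []byte
	Attributes map[string]string
}

// approxSize is an assumption for illustration: it sums the raw byte lengths
// of the payload and attributes and ignores protobuf framing overhead.
func approxSize(m *message) int {
	n := len(m.Data)
	for k, v := range m.Attributes {
		n += len(k) + len(v)
	}
	return n
}

// injectTraceContext mimics a propagator writing one header into the
// message attributes, allocating the map if needed.
func injectTraceContext(m *message, key, val string) {
	if m.Attributes == nil {
		m.Attributes = make(map[string]string)
	}
	m.Attributes[key] = val
}

func main() {
	m := &message{Data: []byte("hello")}
	before := approxSize(m)
	injectTraceContext(m, "traceparent", "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01")
	after := approxSize(m)
	// A size attribute recorded before injection would under-report the message.
	fmt.Println("before:", before, "after:", after)
}
```

The point is only the ordering: any size attribute set before the propagator runs misses the injected key and value.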

pubsub/trace.go (outdated)
@@ -697,6 +728,7 @@ func (t *Topic) publishMessageBundle(ctx context.Context, bms []*bundledMessage)
		ipubsub.SetPublishResult(bm.res, "", err)
	} else {
		ipubsub.SetPublishResult(bm.res, res.MessageIds[i], nil)
		bm.span.SetAttributes(semconv.MessagingMessageIDKey.String(res.MessageIds[i]))
Member

Does the "publish RPC" span also need to be updated with the message ID?

@product-auto-label product-auto-label bot added the stale: extraold Pull request is critically old and needs prioritization. label Jan 7, 2022
@product-auto-label product-auto-label bot added the size: s Pull request size is small. label Jan 8, 2022
@hongalex hongalex requested review from a team and tritone as code owners February 22, 2022 23:49
@hongalex hongalex added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Mar 2, 2022
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Mar 2, 2022
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Mar 3, 2022
@product-auto-label product-auto-label bot added size: s Pull request size is small. and removed size: m Pull request size is medium. labels Mar 3, 2022
@google-cla google-cla bot removed the cla: no This human has *not* signed the Contributor License Agreement. label Mar 3, 2022
@hongalex hongalex requested review from a team as code owners March 30, 2022 22:39
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. and removed size: s Pull request size is small. labels Mar 30, 2022
@googleapis googleapis deleted a comment from snippet-bot bot Mar 30, 2022
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: xl Pull request size is extra large. labels Apr 13, 2022
 // If this call fails (e.g. because the service account doesn't have
 // the roles/viewer or roles/pubsub.viewer role) we will assume
 // EnableMessageOrdering to be true.
 // See: https://github.com/googleapis/google-cloud-go/issues/3884
-func (s *Subscription) checkOrdering(ctx context.Context) {
+func (s *Subscription) checkSubConfig() {
+	ctx := context.Background()

Why was the caller's context replaced with context.Background() here?

	cfg, err := s.Config(ctx)
	if err != nil {
		s.enableOrdering = true
	} else {
		s.enableOrdering = cfg.EnableMessageOrdering
		s.topicName = cfg.Topic.name

This does not work when the GCP identity doesn't have roles/pubsub.viewer. It should fall back to the subscription name in the err != nil case; otherwise the span name will be " receive".
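
A minimal sketch of the suggested fallback, assuming the subscription's own name is always available to the client. `receiveSpanName` and its parameters are hypothetical and not part of the PR:

```go
package main

import "fmt"

// receiveSpanName builds the "<destination> receive" span name.
// topicName may be empty when the subscription config could not be fetched
// (for example, when the identity lacks roles/pubsub.viewer); in that case
// the subscription name is used instead. Hypothetical helper for illustration.
func receiveSpanName(topicName, subName string) string {
	dest := topicName
	if dest == "" {
		dest = subName
	}
	return dest + " receive"
}

func main() {
	// Config lookup succeeded: the topic name is used.
	fmt.Println(receiveSpanName("projects/p/topics/t", "projects/p/subscriptions/s"))
	// Config lookup failed: fall back to the subscription name.
	fmt.Println(receiveSpanName("", "projects/p/subscriptions/s"))
}
```

This keeps the span name non-empty even when the config call is rejected.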

@@ -926,6 +941,9 @@ func (s *Subscription) Receive(ctx context.Context, f func(context.Context, *Mes
	defer wg.Wait()
	defer cancel2()
	for {
		opts := getSubSpanAttributes(s.topicName, &Message{}, semconv.MessagingOperationReceive)
		ctx2, rs := s.tracer.Start(ctx2, fmt.Sprintf("%s receive", s.topicName), opts...)

This span can block for a long time when there is no message to process in the topic.
As a result, the consumer span can start before the message is published:

Process P:      | topic send |
Process C:  |---sub receive---|---sub process---|

Here both "sub receive" and "sub process" are children of "topic send".

This is what it looks like in Honeycomb (with no parent trace in the message):
[image]

PS: It also skews span-based metrics.
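
One possible mitigation, sketched with the standard library only: start the span after the blocking pull returns, so idle waiting is not attributed to it. `receiveOne`, `pull`, and `startSpan` are hypothetical names; `startSpan` stands in for a tracer's start call and is an assumption, not the PR's code.

```go
package main

import (
	"fmt"
	"time"
)

// receiveOne blocks in pull until a message is available and only then
// starts the span, so the span cannot begin before the message exists.
// pull and startSpan are injected to keep the sketch self-contained.
func receiveOne(pull func() string, startSpan func(name string), dest string) string {
	msg := pull() // may block for a long time when the topic is idle
	startSpan(dest + " receive")
	return msg
}

func main() {
	msgs := make(chan string)
	go func() {
		time.Sleep(20 * time.Millisecond) // simulate an idle topic
		msgs <- "hello"
	}()

	begin := time.Now()
	msg := receiveOne(
		func() string { return <-msgs },
		func(name string) {
			// The span starts only once a message has arrived.
			fmt.Printf("span %q started %v after the loop began\n",
				name, time.Since(begin).Round(time.Millisecond))
		},
		"my-topic",
	)
	fmt.Println("got", msg)
}
```

The trade-off is that time spent waiting for a message is no longer visible in the span, which may or may not be what a given user wants.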

semconv.MessagingDestinationKindTopic,
semconv.MessagingMessageIDKey.String(msg.ID),
semconv.MessagingMessagePayloadSizeBytesKey.Int(msgSize),
attribute.String(orderingAttribute, msg.OrderingKey),

msg.OrderingKey is empty for unordered messages. Maybe there should be a default value when setting it in the attributes.

// OrderingKey identifies related messages for which publish order should
// be respected. If empty string is used, message will be sent unordered.
OrderingKey string
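
A minimal sketch of one way to handle the empty ordering key. It omits the attribute for unordered messages instead of choosing a default value, which is the other option raised above. `spanAttrs` is hypothetical, and the attribute key strings are placeholders because the PR's actual constant values are not shown in this thread.

```go
package main

import "fmt"

// Placeholder attribute keys, assumed for illustration only.
const (
	messageIDAttribute = "message_id"
	orderingAttribute  = "ordering_key"
)

// spanAttrs returns string attributes for a message span. The ordering key
// is added only when it is set, so unordered messages do not carry an
// empty-string attribute.
func spanAttrs(msgID, orderingKey string) map[string]string {
	attrs := map[string]string{messageIDAttribute: msgID}
	if orderingKey != "" {
		attrs[orderingAttribute] = orderingKey
	}
	return attrs
}

func main() {
	fmt.Println(spanAttrs("123", ""))         // unordered: no ordering attribute
	fmt.Println(spanAttrs("124", "user-42")) // ordered: attribute present
}
```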

@mbn18

mbn18 commented May 31, 2022

Hey, is there any estimate of when this will be supported? Thanks.

@hongalex
Member Author

> Hey, is there any estimate of when this will be supported? Thanks.

Thanks for your patience. This was briefly paused to account for a different design decision for tracing. Initially, we followed the batch-receiving guidance on this OTel messaging page, which we think no longer makes sense for Pub/Sub users. We believe it is more useful to trace messages individually, so that users can view the lifespan of a single message. This deviates slightly from the OTel semantic conventions' approach of tracking multiple messages in a single trace. If you're interested in learning more, I'd be happy to share.

With that said, we're currently prioritizing other features, including exactly-once delivery and BigQuery subscriptions, which should be out in the next month or so. After that, resuming work on tracing will be the highest-priority item.

@mbn18

mbn18 commented Jun 1, 2022

@hongalex, thanks for letting me know.

In the meantime, I took your propagator design and implemented it in my own code. I'm adding it here so others can use it until a final solution exists.

package whatever

import (
	"context"
	"strings"

	"cloud.google.com/go/pubsub"
	"go.opentelemetry.io/otel"
)

const pubsubAttribPrefix = "whatever"

// attribKeyPrefix is prepended to every trace header stored in the
// message attributes.
const attribKeyPrefix = pubsubAttribPrefix + "_"

// PubsubMessageCarrier injects and extracts traces from a pubsub.Message.
type PubsubMessageCarrier struct {
	msg *pubsub.Message
}

// NewPubsubMessageCarrier creates a new PubsubMessageCarrier.
func NewPubsubMessageCarrier(msg *pubsub.Message) PubsubMessageCarrier {
	return PubsubMessageCarrier{msg: msg}
}

// Get retrieves a single value for a given key.
func (c PubsubMessageCarrier) Get(key string) string {
	return c.msg.Attributes[attribKeyPrefix+key]
}

// Set sets an attribute, allocating the attribute map if it is nil.
func (c PubsubMessageCarrier) Set(key, val string) {
	if c.msg.Attributes == nil {
		c.msg.Attributes = make(map[string]string)
	}
	c.msg.Attributes[attribKeyPrefix+key] = val
}

// Keys returns the keys stored by this carrier, with the prefix stripped.
// Attributes without the prefix are not trace headers and are skipped.
func (c PubsubMessageCarrier) Keys() []string {
	out := make([]string, 0, len(c.msg.Attributes))
	for k := range c.msg.Attributes {
		if strings.HasPrefix(k, attribKeyPrefix) {
			out = append(out, strings.TrimPrefix(k, attribKeyPrefix))
		}
	}
	return out
}

// PubSubMessageInjectContext writes the trace context from ctx into msg.
func PubSubMessageInjectContext(ctx context.Context, msg *pubsub.Message) {
	otel.GetTextMapPropagator().Inject(ctx, NewPubsubMessageCarrier(msg))
}

// PubSubMessageExtractContext returns ctx enriched with the trace context
// found in msg.
func PubSubMessageExtractContext(ctx context.Context, msg *pubsub.Message) context.Context {
	return otel.GetTextMapPropagator().Extract(ctx, NewPubsubMessageCarrier(msg))
}

Note: this does not set the attributes that need to be added to the corresponding span, such as semconv.MessagingDestinationKindTopic.

@hongalex hongalex deleted the branch googleapis:pubsub-otel-beta June 15, 2022 23:51
@hongalex hongalex closed this Jun 15, 2022
@Mistic92

Hi, what is the current status?

@panperla

panperla commented Mar 1, 2023

Is there an alternative thread? Or did this just die, and nobody is looking into it?

@meredithslota
Contributor

We are still tracking this here: #4665
