feat: bunny instrumentation #566

Merged
merged 12 commits into from
Apr 30, 2021

Conversation

johanneswuerbach
Contributor

Basic instrumentation for bunny based on the existing ruby_kafka instrumentation.

The instrumentation provides:

  • context propagation through rabbitmq stored in the message header
  • span generation for publishing, manual consumption (pop) and consumption using a consumer (subscribe)
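
As a rough illustration of the first bullet, here is a minimal sketch of carrying trace context in message headers. `ToyPropagator` and its `traceparent` formatting are hypothetical stand-ins written for this example; the instrumentation itself calls `OpenTelemetry.propagation.inject` and `OpenTelemetry.propagation.extract`, as the snippets later in this thread show.

```ruby
require 'securerandom'

# Hypothetical stand-in for a propagator, NOT the OpenTelemetry API.
module ToyPropagator
  HEADER = 'traceparent'

  # Writes the trace context into the carrier hash (the message headers).
  def self.inject(carrier, trace_id:, span_id:)
    carrier[HEADER] = format('00-%s-%s-01', trace_id, span_id)
    carrier
  end

  # Reads the trace context back; returns nil if the header is missing or malformed.
  def self.extract(carrier)
    match = /\A00-(\h{32})-(\h{16})-\h{2}\z/.match(carrier[HEADER].to_s)
    match && { trace_id: match[1], span_id: match[2] }
  end
end

# Producer side: make sure the headers hash exists, then inject into it.
properties = { content_type: 'application/json' }
properties[:headers] ||= {}
trace_id = SecureRandom.hex(16)
span_id = SecureRandom.hex(8)
ToyPropagator.inject(properties[:headers], trace_id: trace_id, span_id: span_id)

# Consumer side: recover the producer context from the delivered headers.
producer_context = ToyPropagator.extract(properties[:headers])
```

The point of the sketch is only that the headers hash acts as the carrier on both sides, so a consumer can read whatever the producer wrote before publishing.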

This basic instrumentation should also be sufficient when using higher level libraries based on bunny like https://github.com/jondot/sneakers

@johanneswuerbach
Contributor Author

Thanks for the feedback, I pushed a round of fixes which should address all your comments.

Base automatically changed from master to main January 28, 2021 22:57
@johanneswuerbach johanneswuerbach force-pushed the bunny-instrumentation branch 2 times, most recently from 8d88c3c to 7c74a3d Compare February 3, 2021 00:47
@johanneswuerbach
Contributor Author

johanneswuerbach commented Feb 3, 2021

@fbogsany rebased and adapted to the latest main changes. Let me know if you have any other comments or concerns.

Member

@mwear mwear left a comment

LGTM. Did you want to take a final look @fbogsany or @robertlaurin?

Contributor

@robertlaurin robertlaurin left a comment

Looks good, just a couple questions.

instrumentation/bunny/example/bunny.rb (outdated)
OpenTelemetry.propagation.inject(properties[:headers])
end

super
Contributor

Similar to the above, it looks like an uncaught_exception_handler could be provided that raises the exception, in which case we would not capture that event here either.

I'm not familiar with rabbitmq: is this call to super intentionally left outside the trace block for any particular reason?

Contributor Author

super only enqueues the message into the internal worker pool in this case and the actual consumer call is wrapped into a process span, which should handle the exception in this case.

Comment on lines 32 to 57
# This method is called when rabbitmq pushes messages to subscribed consumers
def handle_frameset(basic_deliver, properties, content)
  OpenTelemetry::Instrumentation::Bunny::PatchHelpers.with_receive_span(self, tracer, basic_deliver, properties) do
    properties[:headers] ||= {}
    OpenTelemetry.propagation.inject(properties[:headers])
  end

  super
end
Contributor

It looks to me like this method wraps consumers, so we can wrap a process span around it:

Suggested change
- # This method is called when rabbitmq pushes messages to subscribed consumers
- def handle_frameset(basic_deliver, properties, content)
-   OpenTelemetry::Instrumentation::Bunny::PatchHelpers.with_receive_span(self, tracer, basic_deliver, properties) do
-     properties[:headers] ||= {}
-     OpenTelemetry.propagation.inject(properties[:headers])
-   end
-   super
- end
+ # This method is called when rabbitmq pushes messages to subscribed consumers
+ def handle_frameset(basic_deliver, properties, content)
+   OpenTelemetry::Instrumentation::Bunny::PatchHelpers.with_process_span(self, tracer, basic_deliver, properties) do
+     super
+   end
+ end

Contributor

Hmm, this is interesting because you're also wrapping the Consumer.call. Maybe we should just not wrap handle_frameset? We don't have to have a receive span in every case - the process span is much more interesting.

Contributor Author

handle_frameset submits the message to a worker pool, so there is a delay between receiving the message and eventually processing it in a consumer thread.

Should that time be part of process or receive or not be tracked at all?
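
To make the gap in question concrete, here is a minimal sketch of the two patch points using `Module#prepend`. `ToyChannel`, `ToyConsumer`, `ReceivePatch`, `ProcessPatch` and `EVENTS` are hypothetical stand-ins invented for this example, not Bunny's classes or the PR's helpers; the only behaviour assumed from the discussion is that `handle_frameset` enqueues and a worker thread later invokes the consumer.

```ruby
# Toy stand-ins, NOT Bunny's real classes: a channel that hands deliveries
# to a worker thread through a queue, and a consumer that processes them.
class ToyConsumer
  def call(payload)
    payload.upcase
  end
end

class ToyChannel
  attr_reader :results

  def initialize(consumer)
    @consumer = consumer
    @work = Queue.new
    @results = Queue.new
    @worker = Thread.new do
      while (payload = @work.pop) != :stop
        @results << @consumer.call(payload)
      end
    end
  end

  # Only enqueues; the consumer runs later on the worker thread.
  def handle_frameset(payload)
    @work << payload
  end

  def close
    @work << :stop
    @worker.join
  end
end

EVENTS = Queue.new # thread-safe log of hypothetical span events

# Patches applied with Module#prepend, so `super` reaches the original method.
module ReceivePatch
  def handle_frameset(payload)
    EVENTS << [:receive, payload] # recorded before the message is enqueued
    super
  end
end

module ProcessPatch
  def call(payload)
    EVENTS << [:process, payload] # recorded on the worker thread
    super
  end
end

ToyChannel.prepend(ReceivePatch)
ToyConsumer.prepend(ProcessPatch)

channel = ToyChannel.new(ToyConsumer.new)
channel.handle_frameset('hello')
result = channel.results.pop
channel.close
events = Array.new(EVENTS.size) { EVENTS.pop }
```

In this toy model the time a message spends in the work queue falls between the `:receive` and `:process` events, which is the interval the question above is about.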

module Bunny
  module Patches
    # The Queue module contains the instrumentation patch for the Queue#pop method.
    module Queue
Contributor

Is Queue typically used directly by clients or is it only used internally? It feels like we're going to end up with nested process spans in some cases, but I may be misunderstanding the gem.

@johanneswuerbach
Contributor Author

johanneswuerbach commented Mar 29, 2021

Thank you for all the feedback. I made a couple of changes to be more spec compliant around receive vs. process.

The push and pull APIs are now modelled as described in https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/messaging.md#batch-receiving.

basic_get, which Queue.pop uses, starts a receive span when receiving begins; in Queue.pop we then link that span to the process span if a block is given.

handle_frameset does the same before submitting the message to the internal consumer pool, where a consumer will eventually process it. Technically the receive span should already wrap https://github.com/ruby-amqp/bunny/blob/0b5a6f2778f64f3c4bdf8b3063c6bdcbecfc8123/lib/bunny/reader_loop.rb#L77, but I wasn't sure how to integrate there, so receive now only starts once the message has been read from the socket.
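
A minimal sketch of how the receive and process spans might relate in the pull case. `ToySpan`, `start_span` and `toy_pop` are hypothetical names for illustration, not the OpenTelemetry SDK or the PR's helpers. It assumes, based on the merged snippet quoted further down in this thread (`parent_context` extracted from the receive headers, plus a "link to the producer context"), that the process span is parented to the receive span and linked to the producer.

```ruby
require 'securerandom'

# Toy span model, NOT the OpenTelemetry SDK: just enough to show the shape.
ToySpan = Struct.new(:name, :kind, :span_id, :parent_id, :links, keyword_init: true)

def start_span(name, kind:, parent: nil, links: [])
  ToySpan.new(name: name, kind: kind, span_id: SecureRandom.hex(8),
              parent_id: parent&.span_id, links: links.map(&:span_id))
end

# Pull API: a receive span is always started; a process span only when the
# caller passes a block (assumption: parent = receive, link = producer).
def toy_pop(message, producer_span)
  receive = start_span('queue receive', kind: :consumer)
  return [receive, nil, message] unless block_given?

  process = start_span('queue process', kind: :consumer,
                       parent: receive, links: [producer_span])
  [receive, process, yield(message)]
end

producer = start_span('queue send', kind: :producer)
receive, process, result = toy_pop('payload', producer) { |m| m.length }
_, no_process, raw = toy_pop('payload', producer)
```

Without a block there is nothing to process inside the call, so only the receive span exists and the raw message is returned.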

@johanneswuerbach
Contributor Author

@fbogsany could I get another review? :-)

@fbogsany
Contributor

🤦 sorry! 👀 now.

@johanneswuerbach
Contributor Author

@fbogsany friendly ping :-)

@fbogsany
Contributor

🤦 👀

@fbogsany fbogsany merged commit 38ff1ea into open-telemetry:main Apr 30, 2021
@fbogsany
Contributor

Thanks @johanneswuerbach - sorry this took so long to review and merge 😞

@johanneswuerbach
Contributor Author

No worries, we are all busy people. Thank you for working on this 🙇

parent_context = OpenTelemetry.propagation.extract(properties[:tracer_receive_headers])

# link to the producer context
producer_context = OpenTelemetry.propagation.extract(properties[:headers])
Contributor

@indrekj indrekj Aug 16, 2021

@johanneswuerbach shouldn't this be .extract(properties[:headers] || {})?

Say I have two services, A and B. If the first is instrumented but the second is not, I get this exception in service A:

Uncaught exception from consumer #<Bunny::Consumer:70980 @channel_id=2 @queue=amq.gen-p2ZzQp6h4r6QWdCBIC4Pgg> @consumer_tag=bunny-1629126993000-738472400850>: #<NoMethodError: undefined method `[]' for nil:NilClass> @ /home/indrek/.rvm/gems/ruby-2.7.2@admin/gems/opentelemetry-api-1.0.0.rc2/lib/opentelemetry/context/propagation/text_map_getter.rb:16:in `get'

I think that's because properties[:headers] is nil when receiving a message from a service that is not using OTEL.

EDIT: I pinged the wrong person initially.
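
The failure described above can be reproduced in isolation with a small sketch. `ToyTextMapGetter` and `toy_extract` are hypothetical stand-ins; the getter is assumed to index the carrier directly, which is what the `undefined method '[]' for nil` in the stack trace suggests.

```ruby
# Toy getter mirroring the failure mode in the stack trace: it indexes the
# carrier directly, so a nil carrier raises NoMethodError.
class ToyTextMapGetter
  def get(carrier, key)
    carrier[key]
  end
end

def toy_extract(carrier, getter: ToyTextMapGetter.new)
  getter.get(carrier, 'traceparent')
end

# Properties of a message published by an uninstrumented service: no headers.
properties = { content_type: 'text/plain' }

# Without the guard, the nil headers reach the getter and the call raises.
unguarded_error =
  begin
    toy_extract(properties[:headers])
    nil
  rescue NoMethodError => e
    e
  end

# The guard suggested in the comment: fall back to an empty hash.
guarded = toy_extract(properties[:headers] || {})
```

With the `|| {}` fallback the lookup simply finds no context, rather than raising inside the consumer.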

Contributor Author

Yes, that seems possible. Could you create a separate issue for this?

5 participants