ext/jaeger: fix exporting to collector #508

@mauriciovasquezbernal mauriciovasquezbernal commented Mar 18, 2020

Edit: I updated the PR to use THttpClient directly.

Exporting traces to the collector is broken: the collector replies with the
"Unable to process request body: Required field Process is not set" error.

The current implementation is based on OpenCensus [1], which appears to be broken
too; it is not yet clear what exactly is wrong there.

This commit changes the logic to avoid using jaeger.Client, which appears to be
the broken piece. It is replaced by a direct invocation of the THttpClient
transport from thrift.

[1] https://github.com/census-instrumentation/opencensus-python/tree/master/contrib/opencensus-ext-jaeger

Fixes #493.

How to test:

Run the Jaeger collector. I used the following command (note that no agent port is published):

docker run -p 16686:16686 -p 14268:14268 jaegertracing/all-in-one

Export some traces to the collector:

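import time

from opentelemetry import trace
from opentelemetry.ext import jaeger
from opentelemetry.sdk.trace.export import BatchExportSpanProcessor

# assumes an SDK TracerProvider is already installed as the global provider
tracer = trace.get_tracer(__name__)

# create a JaegerSpanExporter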
jaeger_exporter = jaeger.JaegerSpanExporter(
    service_name="my-helloworld-service",
    # configure agent
    # agent_host_name="localhost",
    # agent_port=6831,
    # optional: configure also collector
    collector_host_name="localhost",
    collector_port=14268,
    collector_endpoint="/api/traces",
    # username="xxxx",  # optional
    # password="xxxx",  # optional
)

# create a BatchExportSpanProcessor and add the exporter to it
span_processor = BatchExportSpanProcessor(jaeger_exporter)

# add to the tracer factory
trace.get_tracer_provider().add_span_processor(span_processor)

# create some spans for testing
with tracer.start_as_current_span("foo") as foo:
    time.sleep(0.1)
    foo.set_attribute("my_attribute", True)
    foo.add_event("event in foo", {"name": "foo1"})
    with tracer.start_as_current_span(
        "bar", links=[trace.Link(foo.get_context())]
    ) as bar:
        time.sleep(0.2)
        bar.set_attribute("speed", 100.0)

        with tracer.start_as_current_span("baz") as baz:
            time.sleep(0.3)
            baz.set_attribute("name", "mauricio")

        time.sleep(0.2)

    time.sleep(0.1)

Check that those traces were exported to Jaeger at http://localhost:16686.
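
The same check can be scripted instead of done by eye in the UI. This is only a minimal sketch: it assumes the all-in-one image serves the JSON endpoint `/api/traces` on port 16686 (the endpoint the UI itself calls, not a documented stable API) and that the response has a top-level `data` list of traces, each with a `spans` list. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

JAEGER_QUERY = "http://localhost:16686"  # UI/query port published by the docker command


def traces_url(service, base=JAEGER_QUERY, limit=20):
    """Build the URL that lists recent traces for one service."""
    query = urllib.parse.urlencode({"service": service, "limit": limit})
    return f"{base}/api/traces?{query}"


def span_names(payload):
    """Collect operation names from a decoded response (assumed shape)."""
    return sorted(
        span["operationName"]
        for trace in payload.get("data", [])
        for span in trace.get("spans", [])
    )


def fetch_span_names(service):
    """Query a running Jaeger instance; requires the container to be up."""
    with urllib.request.urlopen(traces_url(service), timeout=5) as resp:
        return span_names(json.load(resp))
```

With the test script above, `fetch_span_names("my-helloworld-service")` would be expected to include the three span names foo, bar and baz once the batch processor has flushed.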

Exporting traces to the collector is broken: the collector replies with the
"Unable to process request body: Required field Process is not set" error.

The current implementation is based on OpenCensus [1], which appears to be broken
too; it is not yet clear what exactly is wrong there.

This commit changes the exporting logic to be similar to the opentelemetry-go [2]
one, which works. The main change is to perform the request directly without
using the client provided by the generated files.

[1] https://github.com/census-instrumentation/opencensus-python/tree/master/contrib/opencensus-ext-jaeger
[2] https://github.com/open-telemetry/opentelemetry-go/blob/master/exporters/trace/jaeger/jaeger.go
transport = TTransport.TMemoryBuffer()
protocol = TBinaryProtocol.TBinaryProtocol(transport)
batch.write(protocol)
body = transport.getvalue()
Member

Is it not possible to stream this to requests.post? With this, we have to hold the whole serialized blob in memory at once.

Member Author

Good point. I studied the problem a bit more and realized that thrift.transport.THttpClient can be used directly as the transport of the TBinaryProtocol object. The writes then go directly into the buffer used to send the data, which avoids this intermediate storage. It also avoids having requests as a dependency here.
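
To make the difference concrete, here is a stdlib-only sketch. `BufferingHttpTransport` is a hypothetical stand-in with the same write/flush shape as a Thrift transport, not the real `THttpClient`, and the injectable `send` callable is an assumption added so the example runs without a network.

```python
import io
import urllib.request


def post(url, body):
    """Default sender: POST the body as binary Thrift and return the status."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/x-thrift"}
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


class BufferingHttpTransport:
    """Hypothetical transport: collects writes and sends them on flush()."""

    def __init__(self, url, send=post):
        self.url = url
        self._send = send
        self._wbuf = io.BytesIO()

    def write(self, data):
        self._wbuf.write(data)

    def flush(self):
        body = self._wbuf.getvalue()
        self._wbuf = io.BytesIO()  # start a fresh buffer for the next batch
        return self._send(self.url, body)


# Usage with a fake sender, so nothing leaves the process.
sent = []
transport = BufferingHttpTransport(
    "http://localhost:14268/api/traces",
    send=lambda url, body: sent.append((url, body)),
)
transport.write(b"\x0c\x00\x01")  # a serializer would write its fields here
transport.write(b"...")
transport.flush()
```

The serializer writes straight into the transport, so the caller no longer needs a separate TMemoryBuffer and a getvalue() copy before handing the body to an HTTP library.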

codecov-io commented Mar 18, 2020

Codecov Report

Merging #508 into master will increase coverage by 0.08%.
The diff coverage is 40.00%.

@@            Coverage Diff             @@
##           master     #508      +/-   ##
==========================================
+ Coverage   89.48%   89.56%   +0.08%     
==========================================
  Files          43       43              
  Lines        2215     2213       -2     
  Branches      250      249       -1     
==========================================
  Hits         1982     1982              
+ Misses        161      159       -2     
  Partials       72       72              
Impacted Files Coverage Δ
...xt-jaeger/src/opentelemetry/ext/jaeger/__init__.py 87.57% <40.00%> (+1.02%) ⬆️
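
The totals explain why coverage rises although the number of hits is unchanged: the PR ends up with two fewer missed lines. A quick check of the arithmetic, using only the figures from the report above:

```python
def coverage(hits, lines):
    """Coverage as a percentage of total lines."""
    return 100.0 * hits / lines


before = coverage(1982, 2215)  # master
after = coverage(1982, 2213)   # this PR: same hits, two fewer lines

print(f"{before:.2f}% -> {after:.2f}% ({after - before:+.2f}%)")
# prints "89.48% -> 89.56% (+0.08%)"
```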

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e551ba...d456d09. Read the comment docs.

@toumorokoshi
Member

Nice! Can you elaborate further on what the fix was? I'm not clear why submitBatches didn't work, but the batch.write did.

@mauriciovasquezbernal
Member Author

> Nice! Can you elaborate further on what the fix was? I'm not clear why submitBatches didn't work, but the batch.write did.

Neither am I.

The implementation of submitBatches [1] (automatically generated from the thrift files) creates a message that the Jaeger collector rejects. I don't know whether the thrift files we are using are wrong: I checked against https://github.com/jaegertracing/jaeger-idl/blob/master/thrift/jaeger.thrift and the problem is still there. I also tested old versions of the Jaeger collector, with the same result. The opencensus implementation (which the otel jaeger exporter is based on) has the problem too, so I looked at the Go implementation [2] and found that I could avoid using submitBatches altogether.

Once I had a working version I stopped looking for the root cause of the problem.

[1]

def send_submitBatches(self, batches):
    self._oprot.writeMessageBegin('submitBatches', TMessageType.CALL, self._seqid)
    args = submitBatches_args()
    args.batches = batches
    args.write(self._oprot)
    self._oprot.writeMessageEnd()
    self._oprot.trans.flush()

[2]
https://github.com/open-telemetry/opentelemetry-go/blob/a485d0ec64a48f6b7d5344790ba6c4b85f154e8e/exporters/trace/jaeger/uploader.go#L121
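
One possible explanation, which this thread does not confirm, is the framing that writeMessageBegin adds: an RPC call starts with a message header, while a struct serialized on its own starts with its first field header. The sketch below reproduces those leading bytes with the standard library only. It assumes the strict binary protocol encoding and assumes Process is field 1 (a struct) of Batch, as in jaeger.thrift. The helper names are hypothetical and the struct bodies are placeholders.

```python
import struct

VERSION_1 = 0x80010000  # strict-mode version marker of the Thrift binary protocol
CALL = 1                # TMessageType.CALL
STRUCT = 12             # TType.STRUCT


def message_begin(name, seqid):
    """Bytes emitted before the args struct of an RPC call (strict mode)."""
    encoded = name.encode("utf-8")
    return (
        struct.pack("!I", VERSION_1 | CALL)          # version and message type
        + struct.pack("!i", len(encoded)) + encoded  # length-prefixed method name
        + struct.pack("!i", seqid)                   # sequence id
    )


def field_header(field_type, field_id):
    """Bytes emitted at the start of a field: type byte, then 16-bit field id."""
    return struct.pack("!bh", field_type, field_id)


# What send_submitBatches puts on the wire first, versus a bare Batch struct.
rpc_payload = message_begin("submitBatches", 0) + b"<args struct>"
bare_payload = field_header(STRUCT, 1) + b"<process struct>"

print(rpc_payload[:4].hex())   # 80010001
print(bare_payload[:3].hex())  # 0c0001
```

If the collector's HTTP endpoint parses the request body as a bare Batch, the RPC header bytes would not look like a Process field, which would be consistent with the "Required field Process is not set" error.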

Member

@toumorokoshi toumorokoshi left a comment

thanks for the catch!

@c24t c24t added this to the 3/31 Beta milestone Mar 26, 2020
@mauriciovasquezbernal mauriciovasquezbernal added the needs reviewers label Mar 27, 2020
Member

@c24t c24t left a comment

Thanks for the fix @mauriciovasquezbernal, LGTM!

@toumorokoshi toumorokoshi merged commit 51cfe76 into open-telemetry:master Mar 28, 2020
@mauriciovasquezbernal mauriciovasquezbernal deleted the mauricio/fix-jaeger-collector-exporter branch April 14, 2020 21:50
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
* ci: install minimal lint & doc deps

* fix: lint
Labels
exporters
needs reviewers (PRs with this label are ready for review and need people to review to move forward)
Successfully merging this pull request may close these issues.

opentelemetry-ext-jaeger doesn't work with jaeger-all-in-one collector
5 participants