Skip to content
This repository has been archived by the owner on May 23, 2023. It is now read-only.

Implement ScopeManager for in-process propagation #63

Closed
wants to merge 8 commits into from

Conversation

palazzem
Copy link
Member

@palazzem palazzem commented Nov 14, 2017

Overview

This PR aims to solve in-process context propagation for Python. It has two aims:

  • adding a simple, high level API that allows very simple tracing that will handle 90% of use cases (start_active() and start_manual())
  • a pluggable interface (ScopeManager) that allows tracing async operations with all the common frameworks - threads, asyncio, tornado, etc

Implementation details are available in another PR (opentracing/basictracer-python#24).

Details

With this PR, Scope is implemented as a wrapper of Span class. No changes have been made to the current Span and SpanContext, making the propagation something built on top of current data models.

A Scope formalizes the activation and deactivation of a Span, that is handled using a ScopeManager available in the Tracer itself. It can be used automatically using tracing methods, or manually if you need to retrieve the current active Span. Simple examples are:

def main():
    # `scope` will be automatically closed and the `Span` finished
    with tracer.start_active('request') as scope:
        scope.span().set_tag('http.url', url)
        do_something()

def do_something():
    scope = tracer.scope_manager.active()
    scope.span().set_tag('http.status_code', '200')

Opened questions

  • While I agree that obfuscating a function call with a property is not a shared opinion, I'd rather prefer having verbs in our methods (so get_something/set_something instead of just something). In Python such approach is not really idiomatic, though properties are. Because of that I suggest to have some internal attributes "protected" via a property like:
# scope.py
scope = Scope(...)
scope.span  # instead of scope.span()
  • The previous Continuation (and so the counter) is not part of this API, though it could be a contrib module in case we "really" need it in the future.
  • No thread-local implementation since this is only the API. Details will be discussed in the reference implementation.

References

"""
return True

def is_parent(self, parent, span):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know if it's really needed, but it would be great that tests available in the api_check.py tests also the parenting so that a ScopeManager can be tested automatically when a new Tracer implementation is created.

def __init__(self):
# TODO: `tracer` should not be None, but we don't have a reference;
# should we move the NOOP SpanContext, Span, Scope to somewhere
# else so that they're globally reachable?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean here is that maybe everywhere in the repo we can simply:

from opentracing.ext import noop  # or something similar

self._noop_span = noop.SPAN
self._noop_scope = noop.SCOPE

The noop module may be initialized when the Tracer is created (so in the __init__.py at the moment?) so we have references of Tracer and ScopeManager, avoiding creating every time a no-op of something.

Let me know if I'm missing something!

@@ -45,3 +49,46 @@ def test_baggage_check_works(self):
# second check that assert on empty baggage will fail too
with self.assertRaises(AssertionError):
api_check.test_context_baggage()

def test_scope_manager_check_works(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need that (other than coverage?). I've followed the approach above.

@palazzem
Copy link
Member Author

palazzem commented Nov 14, 2017

@carlosalberto @tedsuo it would be great having a pass on this PR as a follow-up of the Java API. Currently this API is good enough for us since it grants the flexibility we need for in-process propagation. I'm going to publish the reference implementation PR too, so I can update the PR description with the link.

Thanks in advance for spending time reviewing the PR!

EDIT: link to the reference implementation -> opentracing/basictracer-python#24

self._manager = manager
self._span = span

def span(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer this one as a @property so scope.span instead of scope.span() but it's something we can discuss.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we keep a property as well - I know some people don't like it, but then again, we already have Span's context and tracer as properties ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree!

"""ScopeManager accessor"""
return self._scope_manager

def start_active(self,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nomenclature nitpick: i don't like the naming of start_active vs start_manual. start_active will be used 95% of the time, so i think it should just be called start and then there can be start_{whatever_you_choose} for the rest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's an interesting point since the most common API may be the shortest possible, let's see what others are thinking!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carlosalberto thoughts?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote to keep the api as close to the other language implementations as possible. On a similar note, should something about in process span management be added to the opentracing specification?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wkiser I agree on the uniformity point.

About the standard inclusion: I think it will be added eventually, after the implementation proves to be sufficient and tracer authors are fine with it (for reference: for the Java api it took a pair of iterations to get it right ;) )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, "manual" was removed from the Java API.

Even though start_active_span will be used more often, it's worth keeping the full name to be in sync with other languages and emphasize the side effects. It also doesn't return a Span the way start_span does.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also doesn't return a Span the way start_span does.

Ah - I overlooked that. Seems like it should beTracer.start_active_scope().

I've submitted a PR to @palazzem's branch with these naming updates (palazzem#1).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be start_active_span, since Span is the main thing that's being created here, it's just wrapped in a Scope for management.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on _active_, so I'm happy to adjust it.

Here's the difference in practice:

with tracer.start_active_span('...') as scope:
                scope.span().set_tag('http.method', 'GET')
                do_some_work()

vs.

with tracer.start_active_scope('...') as scope:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! I definitely think that start_manual* must be out of the Python API. Let's make this change.

:param finish_on_close: whether span should automatically be
finished when `Scope#close()` is called
"""
self._manager = manager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are saving manager, why not expose it too (through a property)? Else, well, don't save it (and let implementations do on that ;) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be better to not expose the manager since the source of truth for the ScopeManager should be the Tracer (right?). If that makes sense, I'd say to remove that line and let the implementation do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though we need to keep it in the constructor signature since implementations need (in 99% of cases) that.

# TODO: `tracer` should not be None, but we don't have a reference;
# should we move the NOOP SpanContext, Span, Scope to somewhere
# else so that they're globally reachable?
self._noop_span = Span(tracer=None, context=SpanContext())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea: creating a Tracer and use its noop_span is an alternative, even if not a perfect one (as then we would end up with two noop Tracer objects).

@@ -38,33 +40,88 @@ class Tracer(object):
_supported_formats = [Format.TEXT_MAP, Format.BINARY, Format.HTTP_HEADERS]

def __init__(self):
self._scope_manager = ScopeManager()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call it _noop_scope_manager?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct!

@carlosalberto
Copy link
Contributor

Upon further review when working on the python-examples repo, I'd suggest a few things:

  1. There shouldn't be hurries about what parameters should be passed for Scope.__init__(), as that's a private method, and at the opentracing-python level we only care about the very public contract (different situation with basictracer-python, though).

  2. If helpful properties are going to be added to Scope (i.e. manager or finish_on_trace), I'd suggest we keep them in the BasicScope class, and as read-only (_manager with manager accessor, etc).

  3. The current BasicScope is not a neutral Scope, but one working along ThreadLocalScopeManager, so I'd suggest renaming it.

  4. If we want to keep the properties/accessors described in 2), we could keep the aforementioned BasicScope class (with ThreadLocalScope as its child). In this case, it would be helpful to have a ScopeManager.deactivate(Scope) method, so that the BasicScope class is re-used among ScopeManager implementations (else, which is also correct, get rid altogether of this class).

Let me know. Hopefully it helps! I will be pushing soon an update with the python-examples updated to try out this effort!

@palazzem
Copy link
Member Author

@carlosalberto since my work has been already merged, I think we should close this PR because it has been superseded by #64 ?

@carlosalberto
Copy link
Contributor

@palazzem Yes. Closing it now. Thanks ;)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants