What's the story with threads? #3

NicolasT · 2017-09-30T10:08:22Z

I was trying to figure out how to handle concurrency (forkIO, async).

A potential API I came up with would be something like

data Reference = ChildOf | FollowsFrom

fork :: MonadTracing m => Reference -> Text -> m () -> m () -> m ThreadId

I implemented something along these lines (although my fork is Reference -> Text -> TracingT IO () -> TracingT IO ThreadId since I didn't generalize it), also introducing activeSpanContext :: Tracer -> IO (Maybe SpanContext) to the OpenTracing sig and implenting it in Jaeger accordingly.

Sadly enough, this does not work, because of the IORef (Maybe Span) being used in the Jaeger.Tracer type. What should be added instead, I guess, is some kind of dupTracer which creates a new IORef to be used within the thread, or something along those lines.

It may also require creating new sockets etc, or coordination between threads, because otherwise multiple threads could write to the same socket concurrently (may not be a problem with UDP transports, but would be with TCP-based tracing systems).

This seems to become messy quite quickly.

An alternative could be to have a single thread interacting with the tracing agent, and use some kind of queue to submit spans to it.

Related to this: need to figure out how to play nice with libraries like async. Getting things to 'Just Work' seems difficult, some new APIs will be required I believe. But ending up with something like

(a, b) <- runConcurrently $ (,) <$> Concurrent "getPage" (getPage url1) <*> Concurrent "getPage" (getPage url2)

where the sub-threads would ChildOf reference the main one could be sensible, staying close to the async APIs, and not requiring changes to getPage (which may be a library function, not tracing-enabled).

The text was updated successfully, but these errors were encountered:

ocharles · 2017-09-30T10:39:38Z

I haven't entirely thought it through yet, but we should check out how the Go library deals with this. From memory, I believe it's roughly about capturing the Span you're in, and then activating it in child threads.

ocharles · 2017-09-30T10:47:27Z

opentracing/specification#23 may discuss this. In general, it seems that this is not part of the OpenTracing spec, so i'd be inclined to just hold off. https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/ActiveSpan.java#L64 is capture in the Java API, which already diverges from the specification (or more, extends it).

NicolasT · 2017-09-30T10:51:41Z

A way forward, I think, could be to split 'current span' tracking etc (move it into TracingT?), and let an OpenTracing implementation only worry about creation of span instances, and emitting them.

Remains the shared use of tracerWriteBuffer, I guess?

Maybe a Tracer should have two parts of state:

Shared within a process (e.g. a socket)
Shared within a thread-of-execution (e.g. Thrift.WriteBuffer), and a new instance created upon fork
Not maintain 'current span' and thread-state as part of the Tracer, but as part of the MonadTracing environment, which can then take care of setting things up upon fork

ocharles · 2017-09-30T11:22:47Z

Well it wouldn't be much work to copy the Java API and have

data Continuation

captureActiveSpan :: Tracer -> IO (Maybe Continuation)
restore :: Tracer -> Continuation -> IO ()

But I need to give it a bit of thought, and might also need to think about these Maybes and see if there's anything else.

I'm a little cautious about committing to a particular monad or transformer though.

I'm not sure about the connection to tracerWriteBuffer, that's just an implementation detail of the Jaeger client to make sure all Thrift calls are delivery in a single UDP packet.

NicolasT · 2017-10-01T20:21:35Z

What I tried to say about the WriteBuffer is that having one as part of the Tracer structure, this basically makes the Tracer not thread-safe (or rather, can't be shared across threads assuming it's WriteBuffer is used by multiple threads concurrently).
A WriteBuffer is a IORef Builder with e.g.

writeBuf :: WriteBuffer -> LBS.ByteString -> IO ()
writeBuf w s = modifyIORef w ( <> lazyByteString s)

(https://git1-us-west.apache.org/repos/asf?p=thrift.git;a=blob;f=lib/hs/src/Thrift/Transport/IOBuffer.hs;hb=39310dad793ca69b4b7217a3b54430e682e5e2a4)
so I think (admittedly making some assumptions about how the Thrift library may work, given experience with the Python version of it) this should become part of per-thread Tracer state.

Thinking of it, doing a worker-thread(s?) + queue approach would

get rid of WriteBuffer sharing thread-safety issues
get rid of pitfalls with timely closing the Socket
allow to batch multiple spans into a single packet to be sent to the agent, which may be more efficient

NicolasT · 2017-10-01T20:32:56Z

Given the Continuation and Maybes story:

I believe there are 3 possible states:

Some code is running as part of a current (active) span
Some code is running (likely in a thread) and has no current (active) span, but does have a 'parent' span(context) which would becomes a reference (childof or followsfrom) whenever a span is created for the current thread
Some code is running and has no current nor parent span, e.g. at process startup

Not sure how to encode this ergonomically (wrt how a span is stored in the state) though. Will do some exploration-while-coding.

NicolasT · 2017-10-01T21:49:42Z

I was thinking of using some forall s. ST s-like trick to ensure a span can't be passed to a different execution context, in order to ensure a span with a specific spancontext won't (accidentally) be emitted twice (with different timings), but this would rule out the possibility to write code in which a span is started in one thread and closed in another (e.g. an RPC system where the server receives a client request, starts a corresponding span, pushes the request on a queue which is handled by some other thread, which then pushes the RPC result to yet another queue which is popped by a third thread to respond to a client, then closing the span).

ocharles · 2017-10-02T21:07:17Z

WriteBuffer can be trivially thread safe by moving to an MVar, so i'm not too worried about that. But really the Tracer should be forked to a separate thread with spans pushed to it for publishing, which means some kind of TQueue or something,

get rid of pitfalls with timely closing the Socket

pitfalls?

allow to batch multiple spans into a single packet to be sent to the agent, which may be more efficient

Certainly, and this comes from having the jaeger tracer fork a thread that periodically flushes.

ST like tricks seem like a rather heavy weight solution (it's a specialization of Oleg's region work), so i'd want to be sure we really need that.

abailly mentioned this issue May 21, 2018

Make haskell-opentracing work with recent Jaeger #4

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What's the story with threads? #3

What's the story with threads? #3

NicolasT commented Sep 30, 2017

ocharles commented Sep 30, 2017

ocharles commented Sep 30, 2017

NicolasT commented Sep 30, 2017

ocharles commented Sep 30, 2017

NicolasT commented Oct 1, 2017 •

edited

Loading

NicolasT commented Oct 1, 2017

NicolasT commented Oct 1, 2017

ocharles commented Oct 2, 2017

What's the story with threads? #3

What's the story with threads? #3

Comments

NicolasT commented Sep 30, 2017

ocharles commented Sep 30, 2017

ocharles commented Sep 30, 2017

NicolasT commented Sep 30, 2017

ocharles commented Sep 30, 2017

NicolasT commented Oct 1, 2017 • edited Loading

NicolasT commented Oct 1, 2017

NicolasT commented Oct 1, 2017

ocharles commented Oct 2, 2017

NicolasT commented Oct 1, 2017 •

edited

Loading