Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DirectSender.Flush() not performing a full flush #25

Open
prydin opened this issue Apr 24, 2019 · 3 comments
Open

DirectSender.Flush() not performing a full flush #25

prydin opened this issue Apr 24, 2019 · 3 comments

Comments

@prydin
Copy link
Collaborator

prydin commented Apr 24, 2019

When DirectSender.Flush() is called, it calls Flush() on the underlying LineHandlers. LineHandler.Flush() only performs a flush of a single batch, leaving the rest of the buffer in memory. Typically, Flush() on something means that you'd block until all data is written to some backend storage. I suggest we change the code to call FlushAll() on the LineHandlers instead, since this would be more in line with typical semantics of a Flush() call.

If we are worried about flooding Wavefront with points, we may want to offer a FlushPartial() or FlushCurrentBatch(), but, in my opinion, it's confusing to have Flush() do something that's not inline with what programmers typically expect.

Even if we don't change the behavior of Flush() the senders need to offer a FlushAll(). This is crucial if you have a data collector that doesn't run continuously, e.g. something that's started from cron, does its job and then dies. Such a program needs a way to ensure that everything is written to the backend before it exits.

@LukeWinikates
Copy link
Contributor

I would like to go ahead and make this change.

I am thinking:

  • look at the places where we call Flush() in the SDK today, and understand which of those want to flush a batch and which would be ok with Flush() in the traditional sense.
  • introduce FlushBatch() or flushBatch(), and call that where appropriate
  • change the semantics of Flush() to do a full flush
  • call out the semantic change in the docs

@LukeWinikates
Copy link
Contributor

additional thoughts from talking with @priyaselvaganesan and looking at the equivalent code in the Java SDK.

It looks like the flush() method in the Java SDK may also be only flushing batches, not performing a full flush.

We wonder about the safety of performing a full flush. What if the SDK is sending to a proxy that is already overwhelmed? Will the proxy throttle those requests? Will it create backpressure? Would we want flush to block indefinitely if the proxy is making us wait? Or would this not happen because the SDK buffer can only be so big anyway?

What if the flush takes a really long time? Should there be a timeout?

Thinking about renaming Flush to FlushBatch, with the current semantics, and possibly adding a full Flush option later (possibly with a context parameter so we can enforce a timeout).

@LukeWinikates
Copy link
Contributor

Another thought:

I'm pretty sure Flush() will re-queue failed metrics that it thinks will be sendable in future batches. So if we naively made FlushAll()by repeatedly calling the current Flush() until the buffer was empty, I think we could get into an infinite loop.

Maybe failed lines need a retry limit. Maybe FlushAll() has a different implementation from Flush(), and doesn't re-queue metrics that failed to send.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants