Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to debug & fix TestReconnectBufferedUNIX for real #421

Merged
merged 1 commit into from
Mar 22, 2018

Conversation

asf-stripe
Copy link
Contributor

@asf-stripe asf-stripe commented Mar 22, 2018

Summary

This PR hopefully 💯% fixes the TestReconnectBufferedUNIX for good. Double non-determinism in select default branches means we have to loop for the outer as well as the inner result.

What happened was: Flushing a TraceBackend happens in two steps:

  • One, FlushAsync kicks off the flush in the background. If the goroutine is busy/not reading from the chan, this can return ErrWouldBlock. We handled this case!
  • Two, the background goroutine will loop over flushable backends and try to kick off the flush on each. This also errs towards not flushing & setting ErrWouldBlock if the backend is busy. We didn't handle this case!

The part of the test that this was in was expecting a "real" error (not ErrWouldBlock) from the flush, and so in the relatively rare cases that ErrWouldBlock was returned, it would not have kicked off the flush and the later parts of the test would fail. Now, we handle both cases, so the test should correctly be in the right state always.

Motivation

This broke travis ci builds a bunch.

Test plan

Ran Travis CI tests a bunch

Rollout/monitoring/revert plan

Just merge, no changelog necessary.

@asf-stripe asf-stripe force-pushed the asf-more-testreconnect-buffered-unix-debugging branch from 0567a66 to 6a9df00 Compare March 22, 2018 18:55
@asf-stripe asf-stripe changed the title [WIP] attempt to debug & fix TestReconnectBufferedUNIX for real Attempt to debug & fix TestReconnectBufferedUNIX for real Mar 22, 2018
@stripe-ci
Copy link

Gerald Rule: Copy Observability on Veneur and Unilog pull requests

cc @stripe/observability
cc @stripe/observability-stripe

Turns out it was possible that a flush could be kicked off but could
not reach the flushBackend, returning yet another ErrWouldBlock (which
counted as an error, but broke assumptions further down). Now, we loop
until both the outer and inner flush have succeeded, which should
ensure this test succeeds on Travis.
@asf-stripe asf-stripe force-pushed the asf-more-testreconnect-buffered-unix-debugging branch from 6a9df00 to 3bc424b Compare March 22, 2018 19:08
@asf-stripe
Copy link
Contributor Author

R, @aditya-stripe?

@ChimeraCoder
Copy link
Contributor

lgtm

@asf-stripe asf-stripe merged commit 6f0d2de into master Mar 22, 2018
@asf-stripe asf-stripe deleted the asf-more-testreconnect-buffered-unix-debugging branch March 22, 2018 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants