Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

httpbp: Fix CircuitBreaker #428

Merged
merged 1 commit into from
Sep 23, 2021
Merged

httpbp: Fix CircuitBreaker #428

merged 1 commit into from
Sep 23, 2021

Conversation

fishy
Copy link
Member

@fishy fishy commented Sep 23, 2021

Fix one bug: when status code >= 500 and closing the body does not error
(it won't error in majority of cases), we do not actually report error
to the circuit breaker.

Also fix a few inefficiencies:

  1. When loaded is true, newBreaker is not used, so we can return it back
    to the pool immediately, instead of defer'd.
  2. Use DrainAndClose instead of ReadAll and Close, as
    io.Copy(io.Discard) is more efficient than ReadAll.
  3. Use pointer for the breaker in the pool and the map.

Also avoid dealing with interface{} from breaker.Execute, and add a test
to CircuitBreaker.

@fishy fishy requested a review from a team as a code owner September 23, 2021 00:04
@fishy fishy requested review from kylelemons and konradreiche and removed request for a team September 23, 2021 00:04
@@ -150,28 +150,27 @@ func CircuitBreaker(config breakerbp.Config) ClientMiddleware {
newBreaker := pool.Get()
breaker, loaded := breakers.LoadOrStore(host, newBreaker)
if loaded {
defer pool.Put(newBreaker)
pool.Put(newBreaker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also zero out newBreaker (defensively)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everything in the breaker is unexported so there's no way to zero it out. also I'd rather trust sync.Map.LoadOrStore is implemented correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean newBreaker = nil so that if someone uses it, it will fail instead of causing a race

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that totally defeats the purpose of using a pool.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're missing my point, I think.

If you put newbreaker into the pool, and then someone accidentally uses it, it is a race condition.

Either leave the defer, which doesn't have this issue, or nil out the variable so that it can't be misused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are missing the point of this code.

here we want to get the breaker out of the map (sync.Map), with the requirement that:

  1. two calls with the same key should always get the same breaker out of the map.
  2. two calls with different keys should always get the different breakers out of the map.

sync.Map.Load does not work, because it will return nil if this is the first ever call to this key, and then you need to initialize the breaker and store it back to the map, and that's no longer atomic.

this is why to use LoadOrStore here. with LoadOrStore, you always need a new breaker standing by. if this is the first call with this key, the new breaker will be stored into the map and returned back to you. if this is not the first call with this key, the breaker stored by the first call will be returned, and the new breaker is not used.

this is why we can put newbreaker back to the pool if loaded is true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate your explanation; this is how I understood the code already, so this doesn't change my stance. Let me try to be more clear:

I am not concerned about the correctness of the code today, I am concerned about potential bugs that could be introduced in the future.

One of the reasons that defer is such a powerful language feature is that it can prevent future changes of the code from introducing subtle bugs. With defer here, the value stored in newBreaker is not returned to the pool until the function is returning, and so if someone in 6 months accidentally decides to use it, it's a logical bug, but not a data race. If you replace the value early, then there is a value stored in the newBreaker variable that is still in scope here and in the pool -- so there is potential for misuse, because it could be returned by the pool to another goroutine.

There are few different classes of fix:

  • Remove the value from the scope -- i.e. set newBreaker to nil after storing its value in the pool
  • Remove the variable from the scope -- i.e. make a local closure or anonymous scope to ensure newBreaker is only in scope while doing the (get+loadOrStore)
  • Keep the defer so that a future misuse of the variable does not introduce a race

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I didn't realize that you meant set it to nil after stored it back to the pool.

fwiw the first 2 options are legit, the 3rd one (defer) is not. we should only store unused breakers back to the pool, never used ones. (breakers are safe for concurrent use, so race condition is not the concern, storing a used one is).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also fwiw I already wrote a library for the enclosed version: https://pkg.go.dev/go.yhsif.com/defaultdict

var resp *http.Response
_, err := breaker.(breakerbp.FailureRatioBreaker).Execute(func() (interface{}, error) {
var err error
resp, err = next.RoundTrip(req)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't leak values out of the current scope before the error has been checked. This has resulted in hours or days of debugging in the past, since a partial value can leak FAR from where it is created before it causes an issue, and the change that triggers it is almost always very far from where it gets detected.

r, err := ...
if err != nil { ... }
resp = r

@@ -150,28 +150,27 @@ func CircuitBreaker(config breakerbp.Config) ClientMiddleware {
newBreaker := pool.Get()
breaker, loaded := breakers.LoadOrStore(host, newBreaker)
if loaded {
defer pool.Put(newBreaker)
pool.Put(newBreaker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean newBreaker = nil so that if someone uses it, it will fail instead of causing a race

@@ -150,28 +150,27 @@ func CircuitBreaker(config breakerbp.Config) ClientMiddleware {
newBreaker := pool.Get()
breaker, loaded := breakers.LoadOrStore(host, newBreaker)
if loaded {
defer pool.Put(newBreaker)
pool.Put(newBreaker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're missing my point, I think.

If you put newbreaker into the pool, and then someone accidentally uses it, it is a race condition.

Either leave the defer, which doesn't have this issue, or nil out the variable so that it can't be misused.

@fishy fishy force-pushed the httpbp-breaker branch 2 times, most recently from 950fe67 to 5bc1431 Compare September 23, 2021 01:57
Fix one bug: when status code >= 500 and closing the body does not error
(it won't error in majority of cases), we do not actually report error
to the circuit breaker.

Also fix a few inefficiencies:

1. When loaded is true, newBreaker is not used, so we can return it back
   to the pool immediately, instead of defer'd.
2. Use DrainAndClose instead of ReadAll and Close, as
   io.Copy(io.Discard) is more efficient than ReadAll.
3. Use pointer for the breaker in the pool and the map.

Also avoid dealing with interface{} from breaker.Execute, and add a test
to CircuitBreaker.
@fishy
Copy link
Member Author

fishy commented Sep 23, 2021

💇

@fishy fishy merged commit 6e19f3f into reddit:master Sep 23, 2021
@fishy fishy deleted the httpbp-breaker branch September 23, 2021 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants