Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent leaking goroutines from websocket connections #253

Merged

Conversation

amerry
Copy link
Contributor

@amerry amerry commented Oct 5, 2018

The write to the CloseNotify() channel was blocking, preventing the
Read() call from exiting, and preventing ServeHTTP from returning. This
meant that requests tended to stay around forever in a deadlocked state.

Using the context package prevent this, as the cancellation is
non-blocking. It depends on Go 1.7.

Note that the writer still needs to support the CloseNotifier interface,
otherwise other things break, but it does not seem to be necessary to
signal using it.

The write to the CloseNotify() channel was blocking, preventing the
Read() call from exiting, and preventing ServeHTTP from returning. This
meant that requests tended to stay around forever in a deadlocked state.

Using the context package prevent this, as the cancellation is
non-blocking. It depends on Go 1.7.

Note that the writer still needs to support the CloseNotifier interface,
otherwise other things break, but it does not seem to be necessary to
signal using it.
Copy link
Contributor

@johanbrandhorst johanbrandhorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution! Just a few questions.

@@ -148,14 +149,17 @@ func (w *WrappedGrpcServer) handleWebSocket(wsConn *websocket.Conn, req *http.Re
return
}

ctx, cancelFunc := context.WithCancel(req.Context())
defer cancelFunc()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to explicitly cancel this? Can we not rely on the built-in request context being cancelled?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can rely on the request context we don't need to pass around a cancelFunc either. I assume you've tested some of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we've got a websocket client that makes a request every second, so the memory leak is really obvious, and I put in some logging for testing (tracking the start and end of handleWebSocket, as well as before and after the cancellation step in the reader).

I tested a bunch of different configurations. Not cancelling the request context also prevented connections from closing properly; I'm not sure why (you would expect w.server.ServeHTTP to pick up from the error returned from the Body that it would need to cancel the request, but this doesn't seem to happen).

I'll admit that I didn't go diving into the lower-level code to see what was going on - this was mostly by trial-and-error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I'm not worried that this will cause any real problems, since the request is over once this function has exited, just thought it was a bit overzealous, but if your testing implies the necessity, I'm OK with it!

@@ -148,14 +149,17 @@ func (w *WrappedGrpcServer) handleWebSocket(wsConn *websocket.Conn, req *http.Re
return
}

ctx, cancelFunc := context.WithCancel(req.Context())
defer cancelFunc()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I'm not worried that this will cause any real problems, since the request is over once this function has exited, just thought it was a bit overzealous, but if your testing implies the necessity, I'm OK with it!

@johanbrandhorst
Copy link
Contributor

Would like a confirmation from @MarcusLongmuir that we're OK requiring Go 1.7+ for building this now.

@amerry
Copy link
Contributor Author

amerry commented Oct 5, 2018

Oh, wait, I think I misunderstood. That defer there isn't necessary to prevent the goroutine leaking. I just put it in based on the context.WithCancel documentation, which seemed to suggest it was good practice.

It's the other cancel call that's necessary - the one in the reader code.

@johanbrandhorst
Copy link
Contributor

Let me rephrase: can you categorically confirm that just removing the channel doesn't work? I imagine the request context will be cancelled when the client finishes reading.

@amerry
Copy link
Contributor Author

amerry commented Oct 5, 2018

Yes, I can confirm that.

@johanbrandhorst
Copy link
Contributor

Alright well then this seems an appropriate fix :). Lets wait for CI to give the go ahead.

@johanbrandhorst
Copy link
Contributor

2 failures, I've optimistically rerun them because I don't know if they were just flaky browser tests.

@johanbrandhorst
Copy link
Contributor

Going to give the last error another go 😬

@johanbrandhorst
Copy link
Contributor

Just going to give @MarcusLongmuir and @easyCZ a couple of days to put anymore comments on this but LGTM!

@johanbrandhorst
Copy link
Contributor

OK I think we've had enough time to think about this, going to merge, thanks for your contribution!

@johanbrandhorst johanbrandhorst merged commit 1464bd9 into improbable-eng:master Oct 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants