Ksql 5716/close transient queries when network splits #6905
@confluentinc It looks like @swist just signed our Contributor License Agreement. 👍 Always at your service, clabot
Force-pushed from 12fb072 to bc91031
...app/src/main/java/io/confluent/ksql/rest/server/resources/streaming/WebSocketSubscriber.java
Outdated
...est-app/src/main/java/io/confluent/ksql/rest/server/resources/streaming/WSQueryEndpoint.java
@swist is it feasible to add test coverage in
Force-pushed from bc91031 to 7a68084
final String msg = String.format("Websocket %s closed, reason: %s, code: %s)",
websocket.textHandlerID(),
websocket.closeReason(),
websocket.closeStatusCode());
log.debug(msg);
I recommend using the log.debug parameters to pass variables to the string. Like:
log.debug("Websocket {} closed, reason: {}, code: {})",
websocket.textHandlerID(),
websocket.closeReason(),
websocket.closeStatusCode());
Passing the variables to the LOG instead of using the String.format() has the benefit that if DEBUG is disabled (in most cases), then the string will not be formatted. However, if using String.format(), then the string is always formatted even if the log doesn't print it because debug is disabled.
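To make the cost difference concrete, here is a minimal sketch using only the JDK. `LazyLogDemo`, `Tracked`, `debugEager` and `debugLazy` are hypothetical names invented for this illustration; they stand in for a logger with a level check and are not the logging library's own API. The `Tracked` argument counts how often it is turned into a string, so the eager and parameterized styles can be compared with DEBUG switched off.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a logger with a level check (assumption, not a real logging API).
class LazyLogDemo {
  static boolean debugEnabled = false;
  static final AtomicInteger renders = new AtomicInteger();

  // Argument whose toString() counts how many times it is rendered.
  static final class Tracked {
    private final String value;

    Tracked(final String value) {
      this.value = value;
    }

    @Override
    public String toString() {
      renders.incrementAndGet();
      return value;
    }
  }

  // Eager style: the caller has already built the message before the level check runs.
  static String debugEager(final String msg) {
    return debugEnabled ? msg : null;
  }

  // Parameterized style: "{}" placeholders are substituted only when DEBUG is enabled.
  static String debugLazy(final String template, final Object... args) {
    if (!debugEnabled) {
      return null; // arguments are never converted to strings
    }
    final StringBuilder sb = new StringBuilder();
    int from = 0;
    for (final Object arg : args) {
      final int at = template.indexOf("{}", from);
      if (at < 0) {
        break;
      }
      sb.append(template, from, at).append(arg);
      from = at + 2;
    }
    return sb.append(template.substring(from)).toString();
  }

  public static void main(final String[] args) {
    debugEager(String.format("closed, reason: %s", new Tracked("gone")));
    debugLazy("closed, reason: {}", new Tracked("gone"));
    System.out.println("toString() calls with DEBUG off: " + renders.get());
  }
}
```

With DEBUG off, only the `String.format` call renders its argument; the parameterized call returns before touching it.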
Ah yes, was looking for something like this!
private void attachCloseHandler(final ServerWebSocket websocket,
final WebSocketSubscriber<?> subscriber) {
websocket.closeHandler(v -> {
subscriber.close();
can subscriber be null? I see that validation in the code you removed. Should it be checked before creating the closeHandler?
There isn't really a path that makes it null, but it makes sense to be more defensive there, I'll bring back the check!
Thanks for the fix @swist ! LGTM pending the comments Colin and Sergio have left.
This is actually a pretty scary bug. WSQueryEndpoint is a singleton, but each call to executeStreamQuery handles a new websocket connection. Whenever we handled a query, we updated the subscription on the singleton. As a result, whenever a websocket closed, we only closed the query that was currently attached to WSQueryEndpoint and lexically closed over in the lambda passed into ServerWebSocket.closeHandler.
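A stripped-down sketch of that failure mode, using only the JDK. Every name here (`SharedFieldBugDemo`, `Subscriber`, `BuggyEndpoint`, `FixedEndpoint`, `handle`) is hypothetical and chosen for illustration; it assumes the old code kept the subscriber in a field on the singleton, and it does not reproduce the actual WSQueryEndpoint code.

```java
// Hypothetical model of a singleton endpoint serving several connections (assumption).
class SharedFieldBugDemo {

  static final class Subscriber {
    final String name;
    boolean closed;

    Subscriber(final String name) {
      this.name = name;
    }

    void close() {
      closed = true;
    }
  }

  // Buggy shape: per-connection state lives in a field on the shared instance.
  static final class BuggyEndpoint {
    private Subscriber subscriber;

    Runnable handle(final Subscriber s) {
      this.subscriber = s; // overwritten by every new connection
      // The lambda captures `this`, so it reads whatever the field holds at close time.
      return () -> {
        if (subscriber != null) {
          subscriber.close();
        }
      };
    }
  }

  // Fixed shape: each close handler captures its own subscriber.
  static final class FixedEndpoint {
    Runnable handle(final Subscriber s) {
      return s::close;
    }
  }

  public static void main(final String[] args) {
    final BuggyEndpoint buggy = new BuggyEndpoint();
    final Subscriber first = new Subscriber("first");
    final Subscriber second = new Subscriber("second");
    final Runnable closeFirst = buggy.handle(first);
    final Runnable closeSecond = buggy.handle(second);
    closeFirst.run();
    closeSecond.run();
    System.out.println(first.name + " closed: " + first.closed);
    System.out.println(second.name + " closed: " + second.closed);
  }
}
```

In the buggy shape both close handlers act on the most recently attached subscriber, so the earlier query is left running; passing the subscriber into the handler as a parameter, as attachCloseHandler does, avoids the shared state.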
Force-pushed from 7a68084 to 3ea65b6
lgtm
I'm going to merge this without the test for the private method behaviour, but will have a look at refactoring the websocket code a bit later.
Description
We've observed a behaviour where if there are two concurrent PushQueries running against a KSQL cluster, upon terminating the websocket in the browser, one of the queries lingers.
Testing done
This is a right pain to test because everything useful gets mocked, so I'm more than happy to take pointers about testing this unit-test wise. I have tested that the behaviour is now correct locally by doing the following:
Prior to the change one of the queries would always stay when running show queries from another client. After the change it does not. Body of commit 12fb072 has a decent explanation of why this might be the case.
Reviewer checklist