-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent deletion of topics during recovery. Fixes #2329 (partially) #2388
Conversation
@confluentinc It looks like @agavra just signed our Contributor License Agreement. 👍 Always at your service, clabot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @agavra,
Is this a temporary fix until we put in a large fix that covers both topic creation as well as deletion?
If so, then it LGTM.
However, if this is a more long term fix then it feels to me that StatementExecutor
is the wrong place to be massaging commands to stop topics being deleted. The StatementExecutor
shouldn't care about such things. TBH it shouldn't know anything about recover at all, i.e. mode should not exist, but that's outside the scope of this PR!
I'm hoping the long term solution is to pass in a special topic client that tracks creations and deletions, so that we can calculate the set we need to ensure exist.
ksql-parser/src/main/java/io/confluent/ksql/parser/tree/AbstractStreamDropStatement.java
Outdated
Show resolved
Hide resolved
ksql-parser/src/main/java/io/confluent/ksql/parser/tree/AbstractStreamDropStatement.java
Outdated
Show resolved
Hide resolved
return false; | ||
} | ||
|
||
final AbstractStreamDropStatement that = (AbstractStreamDropStatement) o; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equals
and hashCode
aren't taking super
state into account... is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even need hashcode and equals? If we don't we should remove them. If we do, we should test them! (are you familiar with EqualsTester
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super
doesn't actually have an implementation of either hashCode
/equals
. my guess is that we do actually need them since Node
(the superclass) explicitly makes them abstract
so that I need to implement it.
I am not familiar with EqualsTester
! I'll look into it and hopefully add another tool to my Java arsenal 🔫
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a feeling you can remove these methods and everything will still compile and work. There may be a few test failures, because the tests are relying on the broken implementations, but maybe they would be better switched over to using a custom Hamcrest Matcher?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EqualsTester
rocks. There's also one for NPE testing.
ksql-parser/src/main/java/io/confluent/ksql/parser/tree/DropStream.java
Outdated
Show resolved
Hide resolved
ksql-parser/src/main/java/io/confluent/ksql/parser/tree/DropTable.java
Outdated
Show resolved
Hide resolved
@big-andy-coates I totally agree that this breaks some abstraction barriers. The fix is semi-temporary in the sense that I've reviewed various different options (see discussion on linked issue #2329) and didn't really like any of them. The change you suggest, passing in a special "delayed" topic client, was one of the solutions that I considered but that also seemed a little hacky (and error-prone, since the topic client is leaked all over the code base). After discussing with @hjafarpour, we decided to just patch this up and delay a more fundamental rethink of the headless execution mode. The benefit of this approach is that it is very well contained and we can nuke this band-aid easily ;) |
07157e8
to
78cec76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @agavra, LGTM.
I would say its probably worth trying to remove those dodging hashcode and equals impls and seeing if they are needed by production code. If they are needed, then implement them properly or use a different method. If its just tests then consider a custom Matcher.
Of course, your refactor hasn't changed this functionality, so I appreciate you may just want to move on. This is not a blocker to merging IMHO.
After giving this more thought, I realized that this PR trades a fix for one serious bug that happens rarely (deleting a topic a user created outside of KSQL) with a bug that happens every time we recover (re-creating topics that we deleted, and then not deleting them). I don't think this tradeoff is acceptable given that this PR is already a little hacky. Instead, I will go with @big-andy-coates's suggestion and implement a recovery specific topic client to lazily create topics. |
Description
This is a minimally invasive change to prevent us from deleting topics that were potentially recreated during recovery mode for the interactive KSQL server. Today, we replay all events in the case of a crash (including create/delete topics). This causes an issue if a user issues a delete, expects KSQL to relinquish ownership of that topic and creates it again outside of KSQL. In this scenario, recovery mode will delete the recreated topic.
This is only a partial fix. We still execute create topic commands during recovery, so it is possible that we will recreate a topic that the user manually deleted, and leave it there as a zombie topic. This is less serious and the scope of that change is much larger, so we will leave that for another time.
See #2329 for further discussion.
P.S. This also includes a small refactor of common functionality in
AbstractStreamDropCommand
Testing done
Reviewer checklist