Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent deletion of topics during recovery. Fixes #2329 (partially) #2388

Closed
wants to merge 2 commits into from

Conversation

agavra
Copy link
Contributor

@agavra agavra commented Jan 23, 2019

Description

This is a minimally invasive change to prevent us from deleting topics that were potentially recreated during recovery mode for the interactive KSQL server. Today, we replay all events in the case of a crash (including create/delete topics). This causes an issue if a user issues a delete, expects KSQL to relinquish ownership of that topic and creates it again outside of KSQL. In this scenario, recovery mode will delete the recreated topic.

This is only a partial fix. We still execute create topic commands during recovery, so it is possible that we will recreate a topic that the user manually deleted, and leave it there as a zombie topic. This is less serious and the scope of that change is much larger, so we will leave that for another time.

See #2329 for further discussion.

P.S. This also includes a small refactor of common functionality in AbstractStreamDropCommand

Testing done

  • Unit testing
  • Restarting a custom KSQL server and attaching a debugger to validate end-to-end

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@agavra agavra requested review from rodesai and a team January 23, 2019 18:33
@ghost
Copy link

ghost commented Jan 23, 2019

@confluentinc It looks like @agavra just signed our Contributor License Agreement. 👍

Always at your service,

clabot

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agavra,

Is this a temporary fix until we put in a large fix that covers both topic creation as well as deletion?

If so, then it LGTM.

However, if this is a more long term fix then it feels to me that StatementExecutor is the wrong place to be massaging commands to stop topics being deleted. The StatementExecutor shouldn't care about such things. TBH it shouldn't know anything about recover at all, i.e. mode should not exist, but that's outside the scope of this PR!

I'm hoping the long term solution is to pass in a special topic client that tracks creations and deletions, so that we can calculate the set we need to ensure exist.

return false;
}

final AbstractStreamDropStatement that = (AbstractStreamDropStatement) o;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equals and hashCode aren't taking super state into account... is that intentional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even need hashcode and equals? If we don't we should remove them. If we do, we should test them! (are you familiar with EqualsTester?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super doesn't actually have an implementation of either hashCode/equals. my guess is that we do actually need them since Node (the superclass) explicitly makes them abstract so that I need to implement it.

I am not familiar with EqualsTester! I'll look into it and hopefully add another tool to my Java arsenal 🔫

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling you can remove these methods and everything will still compile and work. There may be a few test failures, because the tests are relying on the broken implementations, but maybe they would be better switched over to using a custom Hamcrest Matcher?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EqualsTester rocks. There's also one for NPE testing.

@agavra
Copy link
Contributor Author

agavra commented Jan 24, 2019

@big-andy-coates I totally agree that this breaks some abstraction barriers. The fix is semi-temporary in the sense that I've reviewed various different options (see discussion on linked issue #2329) and didn't really like any of them. The change you suggest, passing in a special "delayed" topic client, was one of the solutions that I considered but that also seemed a little hacky (and error-prone, since the topic client is leaked all over the code base).

After discussing with @hjafarpour, we decided to just patch this up and delay a more fundamental rethink of the headless execution mode. The benefit of this approach is that it is very well contained and we can nuke this band-aid easily ;)

@agavra agavra requested a review from a team January 24, 2019 00:57
Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @agavra, LGTM.

I would say its probably worth trying to remove those dodging hashcode and equals impls and seeing if they are needed by production code. If they are needed, then implement them properly or use a different method. If its just tests then consider a custom Matcher.

Of course, your refactor hasn't changed this functionality, so I appreciate you may just want to move on. This is not a blocker to merging IMHO.

@big-andy-coates big-andy-coates requested a review from a team January 24, 2019 10:35
@agavra
Copy link
Contributor Author

agavra commented Jan 24, 2019

After giving this more thought, I realized that this PR trades a fix for one serious bug that happens rarely (deleting a topic a user created outside of KSQL) with a bug that happens every time we recover (re-creating topics that we deleted, and then not deleting them). I don't think this tradeoff is acceptable given that this PR is already a little hacky. Instead, I will go with @big-andy-coates's suggestion and implement a recovery specific topic client to lazily create topics.

@agavra agavra closed this Jan 24, 2019
@agavra agavra deleted the no_delete_recovery branch January 24, 2019 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants