Panic when not able to delete compaction waste #573

Merged
merged 4 commits into from
Nov 8, 2024

Conversation

wi11dey
Contributor

@wi11dey commented Nov 7, 2024

  • A failed rollback can leave behind a compaction product that contains data. If we continue to compact after this, a tombstone can arrive for some of the data in the failed product, expire, and be purged, leaving neither the tombstone nor the data anywhere except in the failed product. The data in the failed product will then be resurrected on the next restart.
  • A similar situation can occur if we fail to write to the write-ahead log and leave the ancestors in place after the view has shifted to the finalized products. These ancestors will not be wiped on the next restart because they were not written to the write-ahead log.
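
The first scenario can be walked through with a toy model. This is only an illustrative sketch, not Cassandra code: the class `ResurrectionSketch`, the file names, and the idea of representing each file as a small map are all hypothetical, ordering by insertion stands in for timestamps, and the restart is assumed to load every file still on disk.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: names and structure are hypothetical, not Cassandra internals.
final class ResurrectionSketch {
    static final String TOMBSTONE = "<tombstone>";

    // Merge the named files in insertion order (later files win), hiding tombstoned keys.
    static String read(Map<String, Map<String, String>> disk, Set<String> view, String key) {
        String value = null;
        for (String name : view) {
            Map<String, String> file = disk.get(name);
            if (file.containsKey(key)) value = file.get(key);
        }
        return TOMBSTONE.equals(value) ? null : value;
    }

    // Returns {value of "k" before restart, value of "k" after restart}.
    static String[] simulate() {
        Map<String, Map<String, String>> disk = new LinkedHashMap<>(); // files on disk
        Set<String> view = new LinkedHashSet<>();                      // files the node reads from

        disk.put("ancestor", Map.of("k", "v"));
        view.add("ancestor");

        // A compaction fails and its rollback cannot delete the product:
        // the file stays on disk but never enters the view.
        disk.put("failed-product", new HashMap<>(disk.get("ancestor")));

        // A delete for "k" arrives.
        disk.put("tombstone", Map.of("k", TOMBSTONE));
        view.add("tombstone");

        // Compaction continues after the tombstone has expired:
        // the tombstone shadows "v" and is then purged itself.
        Map<String, String> merged = new HashMap<>();
        for (String name : view) merged.putAll(disk.get(name));
        merged.values().removeIf(TOMBSTONE::equals);
        disk.keySet().removeAll(view); // compacted inputs are deleted
        view.clear();
        disk.put("product", merged);
        view.add("product");
        String beforeRestart = read(disk, view, "k");

        // Restart (assumed here to load every file left on disk).
        view = new LinkedHashSet<>(disk.keySet());
        String afterRestart = read(disk, view, "k");
        return new String[] {beforeRestart, afterRestart};
    }
}
```

In this model `simulate()` reports the key as deleted before the restart and as present again afterwards, because the only remaining copy lives in the leftover product and nothing is left to shadow it.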

In both cases, we could prevent resurrection by refusing to compact once the failure has been encountered, but that would cause the node to fall behind on compactions, which would take longer to resolve than a bounce. We haven't observed either case directly, which suggests the failure mode is infrequent enough that a forced restart is an acceptable alternative to stopping compactions, and one that doesn't risk extended performance degradation.
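
The trade-off described above could be sketched roughly as follows. This is not the code in this PR: `CompactionWasteCleanup`, `deleteOrPanic`, and the injected `fatalHandler` are hypothetical names used only to show the shape of "fail hard when leftover files cannot be deleted". Only `Files.deleteIfExists` is a real JDK call here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not the actual change in this PR.
final class CompactionWasteCleanup {
    /**
     * Tries to delete every leftover file. If any deletion fails, the fatal
     * handler is invoked and no further work is attempted, so compaction
     * never continues alongside files that could resurrect data.
     */
    static void deleteOrPanic(List<Path> waste, Consumer<Throwable> fatalHandler) {
        for (Path file : waste) {
            try {
                Files.deleteIfExists(file);
            } catch (IOException | RuntimeException e) {
                fatalHandler.accept(
                    new IllegalStateException("Could not delete compaction waste " + file, e));
                return;
            }
        }
    }
}
```

In a real node the handler would presumably stop the process (for example via `Runtime.getRuntime().halt(1)`). Passing it in as a parameter is a choice made for this sketch so that the behaviour can be exercised without killing the JVM.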

@wi11dey force-pushed the wdey/panic-on-non-delete branch 2 times, most recently from 12d80d8 to 6199665, November 7, 2024 20:27
@wi11dey merged commit cfbbb42 into palantir-cassandra-2.2.18 Nov 8, 2024
6 checks passed