[fix][ml] Fix race conditions in RangeCache #22789
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #22789 +/- ##
============================================
- Coverage 73.57% 73.27% -0.31%
- Complexity 32624 32648 +24
============================================
Files 1877 1889 +12
Lines 139502 141659 +2157
Branches 15299 15543 +244
============================================
+ Hits 102638 103800 +1162
- Misses 28908 29844 +936
- Partials 7956 8015 +59
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/util/RangeCache.java
Very nice work
This PR contained a few issues. I have a follow-up PR, #22814, to address them. Please review.
(cherry picked from commit c39f9f8)
Motivation
The RangeCache class contains several race conditions which cause instability.
When one thread removes the entry and another one uses it, that will become a problem.
The cacheEvictionIntervalMs setting is 10 ms by default. This results in the RangeCache.evictLEntriesBeforeTimestamp method getting called about 100 times per second. The default expiration is managedLedgerCacheEvictionTimeThresholdMillis, which is 1000 ms by default.
It's also possible that 2 threads remove the entry at the same time. ManagedLedgerImpl.invalidateEntriesUpToSlowestReaderPosition will result in calls to the RangeCache.removeRange method. These calls happen independently of the RangeCache.evictLEntriesBeforeTimestamp calls, so there's a chance for race conditions.
Modifications
.retain() and .release() calls
remove(key, value) method so that removals remove the correct value exactly once
Documentation
doc
doc-required
doc-not-needed
doc-complete
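To illustrate the two ideas named under Modifications, here is a minimal sketch, not the code from this PR. It assumes a hypothetical reference-counted value class (CountedValue) and a hypothetical cache class (RangeCacheSketch) backed by a ConcurrentSkipListMap. Neither name comes from RangeCache.java, and the real class works with Netty-style reference counting rather than this hand-rolled counter. The sketch shows a reader retaining a value before using it, and a remover releasing the cache's reference only when its own remove(key, value) call succeeded, so two concurrent removers cannot both release the same value.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; class and method names are illustrative assumptions,
// not the actual RangeCache implementation.
class RangeCacheSketch {

    /** Minimal reference-counted value; starts with one reference owned by the cache. */
    static final class CountedValue {
        final String payload;
        private final AtomicInteger refCnt = new AtomicInteger(1);
        final AtomicInteger deallocations = new AtomicInteger();

        CountedValue(String payload) {
            this.payload = payload;
        }

        /** Takes a reference unless the count has already dropped to zero. */
        boolean tryRetain() {
            while (true) {
                int current = refCnt.get();
                if (current <= 0) {
                    return false; // already released by the last owner
                }
                if (refCnt.compareAndSet(current, current + 1)) {
                    return true;
                }
            }
        }

        /** Drops one reference; the last release counts as a deallocation. */
        void release() {
            int remaining = refCnt.decrementAndGet();
            if (remaining < 0) {
                throw new IllegalStateException("released more often than retained");
            }
            if (remaining == 0) {
                deallocations.incrementAndGet();
            }
        }

        int refCnt() {
            return refCnt.get();
        }
    }

    private final ConcurrentSkipListMap<Long, CountedValue> entries = new ConcurrentSkipListMap<>();

    /** Inserts the value if the key is absent; the map takes over the initial reference. */
    boolean put(long key, CountedValue value) {
        return entries.putIfAbsent(key, value) == null;
    }

    /** Returns a retained value that the caller must release, or null if unavailable. */
    CountedValue get(long key) {
        CountedValue value = entries.get(key);
        // A concurrent remover may already have released the value; tryRetain detects that.
        return (value != null && value.tryRetain()) ? value : null;
    }

    /** Removes the mapping; only the caller whose remove(key, value) succeeds releases. */
    boolean remove(long key) {
        CountedValue value = entries.get(key);
        if (value != null && entries.remove(key, value)) {
            value.release();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        RangeCacheSketch cache = new RangeCacheSketch();
        cache.put(1L, new CountedValue("entry"));
        CountedValue held = cache.get(1L);   // reader holds its own reference
        cache.remove(1L);                    // eviction releases the cache's reference
        cache.remove(1L);                    // a second removal finds nothing and releases nothing
        System.out.println(held.payload + " refCnt=" + held.refCnt());
        held.release();                      // reader drops the last reference
    }
}
```

In this sketch the conditional remove(key, value) is what prevents a double release: if the eviction path and the range-removal path both observe the same value, only one of them wins the removal and calls release(). The reader side is safe because it either obtains its own reference or gets null.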