-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix][ML]Fix NPE when put value to RangeCache
.
#15707
[fix][ML]Fix NPE when put value to RangeCache
.
#15707
Conversation
RangeCache
.RangeCache
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
This may be a core cause of topic fenced, but we are still confirming. |
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/util/RangeCache.java
Outdated
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/util/RangeCache.java
Outdated
Show resolved
Hide resolved
### Motivation When `ReferenceCounted` object overrides the method `deallocate` to make the `getLength` value equal null will cause NPE because the `RangeCache#put` method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :) Pulsar implementation may throw an exception to make `OpAddEntry` fail abnormal and fence the topic. relative code as below: https://github.com/apache/pulsar/blob/defeec0e84a63ea865f3a2790bc61b66a02254c5/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L211-L217 **Exception screenshot:** ```java java.lang.NullPointerException: Cannot invoke "String.length()" because "value.s" is null at org.apache.bookkeeper.mledger.util.RangeCacheTest.lambda$testInParallel$6(RangeCacheTest.java:279) at org.apache.bookkeeper.mledger.util.RangeCache.put(RangeCache.java:77) at org.apache.bookkeeper.mledger.util.RangeCacheTest.testInParallel(RangeCacheTest.java:283) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:577) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at org.testng.TestRunner.privateRun(TestRunner.java:764) at org.testng.TestRunner.run(TestRunner.java:585) at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) at org.testng.SuiteRunner.run(SuiteRunner.java:286) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) at org.testng.TestNG.runSuites(TestNG.java:1069) at org.testng.TestNG.run(TestNG.java:1037) at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66) at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109) ``` ### Modifications - Make the `RangeCache#put` method to thread-safe. (cherry picked from commit b155d84)
### Motivation When `ReferenceCounted` object overrides the method `deallocate` to make the `getLength` value equal null will cause NPE because the `RangeCache#put` method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :) Pulsar implementation may throw an exception to make `OpAddEntry` fail abnormal and fence the topic. relative code as below: https://github.com/apache/pulsar/blob/defeec0e84a63ea865f3a2790bc61b66a02254c5/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L211-L217 **Exception screenshot:** ```java java.lang.NullPointerException: Cannot invoke "String.length()" because "value.s" is null at org.apache.bookkeeper.mledger.util.RangeCacheTest.lambda$testInParallel$6(RangeCacheTest.java:279) at org.apache.bookkeeper.mledger.util.RangeCache.put(RangeCache.java:77) at org.apache.bookkeeper.mledger.util.RangeCacheTest.testInParallel(RangeCacheTest.java:283) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:577) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at org.testng.TestRunner.privateRun(TestRunner.java:764) at org.testng.TestRunner.run(TestRunner.java:585) at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) at org.testng.SuiteRunner.run(SuiteRunner.java:286) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) at org.testng.TestNG.runSuites(TestNG.java:1069) at org.testng.TestNG.run(TestNG.java:1037) at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66) at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109) ``` ### Modifications - Make the `RangeCache#put` method to thread-safe. (cherry picked from commit b155d84)
### Motivation When `ReferenceCounted` object overrides the method `deallocate` to make the `getLength` value equal null will cause NPE because the `RangeCache#put` method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :) Pulsar implementation may throw an exception to make `OpAddEntry` fail abnormal and fence the topic. relative code as below: https://github.com/apache/pulsar/blob/defeec0e84a63ea865f3a2790bc61b66a02254c5/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L211-L217 **Exception screenshot:** ```java java.lang.NullPointerException: Cannot invoke "String.length()" because "value.s" is null at org.apache.bookkeeper.mledger.util.RangeCacheTest.lambda$testInParallel$6(RangeCacheTest.java:279) at org.apache.bookkeeper.mledger.util.RangeCache.put(RangeCache.java:77) at org.apache.bookkeeper.mledger.util.RangeCacheTest.testInParallel(RangeCacheTest.java:283) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:577) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at org.testng.TestRunner.privateRun(TestRunner.java:764) at org.testng.TestRunner.run(TestRunner.java:585) at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) at org.testng.SuiteRunner.run(SuiteRunner.java:286) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) at org.testng.TestNG.runSuites(TestNG.java:1069) at org.testng.TestNG.run(TestNG.java:1037) at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66) at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109) ``` ### Modifications - Make the `RangeCache#put` method to thread-safe. (cherry picked from commit b155d84)
### Motivation When `ReferenceCounted` object overrides the method `deallocate` to make the `getLength` value equal null will cause NPE because the `RangeCache#put` method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :) Pulsar implementation may throw an exception to make `OpAddEntry` fail abnormal and fence the topic. relative code as below: https://github.com/apache/pulsar/blob/defeec0e84a63ea865f3a2790bc61b66a02254c5/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L211-L217 **Exception screenshot:** ```java java.lang.NullPointerException: Cannot invoke "String.length()" because "value.s" is null at org.apache.bookkeeper.mledger.util.RangeCacheTest.lambda$testInParallel$6(RangeCacheTest.java:279) at org.apache.bookkeeper.mledger.util.RangeCache.put(RangeCache.java:77) at org.apache.bookkeeper.mledger.util.RangeCacheTest.testInParallel(RangeCacheTest.java:283) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:577) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at org.testng.TestRunner.privateRun(TestRunner.java:764) at org.testng.TestRunner.run(TestRunner.java:585) at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) at org.testng.SuiteRunner.run(SuiteRunner.java:286) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) at org.testng.TestNG.runSuites(TestNG.java:1069) at org.testng.TestNG.run(TestNG.java:1037) at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66) at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109) ``` ### Modifications - Make the `RangeCache#put` method to thread-safe. (cherry picked from commit b155d84) (cherry picked from commit 8973baf)
### Motivation When `ReferenceCounted` object overrides the method `deallocate` to make the `getLength` value equal null will cause NPE because the `RangeCache#put` method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :) Pulsar implementation may throw an exception to make `OpAddEntry` fail abnormal and fence the topic. relative code as below: https://github.com/apache/pulsar/blob/defeec0e84a63ea865f3a2790bc61b66a02254c5/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java#L211-L217 **Exception screenshot:** ```java java.lang.NullPointerException: Cannot invoke "String.length()" because "value.s" is null at org.apache.bookkeeper.mledger.util.RangeCacheTest.lambda$testInParallel$6(RangeCacheTest.java:279) at org.apache.bookkeeper.mledger.util.RangeCache.put(RangeCache.java:77) at org.apache.bookkeeper.mledger.util.RangeCacheTest.testInParallel(RangeCacheTest.java:283) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:577) at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:132) at org.testng.internal.TestInvoker.invokeMethod(TestInvoker.java:599) at org.testng.internal.TestInvoker.invokeTestMethod(TestInvoker.java:174) at org.testng.internal.MethodRunner.runInSequence(MethodRunner.java:46) at org.testng.internal.TestInvoker$MethodInvocationAgent.invoke(TestInvoker.java:822) at org.testng.internal.TestInvoker.invokeTestMethods(TestInvoker.java:147) at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:146) at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:128) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at org.testng.TestRunner.privateRun(TestRunner.java:764) at org.testng.TestRunner.run(TestRunner.java:585) at org.testng.SuiteRunner.runTest(SuiteRunner.java:384) at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:378) at org.testng.SuiteRunner.privateRun(SuiteRunner.java:337) at org.testng.SuiteRunner.run(SuiteRunner.java:286) at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:53) at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:96) at org.testng.TestNG.runSuitesSequentially(TestNG.java:1218) at org.testng.TestNG.runSuitesLocally(TestNG.java:1140) at org.testng.TestNG.runSuites(TestNG.java:1069) at org.testng.TestNG.run(TestNG.java:1037) at com.intellij.rt.testng.IDEARemoteTestNG.run(IDEARemoteTestNG.java:66) at com.intellij.rt.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:109) ``` ### Modifications - Make the `RangeCache#put` method to thread-safe. (cherry picked from commit b155d84) (cherry picked from commit 2e5c984)
MutableBoolean flag = new MutableBoolean(); | ||
entries.computeIfAbsent(key, (k) -> { | ||
size.addAndGet(weighter.getSize(value)); | ||
return true; | ||
} else { | ||
return false; | ||
} | ||
flag.setValue(true); | ||
return value; | ||
}); | ||
return flag.booleanValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattisonchao This doesn't seem to be a thread safe change here. computeIfAbsent
doesn't lock the key when ConcurrentSkipListMap is used.
You can see this in the source code of ConcurrentMap
:
default V computeIfAbsent(K key,
Function<? super K, ? extends V> mappingFunction) {
Objects.requireNonNull(mappingFunction);
V oldValue, newValue;
return ((oldValue = get(key)) == null
&& (newValue = mappingFunction.apply(key)) != null
&& (oldValue = putIfAbsent(key, newValue)) == null)
? newValue
: oldValue;
}
On the other hand, ConcurrentHashMap.computeIfAbsent does lock the key and atomically call the mappingFunction. More details in a tweet where some observations by @michaeljmarshall were shared: https://twitter.com/spyced/status/1709677859764154819 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattisonchao issue is #21301
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right.
Motivation
When
ReferenceCounted
object overrides the methoddeallocate
to make thegetLength
value equal null will cause NPE because theRangeCache#put
method is not thread-safe. (The process of describing this abstraction is not very clear, please refer to the code modification :)Pulsar implementation may throw an exception to make
OpAddEntry
fail abnormal and fence the topic. relative code as below:pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpAddEntry.java
Lines 211 to 217 in defeec0
Exception screenshot:
Modifications
RangeCache#put
method to thread-safe.Verifying this change
Documentation
no-need-doc