-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-4583: [Plasma] Fix some small bugs reported by code scan tool #3656
Conversation
@guoyuhong Out of curiosity which "code scan tool" did you use? |
Coverity just started offering free scans to open source projects again. @pitrou can you review? |
cpp/src/arrow/util/logging.h
Outdated
@@ -55,7 +55,12 @@ enum class ArrowLogLevel : int { | |||
}; | |||
|
|||
#define ARROW_LOG_INTERNAL(level) ::arrow::util::ArrowLog(__FILE__, __LINE__, level) | |||
#define ARROW_LOG(level) ARROW_LOG_INTERNAL(::arrow::util::ArrowLogLevel::ARROW_##level) | |||
// clang-format off | |||
#define ARROW_LOG(level) if ( \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are lint tool conflicts here, so I disable clang-format
.
@pcmoritz Please take a look at this PR. |
Ping... @pcmoritz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this @guoyuhong . Here are some comments.
cpp/src/arrow/python/deserialize.cc
Outdated
int ret = PyDict_SetItem(result.obj(), PyList_GET_ITEM(keys.obj(), i - start_idx), | ||
PyList_GET_ITEM(vals.obj(), i - start_idx)); | ||
if (ret != 0) { | ||
return ConvertPyError(StatusCode::PythonError); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry for not being specific here. It's better to not override the second parameter, so that the proper status code is returned (for example a Python MemoryError
would be turned into a OutOfMemory
Status, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(attempting to fix it directly from Github...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I checked out common.h
. Yes, ConvertPyError()
is better.
Rebased, will wait for CI. |
Hmm, I really don't understand what the merge script is saying me here:
@wesm did something weird happen on git master? |
@pitrou master had a hiccup about 5 hours ago at 19:30 UTC; I accidentally created a merge commit in master (using the GitHub UI -- I have since asked INFRA to disallow merge commits through the UI) and quickly rebased and force-pushed master. It's possible that this PR was impacted. I have also seen the merge script misbehave at times like this when the GitHub API returns bad results now and then |
@wesm I've rebased it since that. |
Finally merged. Thanks @guoyuhong ! |
@pitrou Thanks! |
We used a static code scan tool to scan arrow code. There are several possible bugs: 1. The return value of `PyDict_SetItem` is not used. 2. Currently, `EventLoop:: Shutdown` should be called explicitly, which is error-prone and causing leak when the user forgets to call it. 3. There is an unclosed file descriptor in `io.cc` when path name is too long. Besides, we also made the following small changes: 1. When we use Plasma in Yarn and when a node uses too much memory, a SIGTERM signal will be sent to Plasma. Current plasma will exit silently. We also some log to plasma store to help us to debug. 2. `ARROW_LOG` will always evaluate the output expression even when it is not enabled, which is not efficient. 3. The constructor of Java class `ObjectStoreData` is private, which is not convenient when we want to create a mock plasma store. 4. Fix a call to `ObjectStoreData` which misplaces `meta` and `data` according to https://github.com/apache/arrow/blob/master/java/plasma/src/main/java/org/apache/arrow/plasma/ObjectStoreLink.java#L32 . Author: Yuhong Guo <[email protected]> Author: Antoine Pitrou <[email protected]> Closes apache#3656 from guoyuhong/fixPlasma and squashes the following commits: 634e36a <Antoine Pitrou> Use default argument value to `ConvertPyError` b547f2f <Yuhong Guo> remove if from ARROW_LOG 440f097 <Yuhong Guo> Address comment d3eb22f <Yuhong Guo> Lint 79b4af3 <Yuhong Guo> Fix and Lint 434a039 <Yuhong Guo> Make constructor of ObjectStoreData public b2ddba6 <Yuhong Guo> Fix misplace of meta and data in PlasmaClient.java a667402 <Yuhong Guo> Do not evaluate logging strings when logging is not enabled. 5be7b89 <Yuhong Guo> Fix unclosed fd reported by code scan tool 3a917d4 <Yuhong Guo> Fix not used return value in deserialize.cc reported by code scan tool ed56a48 <Yuhong Guo> Fix possible unclosed EventLoop reported by code scanning tool 3fed926 <Yuhong Guo> Add plasma log
We used a static code scan tool to scan arrow code. There are several possible bugs:
PyDict_SetItem
is not used.EventLoop:: Shutdown
should be called explicitly, which is error-prone and causing leak when the user forgets to call it.io.cc
when path name is too long.Besides, we also made the following small changes:
ARROW_LOG
will always evaluate the output expression even when it is not enabled, which is not efficient.ObjectStoreData
is private, which is not convenient when we want to create a mock plasma store.ObjectStoreData
which misplacesmeta
anddata
according to https://github.com/apache/arrow/blob/master/java/plasma/src/main/java/org/apache/arrow/plasma/ObjectStoreLink.java#L32 .