-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-38770: [C++][Python] RecordBatch.filter() segfaults if passed a ChunkedArray #40971
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Hmm... if the crash occurs due to a defect in the C++ code, perhaps this can be fixed on the C++ side? (or at least the C++ side could return a proper error) |
Ah yes, you are correct. That is also what Joris ment in the issue thread. Will correct. |
@pitrou I have moved the check to C++ but changed the logic to raise an error in case of a chunked mask. This way the bug will be resolved quicker as I would need to figure out some things in order to use |
cpp/src/arrow/compute/kernels/vector_selection_filter_internal.cc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit d83af8f. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 24 possible false positives for unstable benchmarks that are known to sometimes produce them. |
…d a ChunkedArray (apache#40971) ### Rationale for this change Filtering a record batch with a boolean mask in the form of a `ChunkedArray` results in a segmentation fault. ### What changes are included in this PR? In case chunked array is passed as a mask to filter record batch, the code path for `pa.Table.filter()` is taken resulting in a filtered table. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: apache#38770 Authored-by: AlenkaF <[email protected]> Signed-off-by: AlenkaF <[email protected]>
Rationale for this change
Filtering a record batch with a boolean mask in the form of a
ChunkedArray
results in a segmentation fault.What changes are included in this PR?
In case chunked array is passed as a mask to filter record batch, the code path for
pa.Table.filter()
is taken resulting in a filtered table.Are these changes tested?
Yes.
Are there any user-facing changes?
No.
RecordBatch.filter()
segfaults if passed aChunkedArray
#38770