improve error messages while downcasting UInt32Array, UInt64Array and BooleanArray #4261
// Downcast ArrayRef to UInt32Array
pub fn as_uint32_array(array: &dyn Array) -> Result<&UInt32Array, DataFusionError> {
    array.as_any().downcast_ref::<UInt32Array>().ok_or_else(|| {
        DataFusionError::Internal(format!(
            "Expected a UInt32Array, got: {}",
            array.data_type()
        ))
    })
}

// Downcast ArrayRef to UInt64Array
pub fn as_uint64_array(array: &dyn Array) -> Result<&UInt64Array, DataFusionError> {
    array.as_any().downcast_ref::<UInt64Array>().ok_or_else(|| {
        DataFusionError::Internal(format!(
            "Expected a UInt64Array, got: {}",
            array.data_type()
        ))
    })
}

// Downcast ArrayRef to BooleanArray
pub fn as_boolean_array(array: &dyn Array) -> Result<&BooleanArray, DataFusionError> {
    array
        .as_any()
        .downcast_ref::<BooleanArray>()
        .ok_or_else(|| {
            DataFusionError::Internal(format!(
                "Expected a BooleanArray, got: {}",
                array.data_type()
            ))
        })
}
We have a macro that you can use instead of writing these methods:
let array = downcast_value!(values, Int32Array)
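For readers who have not seen the macro, here is a minimal sketch of how a `downcast_value!`-style macro can work. It is not DataFusion's actual definition: the `Array` trait, the array structs, the `String` error type, and the macro name `downcast_value_sketch` are all stand-ins made up for this example. It also assumes the macro returns early with `?` when the downcast fails, which is what the one-line usage above suggests.

```rust
use std::any::Any;

// Toy stand-in for Arrow's `Array` trait (an assumption, not the real API).
trait Array {
    fn as_any(&self) -> &dyn Any;
}

// Toy array types; the real ones live in the arrow crate.
struct Int32Array(Vec<i32>);
struct BooleanArray(Vec<bool>);

impl Array for Int32Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Array for BooleanArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Hypothetical macro: downcast `$value` to `$ty`, or return early with an error.
macro_rules! downcast_value_sketch {
    ($value:expr, $ty:ident) => {{
        $value
            .as_any()
            .downcast_ref::<$ty>()
            .ok_or_else(|| format!("could not cast value to {}", stringify!($ty)))?
    }};
}

// Because the macro expands to `?`, the calling function must return a `Result`.
fn sum_int32(values: &dyn Array) -> Result<i32, String> {
    let array = downcast_value_sketch!(values, Int32Array);
    Ok(array.0.iter().sum())
}
```

Calling `sum_int32` with an `Int32Array` yields the sum, while passing a `BooleanArray` yields the error string instead of a panic. The trade-off is that the macro can only be used inside functions whose error type accepts the produced error.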
Maybe we should revert all the PRs from issue #3152?
cc @alamb @andygrove
Well, that is embarrassing 🤦
Interestingly, downcast_value doesn't appear to be used in that many places (fewer than 10 modules at this time):
https://github.com/search?q=repo%3Aapache%2Farrow-datafusion+downcast_value&type=code
I would prefer not to roll back the PRs, as they have already simplified the code non-trivially.
What is important, in my opinion, is to use a standard pattern for this downcasting. I don't have a strong preference between downcast_value and as_boolean_array, though as_boolean_array might be more discoverable in an IDE that autocompletes.
If we are worried about code duplication, perhaps we can use downcast_value to implement the as_boolean_array-style methods.
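One way to read that last suggestion is sketched below: the typed functions stay as the discoverable entry points, and each one delegates to a single macro so the downcast logic is written once. Everything here is a simplified assumption for illustration: the `Array` trait, the array structs, the `String` error type, and the `downcast_value_sketch` macro are hypothetical stand-ins, not the DataFusion or Arrow definitions.

```rust
use std::any::Any;

// Toy stand-in for Arrow's `Array` trait (an assumption, not the real API).
trait Array {
    fn as_any(&self) -> &dyn Any;
}

struct BooleanArray(Vec<bool>);
struct UInt32Array(Vec<u32>);

impl Array for BooleanArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Array for UInt32Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Hypothetical macro holding the shared downcast-or-return-error logic.
macro_rules! downcast_value_sketch {
    ($value:expr, $ty:ident) => {{
        $value
            .as_any()
            .downcast_ref::<$ty>()
            .ok_or_else(|| format!("could not cast value to {}", stringify!($ty)))?
    }};
}

// Typed wrappers: easy to find via autocomplete, one line of body each.
fn as_boolean_array(array: &dyn Array) -> Result<&BooleanArray, String> {
    Ok(downcast_value_sketch!(array, BooleanArray))
}

fn as_uint32_array(array: &dyn Array) -> Result<&UInt32Array, String> {
    Ok(downcast_value_sketch!(array, UInt32Array))
}
```

With this shape, a change to the error wording or error type would be made in the macro only, and every `as_*_array` function would pick it up.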
I think this is a good improvement that we can merge and then work on unifying with the downcast macro later. If people feel strongly about the downcast macro I can do that too
Thank you @retikulum! @andygrove and @liukun4515, please share your thoughts.
Thanks for the feedback. I just didn't know about this macro. I can refactor this PR once the other reviewers have shared their thoughts. I can also refactor the previous PRs over time, in larger PRs, if that is okay.
Me neither 😅 which is embarrassing when I look at the GitHub blame history 🤦
Unless there are any other comments, I plan to merge this PR tomorrow, and we can continue cleaning up the code afterwards.
Benchmark runs are scheduled for baseline = 880e6fc and contender = 712b9fd. 712b9fd is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
It's a good idea to implement
Will we keep using
What is the final decision? @alamb @liukun4515
I think that is fine (or maybe we can remove duplication by creating a macro that creates the entire function). It would be good to eventually remove all uses of
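The idea of a macro that creates the entire function could look roughly like the sketch below. It is only an illustration under stated assumptions: `define_downcast_fn` is a hypothetical macro name, and the `Array` trait, its `data_type` method returning a string, the array structs, and the `String` error type are toy stand-ins for the Arrow and DataFusion types. The error text mirrors the "Expected a ..., got: ..." format used in this PR.

```rust
use std::any::Any;

// Toy stand-in for Arrow's `Array` trait (an assumption, not the real API).
trait Array {
    fn as_any(&self) -> &dyn Any;
    fn data_type(&self) -> &'static str;
}

struct UInt64Array(Vec<u64>);
struct BooleanArray(Vec<bool>);

impl Array for UInt64Array {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn data_type(&self) -> &'static str {
        "UInt64"
    }
}

impl Array for BooleanArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn data_type(&self) -> &'static str {
        "Boolean"
    }
}

// Hypothetical macro: each invocation expands to one complete downcast function.
macro_rules! define_downcast_fn {
    ($fn_name:ident, $ty:ident) => {
        fn $fn_name(array: &dyn Array) -> Result<&$ty, String> {
            array.as_any().downcast_ref::<$ty>().ok_or_else(|| {
                format!("Expected a {}, got: {}", stringify!($ty), array.data_type())
            })
        }
    };
}

// One line per supported array type.
define_downcast_fn!(as_uint64_array, UInt64Array);
define_downcast_fn!(as_boolean_array, BooleanArray);
```

Compared with hand-written functions, this keeps the named `as_*_array` entry points but removes the copy-pasted bodies; the cost is that generated functions can be harder to find with a plain text search.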
Which issue does this PR close?
Part of #3152.
Rationale for this change
This is the new PR for improving downcasting to UInt32Array, UInt64Array, and BooleanArray. However, I couldn't refactor the following part because the error needs to be converted into an ArrowError:
https://github.com/apache/arrow-datafusion/blob/822022db88d4f70c2f02ac4d1828fc9413f3e252/datafusion/core/src/physical_plan/filter.rs#L212-L227
If you have any suggestions, I will be happy to implement them.
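As one possible direction for the error-type mismatch described above, the sketch below shows the general Rust pattern of bridging two error types with a `From` implementation so that `?` performs the conversion. The enums here are toy types defined only for this example; they borrow the names `DataFusionError` and `ArrowError` but are not the real definitions, and `expect_boolean` and `filter_step` are hypothetical helpers. Whether this fits the linked code is an assumption that would need checking against the actual crates.

```rust
// Toy error types standing in for the real ones (assumptions for this sketch).
#[derive(Debug, PartialEq)]
enum DataFusionError {
    Internal(String),
}

#[derive(Debug, PartialEq)]
enum ArrowError {
    ExternalError(String),
}

// With this conversion in place, `?` can turn one error into the other.
impl From<DataFusionError> for ArrowError {
    fn from(err: DataFusionError) -> Self {
        match err {
            DataFusionError::Internal(msg) => ArrowError::ExternalError(msg),
        }
    }
}

// Stand-in for a downcast helper that reports failures as `DataFusionError`.
fn expect_boolean(data_type: &str) -> Result<(), DataFusionError> {
    if data_type == "Boolean" {
        Ok(())
    } else {
        Err(DataFusionError::Internal(format!(
            "Expected a BooleanArray, got: {}",
            data_type
        )))
    }
}

// Stand-in for code that has to return `ArrowError`.
fn filter_step(data_type: &str) -> Result<(), ArrowError> {
    expect_boolean(data_type)?; // converted through `From<DataFusionError>`
    Ok(())
}
```

An explicit `.map_err(ArrowError::from)` would do the same conversion where `?` cannot be used directly.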
What changes are included in this PR?
as_uint32_array, as_uint64_array, and as_boolean_array are created in datafusion/common/src/cast.rs.
Are there any user-facing changes?
I am not sure about it.