Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't return dictionary encoded booleans #42

Merged
merged 8 commits into from
Sep 18, 2024

Conversation

samuelcolvin
Copy link
Collaborator

@samuelcolvin samuelcolvin commented Sep 7, 2024

Required by apache/datafusion#12380 and also apache/datafusion#12511.

But it's probably more efficient to operate on an array of bools than a dictionary anyway.

@samuelcolvin samuelcolvin changed the title allow json_contains in filter don't return dictionary encoded values. Sep 7, 2024
@adriangb
Copy link
Collaborator

adriangb commented Sep 8, 2024

I feel like what we need is something like:

use arrow_array::*;
use arrow_cast::cast;
use arrow_schema::DataType;
use arrow::error::ArrowError;

/// Convert a dict array to a non-dict array.
fn unpack_dict_array(array: ArrayRef) -> Result<ArrayRef, ArrowError> {
    match array.data_type() {
        DataType::Dictionary(_, value_type) => {
            cast(array.as_ref(), value_type)
        }
        _ => Ok(array)
    }
}

@samuelcolvin
Copy link
Collaborator Author

superceded (I think) by #43

@samuelcolvin samuelcolvin reopened this Sep 17, 2024
) -> DataFusionResult<ArrayRef> {
if return_dict {
if is_json_union(values.data_type()) {
// JSON union: post-process the array to set keys to null where the union member is null
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably move this logic into json_get since it only applies there.

cc @davidhewitt?

Copy link
Collaborator

@adriangb adriangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me!

@samuelcolvin samuelcolvin changed the title don't return dictionary encoded values. Don't return dictionary encoded booleans Sep 18, 2024
@samuelcolvin samuelcolvin merged commit b1ca610 into main Sep 18, 2024
7 checks passed
@samuelcolvin samuelcolvin deleted the allow-json-contains-filter branch September 18, 2024 05:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants