forked from pola-rs/polars
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
fix: correctly convert nullability of nested parquet fields to arrow
The `parquet_to_arrow_schema` function tries to set the `nullable` flag of `Field` according to the parquet repetition levels. That flag is then used via the `InitNested` `enum` to calculate the level at which data is valid.

In the following parquet message type, neither the list nor its elements should be marked as `nullable`: the definition level for the `REPEATED` group only indicates an empty list but not any nulls.

```
message eventlog {
  REQUIRED group events (LIST) {
    REPEATED group array {
      REQUIRED BYTE_ARRAY event_name (STRING);
      REQUIRED INT64 event_time (TIMESTAMP(MILLIS,true));
    }
  }
}
```

The nullability of fields was asserted in multiple tests, but those assertions did not match the comments.

This is a port of jorgecarleitao/arrow2#1565 to the nano-arrow crate.
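To make the level arithmetic concrete, here is a minimal sketch of how nullability and maximum levels follow from the repetition of each node on a root-to-leaf path. The `Repetition` enum and the `is_nullable` and `max_levels` helpers are hypothetical names made up for this illustration. They are not the crate's types and not the code changed by this commit. The sketch assumes only the standard Parquet rule that `OPTIONAL` and `REPEATED` nodes each add one definition level and `REPEATED` nodes add one repetition level.

```rust
/// Repetition of a Parquet schema node.
/// Hypothetical stand-in type for illustration, not the crate's own enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Repetition {
    Required,
    Optional,
    Repeated,
}

/// Only an OPTIONAL node can hold nulls. A REPEATED node can be empty,
/// but emptiness is not the same as null.
pub fn is_nullable(repetition: Repetition) -> bool {
    matches!(repetition, Repetition::Optional)
}

/// Returns (max definition level, max repetition level) for a path of
/// nodes listed from the root field down to the leaf.
pub fn max_levels(path: &[Repetition]) -> (u16, u16) {
    path.iter().fold((0, 0), |(def, rep), r| match r {
        // REQUIRED nodes add no levels.
        Repetition::Required => (def, rep),
        // OPTIONAL nodes add one definition level (null vs. present).
        Repetition::Optional => (def + 1, rep),
        // REPEATED nodes add one definition level (empty vs. non-empty)
        // and one repetition level.
        Repetition::Repeated => (def + 1, rep + 1),
    })
}
```

For the `event_name` leaf in the message type above, the path is `[Required, Repeated, Required]`. Under these assumptions `max_levels` yields `(1, 1)`: definition level 0 means an empty `events` list, level 1 means a value is present, and no level is left over to encode a null.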
1 parent 05d9eb2, commit 980dcf3
Showing 1 changed file with 57 additions and 15 deletions.