Calling load_table().scan().to_arrow() emits error on empty "names" field in name mapping #925

spock-abadai · 2024-07-13T19:12:04Z

Apache Iceberg version

main (development)

Please describe the bug 🐞

On one of my Iceberg tables, loading the table and scanning it fails while parsing the name mapping in the table properties. pydantic raises the following ValidationError:

    def parse_mapping_from_json(mapping: str) -> NameMapping:
>       return NameMapping.model_validate_json(mapping)
E       pydantic_core._pydantic_core.ValidationError: 1 validation error for NameMapping
E       9.names
E         Value error, At least one mapped name must be provided for the field [type=value_error, input_value=[], input_type=list]
E           For further information visit https://errors.pydantic.dev/2.8/v/value_error

This seems to come from the check_at_least_one method in table/name_mapping.py, which (if I understand correctly) requires every field in the name mapping to have at least one name. However, if I'm reading the Iceberg spec correctly, it states that names is a required list of 0 or more names for a field, so an empty list should be valid.

I'm not 100% sure what scenario led to this, but I can say that our name mapping does have a field with id 10 whose list of names is empty. This field existed in the schema at one point but seems to have been removed. In any case, requiring the list of names to contain at least one value doesn't seem to be in line with the spec (and cases where the list is empty do occur in practice).
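To make the difference concrete, here is a minimal sketch of the two behaviours. It is not PyIceberg's code: MappedFieldSketch, parse_mapping, the require_name flag and the sample mapping are all made up for illustration, and plain dataclasses stand in for the pydantic models so the snippet runs without extra dependencies.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MappedFieldSketch:
    """Hypothetical stand-in for one name-mapping entry (not PyIceberg's class)."""
    names: List[str]
    field_id: Optional[int] = None
    fields: List["MappedFieldSketch"] = field(default_factory=list)


def parse_mapping(mapping_json: str, require_name: bool) -> List[MappedFieldSketch]:
    """Parse a name-mapping JSON string into sketch objects.

    With require_name=True an empty "names" list is rejected, similar to the
    error reported above. With require_name=False an empty list is accepted.
    """
    def build(raw: dict) -> MappedFieldSketch:
        names = raw["names"]  # the "names" key itself must still be present
        if require_name and not names:
            raise ValueError("At least one mapped name must be provided for the field")
        children = [build(child) for child in raw.get("fields", [])]
        return MappedFieldSketch(names=names, field_id=raw.get("field-id"), fields=children)

    return [build(entry) for entry in json.loads(mapping_json)]


# Made-up mapping: field 2 was dropped from the schema and has no names left.
SAMPLE = '[{"field-id": 1, "names": ["id"]}, {"field-id": 2, "names": []}]'

lenient = parse_mapping(SAMPLE, require_name=False)
print([f.names for f in lenient])

try:
    parse_mapping(SAMPLE, require_name=True)
except ValueError as exc:
    print("strict parse failed:", exc)
```

The lenient path keeps the entry with its empty list, which is what I would expect a reader to do given the spec wording.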

Note that this Iceberg table was never created, written to, or modified using PyIceberg (only using Spark and Trino). PyIceberg is only used to read it.
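For anyone who wants to check whether their own table is affected, the sketch below walks a name-mapping JSON document and reports entries whose names list is empty or missing. The helper find_empty_names and the sample data are hypothetical. On a real table the JSON string would, as far as I know, come from the schema.name-mapping.default table property.

```python
import json
from typing import Any, Dict, Iterator, List, Tuple


def find_empty_names(entries: List[Dict[str, Any]], path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (location, field-id) for every entry whose "names" is empty or missing."""
    for index, entry in enumerate(entries):
        location = f"{path}{index}"
        if not entry.get("names"):
            yield location, entry.get("field-id")
        # Recurse into nested struct/map/list mappings, if present.
        yield from find_empty_names(entry.get("fields", []), f"{location}.fields.")


# Made-up property value, used here instead of a real table property.
raw_property = json.dumps([
    {"field-id": 1, "names": ["id"]},
    {"field-id": 2, "names": ["payload"], "fields": [
        {"field-id": 3, "names": []},
    ]},
    {"field-id": 4, "names": []},
])

empty = list(find_empty_names(json.loads(raw_property)))
print(empty)
```

Each location is the position of the entry in the mapping, using the same index-based style as the 9.names location in the error above.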

Fokko · 2024-07-13T20:34:45Z

@spock-abadai Thanks for reporting this. I agree that this seems to be incorrect. Are you interested in providing a PR?

spock-abadai · 2024-07-13T21:37:21Z

> @spock-abadai Thanks for reporting this. I agree that this seems to be incorrect. Are you interested in providing a PR?

Sure, see #927 for review.

Fokko added this to the PyIceberg 0.7.0 release milestone Jul 13, 2024

spock-abadai mentioned this issue Jul 13, 2024

Allow empty names in mapped field of Name Mapping #927

Merged

Fokko closed this as completed in #927 Jul 14, 2024
