
NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented #1013

Closed
djouallah opened this issue Aug 7, 2024 · 5 comments · Fixed by #1016

Comments

@djouallah

Apache Iceberg version

0.7.0 (latest release)

Please describe the bug 🐞

It was working fine, and today I got this error, using Tabular as a catalog:

/usr/local/lib/python3.10/dist-packages/pydantic/main.py:415: UserWarning: Pydantic serializer warnings:
  Expected `TableIdentifier` but got `dict` - serialized value may not be as expected
  return self.__pydantic_serializer__.to_json(
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<timed exec> in <module>

/usr/local/lib/python3.10/dist-packages/pyiceberg/table/__init__.py in append(self, df, snapshot_properties)
   1572         """
   1573         with self.transaction() as tx:
-> 1574             tx.append(df=df, snapshot_properties=snapshot_properties)
   1575 
   1576     def overwrite(

... 3 frames ...
/usr/local/lib/python3.10/dist-packages/pyiceberg/io/pyarrow.py in _get_parquet_writer_kwargs(table_properties)
   2288     ]:
   2289         if unsupported_keys := fnmatch.filter(table_properties, key_pattern):
-> 2290             raise NotImplementedError(f"Parquet writer option(s) {unsupported_keys} not implemented")
   2291 
   2292     compression_codec = table_properties.get(TableProperties.PARQUET_COMPRESSION, TableProperties.PARQUET_COMPRESSION_DEFAULT)

NotImplementedError: Parquet writer option(s) ['write.parquet.row-group-size-bytes'] not implemented
@Fokko
Contributor

Fokko commented Aug 7, 2024

@djouallah Thanks for raising this. For context, there was a bug where it would pass down write.parquet.row-group-size-bytes, but the writer actually only allows setting the number of records in a row group. Let me dig into this.
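To illustrate the distinction (not part of the original comment): the PyArrow Parquet writer sizes row groups by record count, not by bytes, which is why only the record-count setting can be forwarded. A minimal sketch with placeholder data and values:

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Toy table purely for illustration.
table = pa.table({"id": list(range(1_000_000))})

# pyarrow's row_group_size is a maximum *record count* per row group;
# there is no bytes-based knob here, so write.parquet.row-group-size-bytes
# has nothing to map onto.
pq.write_table(table, "/tmp/example.parquet", row_group_size=100_000)
```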

@Fokko
Contributor

Fokko commented Aug 7, 2024

Sorry for the inconvenience here. I've created a fix that we'll backport to the 0.7.1 branch.

@djouallah
Author

Same error with 0.7.1 rc1?

@sungwy
Collaborator

sungwy commented Aug 10, 2024

Hi @djouallah - could you try using the property write.parquet.row-group-limit instead? Unfortunately write.parquet.row-group-size-bytes isn't a supported property in PyIceberg:

```python
for key_pattern in [
    TableProperties.PARQUET_ROW_GROUP_SIZE_BYTES,
    TableProperties.PARQUET_BLOOM_FILTER_MAX_BYTES,
    f"{TableProperties.PARQUET_BLOOM_FILTER_COLUMN_ENABLED_PREFIX}.*",
]:
    if unsupported_keys := fnmatch.filter(table_properties, key_pattern):
        raise NotImplementedError(f"Parquet writer option(s) {unsupported_keys} not implemented")
```

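If the property ended up on the table (for example via a catalog default), a rough sketch of swapping it for the supported record-count limit could look like the following. This assumes the Transaction.set_properties / remove_properties API available in PyIceberg 0.7; the catalog name, table name, and limit value are placeholders.

```python
from pyiceberg.catalog import load_catalog

# Placeholder catalog and table names, for illustration only.
catalog = load_catalog("default")
tbl = catalog.load_table("db.events")

with tbl.transaction() as tx:
    # Drop the unsupported bytes-based option ...
    tx.remove_properties("write.parquet.row-group-size-bytes")
    # ... and set the supported record-count limit instead (example value).
    tx.set_properties(**{"write.parquet.row-group-limit": "1048576"})
```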
@djouallah
Author

Ah, I see, thank you. For some reason it was the catalog that added all those properties. All good.
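(Reusing the hypothetical tbl from the sketch above, catalog-supplied settings like these can be confirmed by inspecting the table's properties dict:)

```python
# Illustrative only: shows which properties the catalog attached to the table.
print(tbl.properties)
```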
