Background
We aim to add write support for Iceberg tables that needs neither a backend nor a server. We'd like to do that in a similar way to how we do it for Delta tables: make `table_format` `iceberg` recognized by the filesystem destination. From the user's point of view this means:
writing and reading iceberg tables without query engine as a separate backend
maintaining and evolving the schema without catalog as a separate backend
We want to use pyiceberg. This limits the write disposition to append and replace (until upsert is implemented). We also won't support vacuum, compact or z-order ops on the tables.
Tasks
we maintain a "technical" catalog: one SQLite file per table. we store those files together with the data
to write a table we lock the SQLite file with TransactionalFile, pull it locally, use it with pyiceberg and then write it back.
use pyiceberg to append, replace tables, create partitions, do schema evolution etc.
support all buckets via fsspec
like for delta, expose pyiceberg for a given table: read-only (catalog without lock) and r/w with a lock on the catalog (maybe via a context manager). this will allow people, e.g., to delete or rebuild partitions on a table.
support the filesystem `sql_client` to create views on Iceberg tables via duckdb
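The lock, pull, modify, write-back cycle from the tasks above could be sketched roughly as follows. This is only an illustration using the standard library: `locked_catalog` and `record_table` are hypothetical names, an exclusive lock file stands in for TransactionalFile, `shutil.copy` stands in for fsspec transfers, and the two-column `tables` schema is an assumption rather than the schema pyiceberg actually uses.

```python
import os
import shutil
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_catalog(remote_path: str):
    """Yield a connection to a local copy of the catalog stored at remote_path.

    Hypothetical helper: the lock file stands in for TransactionalFile and
    shutil.copy stands in for pulling/pushing the file through fsspec.
    """
    lock_path = remote_path + ".lock"
    # O_EXCL makes creation fail if another writer already holds the lock.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            local_path = os.path.join(tmp, "catalog.db")
            if os.path.exists(remote_path):
                shutil.copy(remote_path, local_path)  # pull the catalog locally
            conn = sqlite3.connect(local_path)
            try:
                yield conn  # the caller works with the local copy here
                conn.commit()
            finally:
                conn.close()
            # Reached only if the caller's block succeeded: write the file back.
            shutil.copy(local_path, remote_path)
    finally:
        os.remove(lock_path)  # release the lock in every case


def record_table(remote_path: str, table_name: str, metadata_location: str) -> None:
    """Point table_name at a new metadata file inside the locked catalog."""
    with locked_catalog(remote_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS tables "
            "(name TEXT PRIMARY KEY, metadata_location TEXT NOT NULL)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO tables VALUES (?, ?)",
            (table_name, metadata_location),
        )
```

If the caller's block raises, the local copy is discarded and the stored catalog stays untouched, which is the property the lock is meant to protect. A read-only variant could skip the lock and the write-back entirely.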
Perhaps we can use an in-memory SQLite database instead of persisting the file to disk.
If I understand correctly, at its core the catalog only maps a table name to table metadata (which lives on the filesystem), so we can populate the in-memory SQLite database with this mapping based on dlt metadata.
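A minimal sketch of that idea, assuming the table-to-metadata mapping can be derived from dlt metadata as a plain dict. The function names and the two-column `catalog` schema are hypothetical and not pyiceberg's own catalog schema; the paths in the usage example are made up.

```python
import sqlite3
from typing import Dict, Optional


def build_in_memory_catalog(table_metadata: Dict[str, str]) -> sqlite3.Connection:
    """Create a throwaway catalog mapping table name -> metadata file location."""
    conn = sqlite3.connect(":memory:")  # nothing is persisted to disk
    conn.execute(
        "CREATE TABLE catalog "
        "(table_name TEXT PRIMARY KEY, metadata_location TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO catalog VALUES (?, ?)", table_metadata.items())
    conn.commit()
    return conn


def metadata_location(conn: sqlite3.Connection, table_name: str) -> Optional[str]:
    """Return the metadata location for table_name, or None if it is unknown."""
    row = conn.execute(
        "SELECT metadata_location FROM catalog WHERE table_name = ?",
        (table_name,),
    ).fetchone()
    return row[0] if row else None


# Usage with made-up paths standing in for what dlt metadata would provide.
catalog = build_in_memory_catalog(
    {"events": "bucket/events/metadata/v2.metadata.json"}
)
print(metadata_location(catalog, "events"))
```

Because the catalog is rebuilt on every run, there is no file to lock or write back; whether that is safe with concurrent writers is a separate question this sketch does not address.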