Replies: 2 comments 1 reply
-
Hi @tdcmeehan, in short I would say yes, that is the longer-term goal. In the shorter term, we can also consider developing a long-running process that listens for updates to the metadata folders and calls the sync when these updates are detected. Are there any spots where you are considering integrating OneTable right now?
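The long-running listener described above could be sketched roughly as follows. This is a minimal polling sketch, not OneTable's actual implementation: `run_sync` is a hypothetical callable standing in for whatever entry point invokes the sync library, and change detection here is just file-mtime polling.

```python
import os
import time

def latest_mtime(path):
    """Most recent modification time of any file under path (0.0 if none)."""
    latest = 0.0
    for root, _dirs, files in os.walk(path):
        for name in files:
            latest = max(latest, os.path.getmtime(os.path.join(root, name)))
    return latest

def check_and_sync(metadata_dirs, seen, run_sync):
    """Sync every table whose metadata folder changed since the last pass.

    `run_sync` is a hypothetical placeholder for OneTable's sync call;
    `seen` maps each folder to the last mtime we acted on.
    Returns the list of folders synced on this pass.
    """
    synced = []
    for d in metadata_dirs:
        current = latest_mtime(d)
        if current > seen.get(d, 0.0):
            run_sync(d)
            seen[d] = current
            synced.append(d)
    return synced

def watch_and_sync(metadata_dirs, run_sync, poll_seconds=30):
    """Long-running loop: poll the metadata folders, sync on change."""
    seen = {}
    while True:
        check_and_sync(metadata_dirs, seen, run_sync)
        time.sleep(poll_seconds)
```

In practice a production version would likely use cloud-storage event notifications rather than polling, but the shape (detect metadata change, then invoke sync) would be the same.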
-
Thanks, makes sense. I am trying to understand this from the perspective of a multitenant data lake. Suppose we have a multitenant data lake with streaming ingest via Flink from logs, Spark for batch ingest, and some Presto for cheaper small-batch ingest. It sounds like the long-term idea in this scenario is that each of these engines would independently attempt to sync once commits are complete.
-
I heard at the OneTable - Introduction and Demo event that sync is designed as a library, and that there is an initial integration with Hudi's Delta Streamer.
Is the idea that anywhere there is ingestion in general, ideally there would be a hook to use OneTable to sync the metadata immediately after commit? So for example, integration into various Presto and Trino table format connectors, Spark, etc?
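The hook shape being asked about (sync immediately after a commit completes) could look something like this sketch. Both callables are hypothetical placeholders, not real OneTable or engine APIs: `write_commit` stands in for whichever engine's commit step (Flink, Spark, Presto), and `run_sync` for invoking the sync library on that table.

```python
def sync_after_commit(write_commit, run_sync, table_path):
    """Run the engine's native commit, then immediately sync the metadata.

    write_commit and run_sync are hypothetical placeholders: the former is
    the engine's existing commit step, the latter invokes the OneTable sync
    library so the fresh commit is translated into the target formats.
    """
    result = write_commit(table_path)  # engine-native commit completes first
    run_sync(table_path)               # then sync metadata right after commit
    return result
```

The key ordering property is that sync runs only after the commit succeeds, so readers of the other table formats never see metadata ahead of the data.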