[POC] Internal Storage Sink #13236
src/compute/src/render/sinks.rs
@@ -36,7 +37,7 @@ where
     tokens: &mut std::collections::BTreeMap<GlobalId, Rc<dyn std::any::Any>>,
     import_ids: BTreeSet<GlobalId>,
     sink_id: GlobalId,
-    sink: &SinkDesc,
+    sink: &SinkDesc<CollectionMetadata>,
SinkDesc gets a "storage metadata type", akin to SourceInstanceDesc. The coordinator deals exclusively with SinkDesc<()>. The controller converts these into SinkDesc<CollectionMetadata> before sending them to the compute instances, by asking the storage controller for the collection information.
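The conversion described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `GlobalId`, `CollectionMetadata`, `SinkDesc`, and `StorageController` are simplified stand-ins with made-up fields, and `enrich_sink`, `register`, and `collection_metadata` are hypothetical names chosen for the example.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the real `GlobalId` (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct GlobalId(pub u64);

/// Simplified stand-in for the storage metadata of a collection (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CollectionMetadata {
    pub persist_shard: String,
}

/// A sink description that is generic over its storage metadata `S`.
/// The coordinator side would use `SinkDesc<()>`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SinkDesc<S = ()> {
    pub from: GlobalId,
    pub storage_metadata: S,
}

/// Hypothetical storage controller that knows the metadata of each collection.
pub struct StorageController {
    collections: BTreeMap<GlobalId, CollectionMetadata>,
}

impl StorageController {
    pub fn new() -> Self {
        StorageController { collections: BTreeMap::new() }
    }

    pub fn register(&mut self, id: GlobalId, metadata: CollectionMetadata) {
        self.collections.insert(id, metadata);
    }

    pub fn collection_metadata(&self, id: GlobalId) -> Option<&CollectionMetadata> {
        self.collections.get(&id)
    }
}

/// Turn a `SinkDesc<()>` into a `SinkDesc<CollectionMetadata>` by asking the
/// storage controller for the collection the sink writes to.
pub fn enrich_sink(
    storage: &StorageController,
    sink_id: GlobalId,
    sink: SinkDesc<()>,
) -> Result<SinkDesc<CollectionMetadata>, String> {
    let metadata = storage
        .collection_metadata(sink_id)
        .cloned()
        .ok_or_else(|| format!("unknown storage collection {:?}", sink_id))?;
    Ok(SinkDesc {
        from: sink.from,
        storage_metadata: metadata,
    })
}
```

Keeping the metadata as a type parameter means code that only ever sees `SinkDesc<()>` cannot accidentally depend on storage details, while the rendering code can require `SinkDesc<CollectionMetadata>` in its signature.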
src/coord/src/coord.rs
let ingestion = IngestionDescription {
    id,
    desc,
    since: Antichain::from_elem(0), // TODO
This is probably wrong too.
    .collect();
self.storage_mut()
    .update_write_frontiers(&storage_updates)
    .await?;
This tells the storage controller about new uppers for collections targeted by sinks. @aljoscha has proposed that we should instead make the compute controller query the persist shards for their uppers directly.
In your example, after |
I'm not sure! Currently storage sinks don't show in the If you think about it, storage sinks are similar to tables in that you can both write to and read from them. Tables also don't show up in the |
remove PersistSource
This is a proof-of-concept for MaterializeInc/database-issues#3692. It demonstrates sinking data to a storage collection and reading back from it again.
The poc re-uses the existing "CREATE SINK ... INTO PERSIST" syntax. A real implementation would not do this, but the user interface of storage sinks is still under discussion.
Usage
Create some data to sink:
Create a storage sink:
CREATE SINK mysink FROM mytable INTO PERSIST;
Read back sinked data:
Notes
The sink input consists of (Option<Row>, Option<Row>) key-value pairs, while storage's persist_source reads a collection of (SourceData, ()). Since both have to be compatible, the POC drops the key and unwraps the value in persist_sink, which is probably not the right thing to do.
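To make the mismatch concrete, here is a rough sketch of the "drop the key, unwrap the value" conversion. It is an illustration under assumptions, not the code in this PR: `Row` and `SourceData` are simplified stand-ins for the real types, and both function names are hypothetical.

```rust
/// Simplified stand-in for a row of data (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Row(pub Vec<i64>);

/// Simplified stand-in for the record type that persist_source reads (assumption).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SourceData(pub Row);

/// Mirrors the approach described in the note: the key is discarded and the
/// value is unwrapped, so a record without a value causes a panic.
pub fn drop_key_unwrap_value(kv: (Option<Row>, Option<Row>)) -> (SourceData, ()) {
    let (_key, value) = kv;
    (SourceData(value.expect("sink record without a value")), ())
}

/// A more defensive variant: the key is still discarded, but a missing value
/// is reported to the caller instead of panicking.
pub fn try_drop_key(kv: (Option<Row>, Option<Row>)) -> Option<(SourceData, ())> {
    let (_key, value) = kv;
    value.map(|row| (SourceData(row), ()))
}
```

Both variants lose the key, which is the part the note flags as probably not the right thing to do; the second only makes the missing-value case explicit.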