adapter: support for dynamic unions #13743

Open
benesch opened this issue Jul 19, 2022 · 2 comments
Labels: A-ADAPTER (Topics related to the ADAPTER layer), C-feature (Category: new feature or request)

Comments

@benesch
Member

benesch commented Jul 19, 2022

Feature request

Suppose you have several sources or views in Materialize with the same schema. Perhaps you have a source per customer, for example:

CREATE SOURCE customer_1 FROM KAFKA ... TOPIC 'customer1' ...;
CREATE SOURCE customer_2 FROM KAFKA ... TOPIC 'customer2' ...;
CREATE SOURCE customer_3 FROM KAFKA ... TOPIC 'customer3' ...;

You might like to create a view that unions together all these sources:

CREATE VIEW customers AS
    SELECT * FROM customer_1
    UNION ALL
    SELECT * FROM customer_2
    UNION ALL
    SELECT * FROM customer_3;

Now, what if you want to add a new customer? In this example it's easy to update the view...

CREATE OR REPLACE VIEW customers AS
    SELECT * FROM customer_1
    UNION ALL
    SELECT * FROM customer_2
    UNION ALL
    SELECT * FROM customer_3
    UNION ALL
    SELECT * FROM customer_4;

But that won't work if you've got views or sinks downstream of the customers view, because a view that other objects depend on can't be replaced.

Here's a proposal for how we could fix this, off the top of my head. We introduce a new "union" catalog object into which views or sources with identical schemas can be added and removed over time:

CREATE UNION customers FROM customer_1, customer_2, customer_3;
ALTER UNION customers ADD customer_4;
ALTER UNION customers DROP customer_1;

Internally this would look a lot like a rendition. We'd need a rendition shard that points at the underlying data shards; as views or sources are added to and removed from the union, the rendition shard would update its pointers to the underlying data shards.
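
To make the pointer idea concrete, here is a rough Python model of the semantics the proposal seems to imply. It is only a sketch under assumptions: `DataShard`, `UnionObject`, and the column names are made up for illustration and are not Materialize internals, and the real rendition design may look quite different. The point is that anything downstream would depend on the union object itself, so membership can change without redefining it.

```python
from dataclasses import dataclass, field

# A schema is modeled as ((column name, type name), ...). Purely illustrative.
Schema = tuple[tuple[str, str], ...]


@dataclass
class DataShard:
    """Hypothetical stand-in for the data behind one source or view."""
    name: str
    schema: Schema
    rows: list[tuple] = field(default_factory=list)


@dataclass
class UnionObject:
    """Hypothetical model of the proposed union: a list of pointers to shards."""
    name: str
    schema: Schema
    members: list[DataShard] = field(default_factory=list)

    def add(self, shard: DataShard) -> None:
        # Members must have exactly the union's schema.
        if shard.schema != self.schema:
            raise ValueError(f"{shard.name} does not match the schema of {self.name}")
        if any(m.name == shard.name for m in self.members):
            raise ValueError(f"{shard.name} is already in {self.name}")
        self.members.append(shard)

    def drop(self, name: str) -> None:
        remaining = [m for m in self.members if m.name != name]
        if len(remaining) == len(self.members):
            raise KeyError(name)
        # Only the pointer list changes; the member's data is untouched.
        self.members = remaining

    def read(self) -> list[tuple]:
        # UNION ALL semantics: concatenate member rows, keeping duplicates.
        return [row for m in self.members for row in m.rows]


schema = (("customer_id", "int"), ("event", "text"))  # made-up columns
shards = {n: DataShard(f"customer_{n}", schema, [(n, "signup")]) for n in range(1, 5)}

customers = UnionObject("customers", schema)
for n in (1, 2, 3):
    customers.add(shards[n])      # CREATE UNION customers FROM customer_1, ...

customers.add(shards[4])          # ALTER UNION customers ADD customer_4
customers.drop("customer_1")      # ALTER UNION customers DROP customer_1
print(customers.read())           # rows from customer_2, customer_3, customer_4
```

In this model a downstream view would hold a reference to `customers` rather than to any individual member, which is why adding or dropping a member never forces the dependent to be recreated.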

We've had requests for this both internally and externally over the years. Here are the latest:

I'm sure @frankmcsherry has some thoughts here too.

cc @ahelium @sjwiesman @andy

@benesch benesch added the C-feature Category: new feature or request label Jul 19, 2022
@andrioni
Contributor

If we are going to support unions like these, aren't we going to be close to supporting more complex ALTER VIEW statements that don't change the schema? How much do we gain from special casing UNION?

@benesch
Member Author

benesch commented Jul 21, 2022

> If we are going to support unions like these, aren't we going to be close to supporting more complex ALTER VIEW statements that don't change the schema? How much do we gain from special casing UNION?

It's a good question. I'm really not sure! It depends on how renditions turn out. I think there's a chance that supporting just unions is meaningfully easier than supporting arbitrary schema-preserving definition changes.
