Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect insert_allow_materialized_columns for INSERT into Distributed() #23349

Conversation

azat
Copy link
Collaborator

@azat azat commented Apr 20, 2021

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Respect insert_allow_materialized_columns (allows materialized columns) for INSERT into Distributed table.

Follow-up for: #5429
Refs: #16897

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Apr 20, 2021
@alexey-milovidov
Copy link
Member

Respect insert_allow_materialized_columns for INSERT into Distributed()

Also not a good enough changelog entry.
I will fix it.

@alexey-milovidov alexey-milovidov self-assigned this Apr 20, 2021
@azat azat force-pushed the dist-respect-insert_allow_materialized_columns branch from 7777c5f to f394431 Compare April 30, 2021 20:44
@azat
Copy link
Collaborator Author

azat commented Apr 30, 2021

Also not a good enough changelog entry.

@alexey-milovidov I've adjusted it a little

@alexey-milovidov
Copy link
Member

@Mergifyio update

@alexey-milovidov
Copy link
Member

@azat Do you think, maybe Distributed table should always insert into materialized columns?

@mergify
Copy link
Contributor

mergify bot commented May 10, 2021

Command update: success

Branch has been successfully updated

@azat
Copy link
Collaborator Author

azat commented May 10, 2021

@azat Do you think, maybe Distributed table should always insert into materialized columns?

This an option, but why? Especially if there is insert_allow_materialized_columns setting.
Plus underlying table will still require this setting, and the INSERT will fail

@alexey-milovidov
Copy link
Member

The user only see the structure of Distributed table - if it can successfully insert data into it, materialized columns in remote table should not prevent sending the data.

@azat
Copy link
Collaborator Author

azat commented May 12, 2021

materialized columns in remote table should not prevent sending the data.

So what you are suggesting?
Always avoid sending them into underlying table? (like w/o this PR)
Or force insert_allow_materialized_columns for INSERTs to the underlying table of Distributed?

@azat
Copy link
Collaborator Author

azat commented May 15, 2021

Functional stateless tests (release, DatabaseReplicated) — fail: 1, passed: 2921, skipped: 108
Integration tests (thread) — Timeout, fail: 1, passed: 1053, flaky: 0
PVS check — Have 2 new errors, 22 total errors
SQLancer test — SQLancer test run. See report

Unrelated.

Style Check — Style check failed.

Unrelated (it reports an issue in a file that this PR does not touches - PocoHTTPClient.cpp)

Yandex third-party checks (only for Yandex employees)

Does not looks related either.

@azat azat force-pushed the dist-respect-insert_allow_materialized_columns branch from 7c0c74f to 4d737a5 Compare May 20, 2021 04:40
@azat
Copy link
Collaborator Author

azat commented May 30, 2021

maybe Distributed table should always insert into materialized columns?

I have the following use case, I have something like distributed dictionary (since overall dictionary is quite large and will not fit into memory of one node).
To materialize values, I'm using Distributed table to deliver data into the shards, each shard contain part of that dictionary, and the column is materialized on the shard correctly (via Buffer engine with MATERIALIZED column).
However if Distributed table will materialize this column it will be incorrect.

Materialized columns can be removed from Distributed table, but then you cannot query them via Distributed table.
Yes you can do this via separate Distributed table, but looks a little bit hacky.

@alexey-milovidov alexey-milovidov merged commit 34d1206 into ClickHouse:master Jun 14, 2021
@azat azat deleted the dist-respect-insert_allow_materialized_columns branch June 14, 2021 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-improvement Pull request with some product improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants