Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

import into from select #49883

Closed
7 tasks done
D3Hunter opened this issue Dec 28, 2023 · 7 comments
Closed
7 tasks done

import into from select #49883

D3Hunter opened this issue Dec 28, 2023 · 7 comments
Labels
type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@D3Hunter
Copy link
Contributor

D3Hunter commented Dec 28, 2023

Feature Request

Is your feature request related to a problem? Please describe:

“Insert into from select” statement do normal insert in one txn, the performance can not satisfy user expection(see #48637 too), so we add a similar IMPORT into from select which uses physical import same as lightning to speed up.
Describe the feature you'd like:

IMPORT into doesn't support transaction, and the target table must be a empty one. Selected rows will be encoded into kv, sorted and write to local disk, data will be ingest into tikv directly, just as what lightning physical mode does.

during prototype test, it will speed up 2~8 times compared to INSERT INTO depends on scenario and configured concurrency.

It also support import from snapshot, but only using as of timestamp 'xxx' inside the select-statement is supported, set tidb_snapshot is not.

the grammar:

IMPORT INTO table[column, column...] FROM select-statement [WITH thread=xx]

EXPLAIN import-into-statement

tasks:

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

@AndrewDi
Copy link
Contributor

Could SELECT Statement support tidb_snapshot?

@D3Hunter
Copy link
Contributor Author

Could SELECT Statement support tidb_snapshot?

not now, can you describe your usage scenario?

@AndrewDi
Copy link
Contributor

Could SELECT Statement support tidb_snapshot?

not now, can you describe your usage scenario?

We need to replicate a table with specific timestamp to a new table

@D3Hunter
Copy link
Contributor Author

D3Hunter commented Jan 29, 2024

Could SELECT Statement support tidb_snapshot?

not now, can you describe your usage scenario?

We need to replicate a table with specific timestamp to a new table

how frequent you do it?

and is it whole table replicate, i.e. select * from source_tbl?

what's the new table used for?

  • as a backup? why not use backup&restore tool?
  • as a persisted snapshot (avoid gc) that can be queried for analysis later?

@AndrewDi
Copy link
Contributor

Could SELECT Statement support tidb_snapshot?

not now, can you describe your usage scenario?

We need to replicate a table with specific timestamp to a new table

how frequent you do it?

and is it whole table replicate, i.e. select * from source_tbl?

what's the new table used for?

  • as a backup? why not use backup&restore tool?
  • as a persisted snapshot (avoid gc) that can be queried for analysis later?

maybe everyday, we need to update this tables, so read only snapshot may not meet our requirement.

@D3Hunter
Copy link
Contributor Author

D3Hunter commented Feb 2, 2024

Could SELECT Statement support tidb_snapshot?

you can try using nightly build, with import into dst from select * from src as of timestamp '2024-01-31 16:22:33.516399'; now

@D3Hunter
Copy link
Contributor Author

D3Hunter commented Apr 2, 2024

released in 8.0

@D3Hunter D3Hunter closed this as completed Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

2 participants