Optimize DDL for CHAR(10) to VARCHAR(100) #40574

Open
dveeden opened this issue Jan 13, 2023 · 3 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. type/enhancement The issue or PR belongs to an enhancement.

Comments

@dveeden
Contributor

dveeden commented Jan 13, 2023

Feature Request

Changing a column from CHAR(n) to VARCHAR(m) currently goes through a full reorg. However, the CHAR value is guaranteed to fit in the VARCHAR as long as the charset remains the same and m >= n.

Here the source is a CHAR and the target is a VARCHAR, but this is likely also true for any other combination of these types and might also be true for VARBINARY etc.

The improvement would be to do this as a metadata-only change and to rewrite the actual rows only when they are next written.
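
To make the condition above concrete, here is a minimal Go sketch of the "guaranteed to fit" check. The `ColumnType` struct and `fitsWithoutRewrite` function are hypothetical names made up for illustration; they are not TiDB's actual types or DDL code, and the check covers only the CHAR to VARCHAR case described in this request.

```go
package main

import "fmt"

// ColumnType is a hypothetical, simplified column description.
// It is NOT TiDB's real type representation.
type ColumnType struct {
	Kind    string // "char" or "varchar"
	Length  int    // declared length in characters
	Charset string // e.g. "utf8mb4"
}

// fitsWithoutRewrite reports whether every value of the old column is
// guaranteed to fit in the new column: CHAR(n) to VARCHAR(m) with the
// same charset and m >= n.
func fitsWithoutRewrite(oldCol, newCol ColumnType) bool {
	return oldCol.Kind == "char" &&
		newCol.Kind == "varchar" &&
		oldCol.Charset == newCol.Charset &&
		newCol.Length >= oldCol.Length
}

func main() {
	from := ColumnType{Kind: "char", Length: 10, Charset: "utf8mb4"}
	to := ColumnType{Kind: "varchar", Length: 100, Charset: "utf8mb4"}
	fmt.Println(fitsWithoutRewrite(from, to)) // CHAR(10) -> VARCHAR(100)
	fmt.Println(fitsWithoutRewrite(to, from)) // reverse direction is rejected
}
```

The sketch deliberately rejects every other type combination, since the request only argues for the CHAR to VARCHAR direction.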

@mjonss
Contributor

mjonss commented Jan 17, 2023

It looks like secondary indexes use a different format for CHAR and VARCHAR (according to comment here).
We could still do this as a metadata-only change if the column is not included in any index.
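
As a rough illustration of the "not included in any index" condition, the Go sketch below checks whether a column appears in any index definition. The `Index` struct and `columnInAnyIndex` function are hypothetical and do not mirror TiDB's schema structures; column names are compared case-insensitively on the assumption that they follow MySQL-style naming rules.

```go
package main

import (
	"fmt"
	"strings"
)

// Index is a hypothetical, simplified index definition.
type Index struct {
	Name    string
	Columns []string
}

// columnInAnyIndex reports whether col is part of at least one index.
// Column names are compared case-insensitively.
func columnInAnyIndex(col string, indexes []Index) bool {
	for _, idx := range indexes {
		for _, c := range idx.Columns {
			if strings.EqualFold(c, col) {
				return true
			}
		}
	}
	return false
}

func main() {
	indexes := []Index{
		{Name: "idx_name", Columns: []string{"name"}},
		{Name: "idx_city_zip", Columns: []string{"city", "zip"}},
	}
	fmt.Println(columnInAnyIndex("zip", indexes))   // column is indexed
	fmt.Println(columnInAnyIndex("notes", indexes)) // column is not indexed
}
```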

@bb7133 bb7133 added type/enhancement The issue or PR belongs to an enhancement. and removed type/feature-request Categorizes issue or PR as related to a new feature. labels Jan 17, 2023
@bb7133
Member

bb7133 commented Jan 17, 2023

Thanks, @mjonss

Even for columns without a secondary index, we cannot convert VARCHAR to CHAR without data reorganization, because in TiDB the data in a CHAR column is potentially truncated by removing all trailing spaces.
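
The Go sketch below illustrates why the VARCHAR to CHAR direction cannot be metadata-only: a stored VARCHAR value that ends in spaces would differ from what CHAR would hold. `storeAsChar` and `needsRewrite` are hypothetical helpers that only model the trailing-space behaviour described above; they are not TiDB's storage code.

```go
package main

import (
	"fmt"
	"strings"
)

// storeAsChar models the CHAR behaviour described above: all trailing
// spaces are removed. Illustration only.
func storeAsChar(s string) string {
	return strings.TrimRight(s, " ")
}

// needsRewrite reports whether an existing VARCHAR value would change
// if the column were converted to CHAR.
func needsRewrite(varcharValue string) bool {
	return storeAsChar(varcharValue) != varcharValue
}

func main() {
	fmt.Printf("%q\n", storeAsChar("abc  ")) // "abc"
	fmt.Println(needsRewrite("abc  "))       // true
	fmt.Println(needsRewrite("abc"))         // false
}
```

Because any existing row might hold such a value, the rows have to be inspected and possibly rewritten in this direction.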

But for CHAR to VARCHAR without any index, I think this can be optimized. /cc @zimulala @Benjamin2037

@bb7133 bb7133 added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jan 17, 2023
@yiwen92
Contributor

yiwen92 commented Jan 18, 2023

> It looks like the secondary indexes have different format for char and varchar (according to comment here). We could still do this as metadata change if the column is not included in any index.

What exactly is the difference between CHAR and VARCHAR in index encoding?
// 2. char -> varchar: the index value encoding of secondary index on clustered primary key tables is different.
// These secondary indexes need to be rewritten.
