Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong value for replicating enum from mysql to Tidb #51908

Open
2b1sh opened this issue Mar 19, 2024 · 5 comments
Open

Wrong value for replicating enum from mysql to Tidb #51908

2b1sh opened this issue Mar 19, 2024 · 5 comments
Labels
compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) type/enhancement The issue or PR belongs to an enhancement.

Comments

@2b1sh
Copy link

2b1sh commented Mar 19, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Step 1 : Created the table in MySQL

mysql> show create table user\G
*************************** 1. row ***************************
       Table: user
Create Table: CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `age` int DEFAULT NULL,
  `gender` enum('male','female') DEFAULT NULL,
  `user_type` enum('AQ','AL') DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=81 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)

Step 2 : Inserted values into the same table

mysql> INSERT INTO user (name, age, gender, user_type) VALUES     ('John', 25, 'male', 'AQ'),     ('Alice', 30, 'female', 'AL');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

Step 3 : Migrated this table to TiDB using TiDB DM and checked it

Following is the output in TiDB

mysql> select * from user;
+----+-------+------+--------+-----------+
| id | name  | age  | gender | user_type |
+----+-------+------+--------+-----------+
|  1 | John  |   25 | male   | AQ        |
|  2 | Alice |   30 | female | AL        |
+----+-------+------+--------+-----------+
2 rows in set (0.00 sec)

Step 4 : Then I changed the datatype from enum to varchar(2) in TiDB
Because the ENUM is a experimental feature for TiFLash

mysql> alter table user modify column user_type Varchar(2);
Query OK, 0 rows affected (0.36 sec)
mysql> alter table user modify column gender Varchar(2);
Query OK, 0 rows affected (0.36 sec)

Step 5 : Again inserted data in MySQL

mysql> INSERT INTO user (name, age, gender, user_type) VALUES ('Emma', 28, 'female', 'AL'),     ('Mike', 40, 'male', 'AQ');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

Step 6 : Again checked it in TiDB

This is what I get as the output

mysql> select * from user;
+----+-------+------+--------+-----------+
| id | name  | age  | gender | user_type |
+----+-------+------+--------+-----------+
|  1 | John  |   25 | male   | AQ        |
|  2 | Alice |   30 | female | AL        |
|  3 | Emma  |   28 | 2      | 2         |
|  4 | Mike  |   40 | 1      | 1         |
+----+-------+------+--------+-----------+
4 rows in set (0.00 sec)

2. What did you expect to see? (Required)

I expect the same data as in the MySQL

mysql> select * from user;
+----+-------+------+--------+-----------+
| id | name  | age  | gender | user_type |
+----+-------+------+--------+-----------+
|  1 | John  |   25 | male   | AQ        |
|  2 | Alice |   30 | female | AL        |
|  3 | Emma  |   28 | female | AL        |
|  4 | Mike  |   40 | male   | AQ        |
+----+-------+------+--------+-----------+
4 rows in set (0.00 sec)

3. What did you see instead (Required)

Instead I get the data i have changed to Varchar as 1 and 2.

mysql> select * from user;
+----+-------+------+--------+-----------+
| id | name  | age  | gender | user_type |
+----+-------+------+--------+-----------+
|  1 | John  |   25 | male   | AQ        |
|  2 | Alice |   30 | female | AL        |
|  3 | Emma  |   28 | 2      | 2         |
|  4 | Mike  |   40 | 1      | 1         |
+----+-------+------+--------+-----------+
4 rows in set (0.00 sec)

4. What is your TiDB version? (Required)

mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v7.5.1
Edition: Community
Git Commit Hash: 7d16cc79e81bbf573124df3fd9351c26963f3e70
Git Branch: heads/refs/tags/v7.5.1
UTC Build Time: 2024-02-27 14:28:32
GoVersion: go1.21.6
Race Enabled: false
Check Table Before Drop: false
Store: tikv
1 row in set (0.00 sec)

@2b1sh 2b1sh added the type/bug The issue is confirmed as a bug. label Mar 19, 2024
@2b1sh 2b1sh changed the title Replicate Enum data MySQL to TiDB Wrong value for replicating enum from mysql to Tidb Mar 19, 2024
@dveeden
Copy link
Contributor

dveeden commented Mar 19, 2024

This is related to #51097

@lance6716
Copy link
Contributor

This is the expected behaviour. DM uses the numeric index of ENUM when replicating. So after the column type changed to VARCHAR, it begins to insert numbers. You can enable TiDB's general log to see the SQL DM uses.

And this is the repo for DM https://github.com/pingcap/tiflow/issues, you can open a feature request

@2b1sh
Copy link
Author

2b1sh commented Mar 20, 2024

Sure, I have raised the feature request #10824

@dveeden
Copy link
Contributor

dveeden commented Mar 20, 2024

I think there are multiple ways of solving this:

  1. Fix The Tiflash mode is not supporting for ENUM data type #51097 and have TiFlash deal with ENUMs
  2. Have DM send the ENUM values instead of IDs
  3. Find some workaround with VIEWs and/or generated columns
  4. Change this in the upstream MySQL to a VARCHAR or something similar
  5. Maybe change this in the upstream to a JSON datatype..

(this is just some brainstorming about this, not guaranteed to be a solution)

sql> CREATE TABLE t2 (id INT PRIMARY KEY, eid INT, e ENUM('foo','bar') AS (eid), et VARCHAR(255) AS (e));
Query OK, 0 rows affected (0.1597 sec)

sql> INSERT INTO t2 (id,eid) VALUES(1,1),(2,2);
Query OK, 2 rows affected (0.0117 sec)

Records: 2  Duplicates: 0  Warnings: 0

sql> TABLE t2;
+----+-----+-----+-----+
| id | eid | e   | et  |
+----+-----+-----+-----+
|  1 |   1 | foo | foo |
|  2 |   2 | bar | bar |
+----+-----+-----+-----+
2 rows in set (0.0008 sec)

sql> DESCRIBE t2;
+-------+-------------------+------+-----+---------+-------------------+
| Field | Type              | Null | Key | Default | Extra             |
+-------+-------------------+------+-----+---------+-------------------+
| id    | int(11)           | NO   | PRI | NULL    |                   |
| eid   | int(11)           | YES  |     | NULL    |                   |
| e     | enum('foo','bar') | YES  |     | NULL    | VIRTUAL GENERATED |
| et    | varchar(255)      | YES  |     | NULL    | VIRTUAL GENERATED |
+-------+-------------------+------+-----+---------+-------------------+
4 rows in set (0.0004 sec)

@seiya-annie seiya-annie added type/enhancement The issue or PR belongs to an enhancement. compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) and removed type/bug The issue is confirmed as a bug. labels Mar 21, 2024
@s4sukan
Copy link

s4sukan commented Mar 27, 2024

Option - 4, 5 - Right now, upstream MySQL has lot of dependencies so it's not possible to change it to char or JSON

Option - 3 - This is solving the enum issue but again virtual columns are not supported by TiFlash at the moment :(

MPP mode may be blocked because column dbname.table_name.report_type_tidb is a virtual column which is not supported now. |

Kind of blocked here, only one option is viable, convert these columns to char in TiDB, put a cron to update the actual value instead of index value and schedule this every minute. Do we have any timelines or future scope in Option- 1 or 2 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility-mysql8 This is a compatibility issue with MySQL 8.0(but NOT 5.7) type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

5 participants