Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

select distinct produces unexpected result #53726

Closed
r33s3n6 opened this issue May 31, 2024 · 5 comments · Fixed by #54067
Closed

select distinct produces unexpected result #53726

r33s3n6 opened this issue May 31, 2024 · 5 comments · Fixed by #54067

Comments

@r33s3n6
Copy link

r33s3n6 commented May 31, 2024

1. Minimal reproduce step (Required)

create table t1 (c1 int primary key);
insert into t1 (c1) values (575932053), (-258025139);

SELECT DISTINCT
  cast(c1 as decimal) as c3,
  cast(c1 as signed) as c4
FROM t1;

2. What did you expect to see? (Required)

The two columns should be same, in MySQL:

mysql> SELECT DISTINCT 
    ->   cast(c1 as decimal) as c3,
    ->   cast(c1 as signed) as c4
    -> FROM t1;
+------------+------------+
| c3         | c4         |
+------------+------------+
| -258025139 | -258025139 |
|  575932053 |  575932053 |
+------------+------------+
2 rows in set (0.00 sec)

3. What did you see instead (Required)

But TiDB outputs strange number.

mysql> SELECT DISTINCT 
    ->   cast(c1 as decimal) as c3,
    ->   cast(c1 as signed) as c4
    -> FROM t1;
+------------+---------------------+
| c3         | c4                  |
+------------+---------------------+
| -258025139 | 1108209533567631369 |
|  575932053 |                   0 |
+------------+---------------------+
2 rows in set (0.01 sec)

After removing distinct, the result is correct:

mysql> SELECT
    ->   cast(c1 as decimal) as c3,
    ->   cast(c1 as signed) as c4
    -> FROM t1;
+------------+------------+
| c3         | c4         |
+------------+------------+
| -258025139 | -258025139 |
|  575932053 |  575932053 |
+------------+------------+
2 rows in set (0.00 sec)

4. What is your TiDB version? (Required)

Release Version: v8.1.0
Edition: Community
Git Commit Hash: 945d07c5d5c7a1ae212f6013adfb187f2de24b23
Git Branch: HEAD
UTC Build Time: 2024-05-21 03:51:57
GoVersion: go1.21.10
Race Enabled: false
Check Table Before Drop: false
Store: tikv

topology:

distributed.yaml:

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"

pd_servers:
  - host: 10.0.2.31

tidb_servers:
  - host: 10.0.2.21

tikv_servers:
  - host: 10.0.2.11
  - host: 10.0.2.12
  - host: 10.0.2.13

monitoring_servers:
  - host: 10.0.2.8

grafana_servers:
  - host: 10.0.2.8

alertmanager_servers:
  - host: 10.0.2.8

tiflash_servers:
  - host: 10.0.2.32

about us

We are the BASS team from the School of Cyber Science and Technology at Beihang University. Our main focus is on system software security, operating systems, and program analysis research, as well as the development of automated program testing frameworks for detecting software defects. Using our self-developed database vulnerability testing tool, we have identified the above-mentioned vulnerabilities in TiDB that may lead to database logic error.

@r33s3n6 r33s3n6 added the type/bug The issue is confirmed as a bug. label May 31, 2024
@hawkingrei hawkingrei self-assigned this Jun 3, 2024
@hawkingrei hawkingrei added sig/execution SIG execution and removed sig/planner SIG: Planner labels Jun 13, 2024
@zanmato1984
Copy link
Contributor

This reproduction has some randomness. Here's what I found.

If you execute the following statements real quick (e.g. you can execute them all at once using single statement or using scripts):

tidb> create table t7(c int); insert into t7 values (575932053), (-258025139); select distinct cast(c as decimal), cast(c as signed) from t7;
Query OK, 0 rows affected (0.11 sec)

Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

+--------------------+---------------------+
| cast(c as decimal) | cast(c as signed)   |
+--------------------+---------------------+
|          575932053 |                   0 |
|         -258025139 | 1108209533567631369 |
+--------------------+---------------------+
2 rows in set (0.00 sec)

You get the wrong result. However if you execute the query again a while later, you get the right result:

tidb> select distinct cast(c as decimal), cast(c as signed) from t7;
+--------------------+-------------------+
| cast(c as decimal) | cast(c as signed) |
+--------------------+-------------------+
|         -258025139 |        -258025139 |
|          575932053 |         575932053 |
+--------------------+-------------------+
2 rows in set (0.00 sec)

@zanmato1984
Copy link
Contributor

Now let's capture the plan for the wrong result:

tidb> create table t8(c int); insert into t8 values (575932053), (-258025139); explain select distinct cast(c as decimal), cast(c as signed) from t8; select distinct cast(c as decimal), cast(c as signed) from t8;
Query OK, 0 rows affected (0.11 sec)

Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

+---------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------+
| id                        | estRows  | task      | access object | operator info                                                                                       |
+---------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------+
| HashAgg_8                 | 8000.00  | root      |               | group by:Column#7, Column#8, funcs:firstrow(Column#7)->Column#3, funcs:firstrow(Column#7)->Column#4 |
| └─TableReader_9           | 8000.00  | root      |               | data:HashAgg_4                                                                                      |
|   └─HashAgg_4             | 8000.00  | cop[tikv] |               | group by:cast(test.t8.c, bigint(22) BINARY), cast(test.t8.c, decimal(10,0) BINARY),                 |
|     └─TableFullScan_7     | 10000.00 | cop[tikv] | table:t8      | keep order:false, stats:pseudo                                                                      |
+---------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

+--------------------+---------------------+
| cast(c as decimal) | cast(c as signed)   |
+--------------------+---------------------+
|         -258025139 | 1108209533567631369 |
|          575932053 |                   0 |
+--------------------+---------------------+
2 rows in set (0.00 sec)

This part funcs:firstrow(Column#7)->Column#3, funcs:firstrow(Column#7)->Column#4 is clearly wrong. It's firstrowing Column#7 twice.

Meanwhile the correct plan:

tidb> explain select distinct cast(c as decimal), cast(c as signed) from t8; select distinct cast(c as decimal), cast(c as signed) from t8;
+----------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                         | estRows | task      | access object | operator info                                                                                                                                                                                    |
+----------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_6                  | 1.60    | root      |               | group by:Column#13, Column#14, funcs:firstrow(Column#11)->Column#3, funcs:firstrow(Column#12)->Column#4                                                                                          |
| └─Projection_12            | 2.00    | root      |               | cast(test.t8.c, decimal(10,0) BINARY)->Column#11, cast(test.t8.c, bigint(22) BINARY)->Column#12, cast(test.t8.c, decimal(10,0) BINARY)->Column#13, cast(test.t8.c, bigint(22) BINARY)->Column#14 |
|   └─TableReader_11         | 2.00    | root      |               | data:TableFullScan_10                                                                                                                                                                            |
|     └─TableFullScan_10     | 2.00    | cop[tikv] | table:t8      | keep order:false, stats:pseudo                                                                                                                                                                   |
+----------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.01 sec)

+--------------------+-------------------+
| cast(c as decimal) | cast(c as signed) |
+--------------------+-------------------+
|         -258025139 |        -258025139 |
|          575932053 |         575932053 |
+--------------------+-------------------+
2 rows in set (0.00 sec)

@zanmato1984 zanmato1984 added sig/planner SIG: Planner and removed sig/execution SIG execution labels Jun 14, 2024
@zanmato1984
Copy link
Contributor

I have another question, maybe unrelated to this bug, about the correct plan. Why is this projection Projection_12 | 2.00 | root | | cast(test.t8.c, decimal(10,0) BINARY)->Column#11, cast(test.t8.c, bigint(22) BINARY)->Column#12, cast(test.t8.c, decimal(10,0) BINARY)->Column#13, cast(test.t8.c, bigint(22) BINARY)->Column#14 doing every expression twice - one for group by key and the other for agg function? Isn't this unnecessary duplication? @elsa0520 @fixdb

@hawkingrei
Copy link
Member

hawkingrei commented Jun 16, 2024

I have another question, maybe unrelated to this bug, about the correct plan. Why is this projection Projection_12 | 2.00 | root | | cast(test.t8.c, decimal(10,0) BINARY)->Column#11, cast(test.t8.c, bigint(22) BINARY)->Column#12, cast(test.t8.c, decimal(10,0) BINARY)->Column#13, cast(test.t8.c, bigint(22) BINARY)->Column#14 doing every expression twice - one for group by key and the other for agg function? Isn't this unnecessary duplication? @elsa0520 @fixdb

I am investigating this problem. I find that if we rewrite the SQL, it will get more simple planner.

> explain select cast(c as decimal), cast(c as signed) from t7 group by c

+-------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------+
| id                      | estRows  | task      | access object | operator info                                                                                 |
+-------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------+
| Projection_4            | 8000.00  | root      |               | cast(test.t7.c, decimal(10,0) BINARY)->Column#3, cast(test.t7.c, bigint(22) BINARY)->Column#4 |
| └─HashAgg_9             | 8000.00  | root      |               | group by:test.t7.c, funcs:firstrow(test.t7.c)->test.t7.c                                      |
|   └─TableReader_10      | 8000.00  | root      |               | data:HashAgg_5                                                                                |
|     └─HashAgg_5         | 8000.00  | cop[tikv] |               | group by:test.t7.c,                                                                           |
|       └─TableFullScan_8 | 10000.00 | cop[tikv] | table:t7      | keep order:false, stats:pseudo                                                                |
+-------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------+
> explain  select  cast(c as decimal) as a,  cast(c as signed) as b  from t7 group by a,b

+-------------------------+----------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------+
| id                      | estRows  | task      | access object | operator info                                                                                                           |
+-------------------------+----------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------+
| Projection_4            | 8000.00  | root      |               | cast(test.t7.c, decimal(10,0) BINARY)->Column#3, cast(test.t7.c, bigint(22) BINARY)->Column#4                           |
| └─HashAgg_9             | 8000.00  | root      |               | group by:Column#7, Column#8, funcs:firstrow(Column#9)->test.t7.c                                                        |
|   └─TableReader_10      | 8000.00  | root      |               | data:HashAgg_5                                                                                                          |
|     └─HashAgg_5         | 8000.00  | cop[tikv] |               | group by:cast(test.t7.c, bigint(22) BINARY), cast(test.t7.c, decimal(10,0) BINARY), funcs:firstrow(test.t7.c)->Column#9 |
|       └─TableFullScan_8 | 10000.00 | cop[tikv] | table:t7      | keep order:false, stats:pseudo                                                                                          |
+-------------------------+----------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------+

(END)



@hawkingrei
Copy link
Member

I have another question, maybe unrelated to this bug, about the correct plan. Why is this projection Projection_12 | 2.00 | root | | cast(test.t8.c, decimal(10,0) BINARY)->Column#11, cast(test.t8.c, bigint(22) BINARY)->Column#12, cast(test.t8.c, decimal(10,0) BINARY)->Column#13, cast(test.t8.c, bigint(22) BINARY)->Column#14 doing every expression twice - one for group by key and the other for agg function? Isn't this unnecessary duplication? @elsa0520 @fixdb

I have fixed it in the #54163. Some same projections forget to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants