Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Column alias is not propagated to subquery #42732

Closed
wolf31o2 opened this issue Mar 31, 2023 · 3 comments · Fixed by #45003
Closed

Column alias is not propagated to subquery #42732

wolf31o2 opened this issue Mar 31, 2023 · 3 comments · Fixed by #45003

Comments

@wolf31o2
Copy link

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

This is related to #26945 which was resolved in #33640 but is about column aliases.

DROP TABLE IF EXISTS t1, t2;
CREATE TABLE t1 (a INT, b INT);
CREATE TABLE t2 (a INT, b INT);
INSERT INTO t1 VALUES (1, 1);
INSERT INTO t2 VALUES (1, 1);

-- does not work (but works in mysql, and is similar to what hibernate generates)
SELECT one.a, one.b as b2 FROM t1 one ORDER BY (SELECT two.b FROM t2 two WHERE two.a = one.b);

2. What did you expect to see? (Required)

+------+
| a    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

3. What did you see instead (Required)

ERROR 1054 (42S22): Unknown column 'one.b' in 'where clause'

4. What is your TiDB version? (Required)

Release Version: v6.1.2
Edition: Community
Git Commit Hash: 5d2030e1d19629b71811c3f14514bad7ed63261a
Git Branch: heads/refs/tags/v6.1.2
UTC Build Time: 2022-10-18 08:04:48
GoVersion: go1.18.5
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

(Also, tested locally against 6.5.0 and the issue still persisted)

@wolf31o2 wolf31o2 added the type/bug The issue is confirmed as a bug. label Mar 31, 2023
@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Apr 3, 2023
@ghazalfamilyusa
Copy link
Contributor

Seems correlated subquery is broken for the order by clause. A workaround is to put the order by in the select list and use it in the order by.

explain SELECT one.a, one.b as b2, (SELECT two.b FROM t2 two WHERE two.a = one.b) order_field FROM t1 one order by order_field;
+----------------------------------+---------+-----------+---------------+---------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------+---------+-----------+---------------+---------------------------------+
| Sort_10 | 1.00 | root | | test.t2.b |
| └─Projection_12 | 1.00 | root | | test.t1.a, test.t1.b, test.t2.b |
| └─Apply_14 | 1.00 | root | | CARTESIAN left outer join |
| ├─TableReader_16(Build) | 1.00 | root | | data:TableFullScan_15 |
| │ └─TableFullScan_15 | 1.00 | cop[tikv] | table:one | keep order:false, stats:pseudo |
| └─MaxOneRow_17(Probe) | 1.00 | root | | |
| └─TableReader_20 | 0.00 | root | | data:Selection_19 |
| └─Selection_19 | 0.00 | cop[tikv] | | eq(test.t2.a, test.t1.b) |
| └─TableFullScan_18 | 1.00 | cop[tikv] | table:two | keep order:false, stats:pseudo |
+----------------------------------+---------+-----------+---------------+---------------------------------+

@winoros
Copy link
Member

winoros commented Jun 8, 2023

We split the column references into three parts:

  • The original column names from the table reference: col_set1
  • The original column names projected by projection: col_set2
  • The column aliases generated by projection: col_set3.

e.g. Table has three columns(a, b, c). And there's the SQL select a, b as bb from t ....

The where clause can only see the col_set1. (can see a, b, c)
The order by/having clause can see the col_set1+col_set2+col_set3. (can see a, b, c, and bb).
But the order by/having clause is entered after we build the where clause. So at that time, we'd already lost the information on col_set2 and col_set3.

We need to refactor the resolving logic of TiDB to solve the issue.

@winoros
Copy link
Member

winoros commented Jun 16, 2023

Currently, TiDB translates the AST to a logical plan by the following steps:

  • build the table references clause
  • build the where clause
  • build the order by clause
  • build the final projection(select fields) clause

When we build the where clause, we can only see the column set 1. So the columns will be projected to this scope to ensure the where clause will not see columns that are not visible to it.

But we will begin to build the order by clause after building the where clause. At this time, the column set 2 and 3, which is not visible before now, become visible. We must add them back and ensure the whole plan tree is valid. This is the problem we need to solve.

The previous pull #33640 tried one way but met this issue (and some other issues found by creating cases). Indicating that we need a more well-designed solution to solve the AST resolving issue.
/cc @wolf31o2 @fixdb

@winoros winoros added affects-6.1 affects-6.5 affects-7.2 and removed may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-6.1 may-affects-6.5 labels Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants