Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ast, parser: fix parse join #1129

Merged
merged 4 commits into from
Dec 21, 2020
Merged

ast, parser: fix parse join #1129

merged 4 commits into from
Dec 21, 2020

Conversation

wjhuang2016
Copy link
Member

Signed-off-by: wjhuang2016 [email protected]

What problem does this PR solve?

Fix pingcap/tidb#20563
We can't parse t1 join t2 join t3 on xxx on xxx.
Because the order is: (t1 join t2 join t3 on xxx) on xxx. What we need is t1 join (t2 join t3 on xxx) on xxx

What is changed and how it works?

  1. Make join straightJoin inner cross left right full natural below tableRefPriority, which is the same as MySQL.
  2. Implement NewCrossJoin(), here is its logic:
// NewCrossJoin builds a cross join without `on` or `using` clause.
// If the right child is a join tree, we need to handle it differently to make the precedence get right.
// Here is the example: t1 join t2 join t3
//                 JOIN ON t2.a = t3.a
//  t1    join    /    \
//              t2      t3
// (left)         (right)
//
// We can not build it directly to:
//         JOIN
//        /    \
//       t1	   JOIN ON t2.a = t3.a
//             /   \
//            t2    t3
// The precedence would be t1 join (t2 join t3 on t2.a=t3.a), not (t1 join t2) join t3 on t2.a=t3.a
// We need to find the left-most child of the right child, and build a cross join of the left-hand side
// of the left child(t1), and the right hand side with the original left-most child of the right child(t2).
//          JOIN t2.a = t3.a
//         /    \
//       JOIN    t3
//       /  \
//      t1  t2
// Besides, if the right handle side join tree's join type is right join, we need to rewrite it to left join.
// So t1 join t2 right join t3 would be rewrite to t1 join t3 left join t2.
// If not, t1 join (t2 right join t3) would be (t1 join t2) right join t3. After rewrite the right join to left join.
// We get (t1 join t3) left join t2, the semantics is correct.

Check List

Tests

  • Unit test

Code changes

Side effects

Related changes

Signed-off-by: wjhuang2016 <[email protected]>
@wjhuang2016 wjhuang2016 added the type/bug-fix fix bug label Dec 17, 2020
Copy link
Member

@winoros winoros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm


/* A dummy token to force the priority of TableRef production in a join. */
%left tableRefPriority
%left join straightJoin inner cross left right full natural
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be easier to make the JOIN operator right-associative? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewCrossJoin() will do the reorganization, make it left-associative.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean if you use %right here perhaps you don't need that NewCrossJoin() function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then t1 right join t2 right join t3 would be t1 right join (t2 right join t3), it's not correct.

Copy link
Contributor

@kennytm kennytm Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that incorrect? a RIGHT JOIN b RIGHT JOIN c ON w1 ON w2 is indeed equivalent to a RIGHT JOIN (b RIGHT JOIN c ON w1) ON w2,
and the same for LEFT JOIN (a LEFT JOIN b LEFT JOIN c ON w1 ON w2 == a LEFT JOIN (b LEFT JOIN c ON w1) ON w2)

this is different from a RIGHT JOIN b ON w1 RIGHT JOIN c ON w2 which means (a RIGHT JOIN b ON w1) RIGHT JOIN c ON w2

however NATURAL JOIN needs to be left-associative (a NATURAL JOIN b NATURAL JOIN c == (a NATURAL JOIN b) NATURAL JOIN c)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the example is wrong.
Consider select * from t1 join t2 right join t3 on t2.a=t3.a;, if using right-associative, we got
select * from t1 join (t2 right join t3 on t2.a=t3.a). They are not the same.

select * from t1 join t2 join t3;
mysql> select * from t1 join t2 right join t3 on t2.a=t3.a;
+------+------+------+
| a    | a    | a    |
+------+------+------+
| NULL | NULL |    3 |
+------+------+------+
1 row in set (0.00 sec)

mysql> select * from t1 join (t2 right join t3 on t2.a=t3.a);
+------+------+------+
| a    | a    | a    |
+------+------+------+
|    1 | NULL |    3 |
+------+------+------+
1 row in set (0.00 sec)

Signed-off-by: wjhuang2016 <[email protected]>
Signed-off-by: wjhuang2016 <[email protected]>
Copy link
Contributor

@tangenta tangenta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-srebot ti-srebot removed the status/LGT1 LGT1 label Dec 21, 2020
@ti-srebot ti-srebot added the status/LGT2 LGT2 label Dec 21, 2020
@kennytm
Copy link
Contributor

kennytm commented Dec 21, 2020

/merge

@ti-srebot
Copy link
Contributor

/run-all-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

You have an error in your SQL syntax near "ON alias4.f2 = alias11.f2"
5 participants