-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plan: change the logic of converting to index join #7553
Conversation
expression/util.go
Outdated
intSet := map[int]struct{}{} | ||
for _, col := range s1 { | ||
intSet[col.UniqueID] = struct{}{} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UniqueID is changed to int64 in #7385, should modify the map definition here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general
plan/physical_plan_test.go
Outdated
{ | ||
sql: "select /*+ TIDB_INLJ(t1) */ * from t t1 join t t2 where t1.a=t2.a and t1.a in(1, 2) and t2.a in (1, 2)", | ||
best: "IndexJoin{TableReader(Table(t))->TableReader(Table(t)->Sel([in(t2.a, 1, 2)]))}(t1.a,t2.a)", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this test sql would not trigger the code path of this patch, i.e, getIndexJoinByOuterIdx -> buildRangeForIndexJoin
, because column a
is handle, it would follow getIndexJoinByOuterIdx -> constructInnerTableScan
. Should we change the column a
in the sql to column b
instead? or am I missing anything?
expression/util.go
Outdated
|
||
// ColumnSliceIntersect intersects two column slice. | ||
// You need to make sure that at least each element in s2 is unique. | ||
func ColumnSliceIntersect(s1, s2 []*Column) []*Column { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller only wants to know if there is any intersect, so just return bool
is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, changed.
plan/exhaust_physical_plans.go
Outdated
@@ -527,7 +527,7 @@ func (p *LogicalJoin) buildRangeForIndexJoin(indexInfo *model.IndexInfo, innerPl | |||
// After constant propagation, there won'be cases that t1.a=t2.a and t2.a=1 occur in the same time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to update this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems it was not updated.
plan/exhaust_physical_plans.go
Outdated
} | ||
|
||
usableKeys := make([]*expression.Column, 0, len(keys)) | ||
|
||
// After predicate push down, the one side conditions of join must be the conditions that cannot be pushed down and | ||
// cannot calculate range either. So we only need the innerPlan.pushedDownConds and the eq conditions that we generate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be updated, no innerPlan
now.
plan/exhaust_physical_plans.go
Outdated
// Int datum 1 can convert to all column's type(numeric type, string type, json, time type, enum, set) safely. | ||
fakeConstant := &expression.Constant{Value: types.NewIntDatum(1), RetType: key.GetType()} | ||
eqFunc := expression.NewFunctionInternal(p.ctx, ast.EQ, types.NewFieldType(mysql.TypeTiny), key, fakeConstant) | ||
conds = append(conds, eqFunc) | ||
eqConds = append(eqConds, eqFunc) | ||
} | ||
|
||
conds = append(conds, innerPlan.pushedDownConds...) | ||
return conds, eqConds, keyOff2IdxOff | ||
remained = make([]expression.Expression, 0, len(innerFilters)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need a comment for this code block.
@@ -540,36 +540,48 @@ func (p *LogicalJoin) buildRangeForIndexJoin(indexInfo *model.IndexInfo, innerPl | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when will expression.Contains(accesses, eqCond)
be false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t1.a=t2.a and t1.c=t2.c
, and index (t1.a, t1.b, t1.c). This case currently won't use index join.
plan/exhaust_physical_plans.go
Outdated
// And if there're cases like t1.a=t2.a and t1.a > 1, we can also guarantee that t1.a > 1 won't be chosen as access condition. | ||
// So DetachCondAndBuildRangeForIndex won't miss the equal conditions we generate. | ||
ranges, accesses, remained, _, err := ranger.DetachCondAndBuildRangeForIndex(p.ctx, conds, idxCols, colLengths) | ||
// In `buildFakeEqCondsForIndexJoin`, we construct the equal cond for join keys and remove filters that containing the join keys' column. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about:
In buildFakeEqCondsForIndexJoin
, we construct the equal conditions for join keys and remove filters that contain the join keys' column.
When t1.a = t2.a and t1.a > 1, we can also guarantee that t1.a > 1 won't be chosen as the access condition.
So the equal conditions we built can be successfully used to build a range if they can be used. They won't be affected by the existing filters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remained = append(remained, filter) | ||
continue | ||
} | ||
conds = append(conds, filter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this filter doesn't contain the column of join keys, can we just move all the filters in innerFilters
to remained
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t1.a=t2.a and t1.b=1 index is (a, b).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-all-tests |
What problem does this PR solve?
Race occurs in race test. It's caused by the sql added in the unit-test.
Original sql is optimized to that one after #7276
Fix #7551
In such case, the index join may just be as well as hash join, not better than it. And even worse since it has redundant expression evaluation.
This pr mainly to make the hint work in this case.
What is changed and how it works?
Make the filters that index join uses to build range more strict.
Check List
Tests