-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
planner: fix incorrect estRows with global index and json column #55842
Conversation
Signed-off-by: Weizhen Wang <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #55842 +/- ##
=================================================
- Coverage 72.8695% 56.7427% -16.1269%
=================================================
Files 1588 1759 +171
Lines 443554 647686 +204132
=================================================
+ Hits 323216 367515 +44299
- Misses 100456 255197 +154741
- Partials 19882 24974 +5092
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
Signed-off-by: Weizhen Wang <[email protected]>
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the root cause here is that the global index (idx_10
and idx_9
) here is not analyzed actually, but they are marked as analyzed (according to ColAndIdxExistenceMap
) and fully loaded (according to StatsLoadedStatus
).
So we will try to use this here, then the idxStats.TotalRowCount()
is 0 in GetScaledRealtimeAndModifyCnt()
, causing the problem.
The wrong result has been fixed. BTW, |
MV index has stats, but |
Signed-off-by: Weizhen Wang <[email protected]>
I have updated it again. |
Signed-off-by: Weizhen Wang <[email protected]>
/retest |
Signed-off-by: Weizhen Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please restore tidb_enable_global_index
after the tests.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: qw4990, time-and-fate The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
/hold |
Signed-off-by: Weizhen Wang <[email protected]>
/unhold |
/retest |
/cherrypick release-8.1 |
@hawkingrei: new pull request created to branch In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
Signed-off-by: ti-chi-bot <[email protected]>
Signed-off-by: ti-chi-bot <[email protected]>
@hawkingrei: new pull request created to branch In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
* origin/master: (33 commits) build(deps): bump github.com/prometheus/common from 0.55.0 to 0.57.0 (pingcap#55762) build(deps): bump golang.org/x/sys from 0.24.0 to 0.25.0 (pingcap#55894) planner: fix incorrect estRows with global index and json column (pingcap#55842) ddl, stat: switch to new struct for add/truncate/drop partition (pingcap#55893) planner: hide instance plan cache eviction log if no plan is evicted (pingcap#55918) expression: support tidb encode key function (pingcap#51678) planner: fix incorrect maintenance of `handleColHelper` for recursive CTE (pingcap#55732) executor: some code refine of hash join v2 (pingcap#55887) infoschema, meta: fix wrong auto id after `rename table` (pingcap#55847) ddl/ingest: set `minCommitTS` when detect remote duplicate keys (pingcap#55588) planner: move index advisor into the kernel (pingcap#55874) ddl, stats: switch to new struct for create/truncate table (pingcap#55811) executor: avoid new small objects in probe stage of hash join v2 (pingcap#55855) *: Add tidbCPU/tikvCPU into system tables (pingcap#55455) ddl: use static contexts in `NewReorgCopContext` (pingcap#55823) executor: fix index out of range bug in hash join v2 (pingcap#55864) executor: record index usage for the clustered primary keys (pingcap#55602) domain: load all non-public tables into infoschema (pingcap#55142) test: fix unstable TestShowViewPriv (pingcap#55868) ttl: add `varbinary` case for `TestSplitTTLScanRangesWithBytes` (pingcap#55863) ...
What problem does this PR solve?
Issue Number: close #55818
Problem Summary:
What changed and how does it work?
Because the JSON column will be without stats by default, the row count will be 0. But when calculating selectivity, we cannot divide by zero. it will get NaN row count.
why is the result wrong?
It will return
NaN
fromSelectivity
and break all stats info byds.TableStats.Scale(selectivity)
Final, The problem will be created at the
joinReorderGreedySolver.solve
->constructConnectedJoinTree
checkConnectionAndMakeJoin
will create a right join plan. but when to calculate the cost, it will getNaN
because of the pollute stats info. we cannot get best join because ofNan > math.MaxFloat64
.makeBushyJoin
will not consider theEqCondtion
. Many condition will lose. This is the reason why it will get wrong result.Check List
Tests
Due to incorrect statistics, the filter condition was completely removed.
before
after
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.