-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
executor: fix hash join between datetime and timestamp #25915
Conversation
No release note, Please follow https://github.com/pingcap/community/blob/master/contributors/release-note-checker.md |
Timestamp hash table cannot be encoded by utc zone, because it's not a persistent table, and used by hash join. Is that used by any other places? |
@wshwsh12 @tiancaiamao |
I can help to review it. |
I'll take a look. |
@@ -330,14 +330,7 @@ func encodeHashChunkRowIdx(sc *stmtctx.StatementContext, row chunk.Row, tp *type | |||
case mysql.TypeDate, mysql.TypeDatetime, mysql.TypeTimestamp: | |||
flag = uintFlag | |||
t := row.GetTime(idx) | |||
// Encoding timestamp need to consider timezone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here is right, the decoding process should take the timezone into consideration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we do not store the timezone information in the TiKV, all timestamp data is stored as UTC.
They are converted to UTC first, and then into a uint64
The read back process should convert the uint64 to the correct timestamp considering the timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timestamp hash table cannot be encoded by utc zone, because it's not a persistent table, and used by hash join. Is that used by any other places?
As I see, the hash table is not a persistent table, It's used to do join operator after being getted from kv and converted to the right timezone, other than store in kv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For instance, this sql can produce the right result, because we do not convert timezone when do join. Timezone convertion should be considered when it was getted from kv.
select /*+ MERGE_JOIN(tt1, tt2) */ * from tt1 inner join tt2 on tt1.ts=tt2.ts;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly
Here is what we need to change: Lines 333 to 340 in cb06e11
Lines 510 to 517 in cb06e11
Could you help us remove it in this pull request, and also add a test like that:
Bug fix should comes with a unit test to prevent regression. @sylzd |
Explain: tikv doesn't store the time_zone information, so the timestamp is converted to UTC first before stored.
The bug of that issue is, the loading to chunk process had already handle the time_zone converting, so when hash join key comparing, the converting should not be done again. Line 127 in cb06e11
|
yes, that's what I mean~ |
@mmyj PTAL~ |
executor/join_test.go
Outdated
|
||
func (s *testSuiteJoinSerial) TestIssue25902(c *C) { | ||
tk := testkit.NewTestKitWithInit(c, s.store) | ||
tk.MustExec("drop table if exists tt1,tt2; ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing tt3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed!
tk.MustExec("insert into tt3 values (\"2001-01-01 00:00:00\");") | ||
tk.MustQuery("select * from tt1 where ts in (select ts from tt2);").Check(testkit.Rows("2001-01-01 00:00:00")) | ||
tk.MustQuery("select * from tt1 where ts in (select ts from tt3);").Check(testkit.Rows("2001-01-01 00:00:00")) | ||
tk.MustExec("set @@session.time_zone = '+10:00';") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we reset to time_zone
to +08
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx, it fixed~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 66fa1bd
|
/label needs-cherry-pick-4.0 |
Signed-off-by: ti-srebot <[email protected]>
cherry pick to release-4.0 in PR #25988 |
Signed-off-by: ti-srebot <[email protected]>
cherry pick to release-5.0 in PR #25990 |
Signed-off-by: ti-srebot <[email protected]>
cherry pick to release-5.1 in PR #25991 |
What problem does this PR solve?
Issue Number: close #25902
Problem Summary:
What is changed and how it works?
Proposal: xxx
What's Changed:
How it Works:
Related changes
pingcap/docs
/pingcap/docs-cn
:Check List
Tests
Side effects
Release note