-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
executor, util: vectorize hash calculation during probing #12669
Conversation
Codecov Report
@@ Coverage Diff @@
## master #12669 +/- ##
===========================================
Coverage 80.4487% 80.4487%
===========================================
Files 468 468
Lines 112898 112898
===========================================
Hits 90825 90825
Misses 15167 15167
Partials 6906 6906 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
I wonder why there is no performance gain.
How many rows in the outer table in your test?
PTAL @sduzh
executor/join.go
Outdated
|
||
hCtx.initHash(outerChk.NumRows()) | ||
for _, i := range hCtx.keyColIdx { | ||
err = codec.HashChunkSelected(e.rowContainer.sc, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err = codec.HashChunkSelected(e.rowContainer.sc, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected) | |
err = codec.HashChunkSelected(e.ctx, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of HashChunkColumns
relies on HashChunkSelected
.
It is inconvenient to pass sessionctx.Context
from HashChunkColumns
.
Hi, @sduzh May you help do a little more profiling? Use
Also, you may just target this test |
Thanks for your advice, I will try it tonight. |
@SunRunAway
results on master branch
|
It looks like the cost of hash calculation takes only a small fraction of the total cost of the hash probing phase, less than 1% in both branches. |
Sorry for the late reply, I'm really busy in last week. And thanks for your work, the result and conclusion you post is really helpful. tidb/executor/benchmark_test.go Line 542 in cc991d9
The column of index 1 is a string type, which the size of it is 5K). Could you try to design a new test case? |
Benchmark Report
@@ Benchmark Diff @@
================================================================================
--- tidb: 5d5497bfeb4172ac97016ea632f0c36b1a6d8457
+++ tidb: b5187fd172df965f11c33d672e6e513dbad0fe39
tikv: 57aa7ba9edeb4132ba8bcef8ec916952ad343b0f
pd: 92a4c01b346e5190f4f884c4fe01af913a343a6c
================================================================================
oltp_update_non_index:
* QPS: 4863.92 ± 0.24% (std=7.83) delta: -0.11% (p=0.373)
* Latency p50: 26.31 ± 0.24% (std=0.04) delta: 0.11%
* Latency p99: 42.64 ± 5.32% (std=1.62) delta: 3.72%
oltp_insert:
* QPS: 3804.40 ± 0.89% (std=27.93) delta: -0.87% (p=0.171)
* Latency p50: 33.64 ± 0.88% (std=0.25) delta: 0.88%
* Latency p99: 78.13 ± 1.20% (std=0.66) delta: 0.30%
oltp_read_write:
* QPS: 16263.35 ± 0.32% (std=42.76) delta: 0.04% (p=0.836)
* Latency p50: 157.78 ± 0.35% (std=0.37) delta: 0.03%
* Latency p99: 313.07 ± 2.27% (std=4.69) delta: -2.23%
oltp_update_index:
* QPS: 4369.31 ± 0.23% (std=7.19) delta: 0.09% (p=0.807)
* Latency p50: 29.29 ± 0.22% (std=0.04) delta: -0.11%
* Latency p99: 55.84 ± 3.64% (std=1.24) delta: 1.23%
oltp_point_select:
* QPS: 37668.72 ± 0.45% (std=129.26) delta: 0.16% (p=0.612)
* Latency p50: 3.40 ± 0.44% (std=0.01) delta: -0.15%
* Latency p99: 10.65 ± 0.00% (std=0.00) delta: 0.00%
|
No problem. |
Benchmark Report
@@ Benchmark Diff @@
================================================================================
--- tidb: c6d284e1de82635e2822563ee59290ce2ccc32fc
+++ tidb: b5187fd172df965f11c33d672e6e513dbad0fe39
tikv: 57aa7ba9edeb4132ba8bcef8ec916952ad343b0f
pd: 92a4c01b346e5190f4f884c4fe01af913a343a6c
================================================================================
oltp_update_non_index:
* QPS: 4866.90 ± 0.23% (std=7.34) delta: -0.03% (p=0.729)
* Latency p50: 26.30 ± 0.22% (std=0.04) delta: 0.04%
* Latency p99: 41.68 ± 4.11% (std=1.12) delta: 1.41%
oltp_insert:
* QPS: 4679.99 ± 0.31% (std=10.46) delta: 0.23% (p=0.649)
* Latency p50: 27.34 ± 0.32% (std=0.06) delta: -0.25%
* Latency p99: 51.29 ± 4.99% (std=2.11) delta: 2.68%
oltp_read_write:
* QPS: 15129.68 ± 0.15% (std=15.67) delta: -0.27% (p=0.919)
* Latency p50: 169.56 ± 0.12% (std=0.15) delta: 0.31%
* Latency p99: 318.26 ± 1.20% (std=2.70) delta: 0.00%
oltp_update_index:
* QPS: 4373.52 ± 0.20% (std=6.79) delta: -0.01% (p=0.410)
* Latency p50: 29.26 ± 0.19% (std=0.05) delta: -0.03%
* Latency p99: 54.84 ± 3.56% (std=1.20) delta: 0.02%
oltp_point_select:
* QPS: 43412.50 ± 0.79% (std=207.43) delta: -0.16% (p=0.726)
* Latency p50: 2.95 ± 0.76% (std=0.01) delta: 0.08%
* Latency p99: 9.50 ± 1.19% (std=0.08) delta: -1.03%
|
Friendly ping, is this PR still work in process? @sduzh |
Sorry for the late reply.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-all-tests |
Great job, you can post the newest benchmark result into the PR description. |
/bench +tpch |
Benchmark Report
@@ Benchmark Diff @@
================================================================================
--- tidb: 08d26a3be1b88ccfd277b134f9f1687acab16a11
+++ tidb: f6d0b4df3ed569fdbe3606b8f428a5cac6d7a99c
tikv: a9c108185f79e2fd7dfbfcaad762b9413c99ce8a
pd: 17383d9ffdfd47baec03c6166642e7b137d19840
================================================================================
01.sql duration: 19842.79 ms ± 1.37% (std=272.12) delta: -5.22% (p=0.553)
02.sql duration: 7831.95 ms ± 1.83% (std=143.13) delta: -1.67% (p=0.623)
03.sql duration: 18403.03 ms ± 0.24% (std=43.40) delta: 0.44% (p=0.867)
04.sql duration: 8190.27 ms ± 2.16% (std=177.18) delta: 4.82% (p=0.237)
06.sql duration: 11367.20 ms ± 0.65% (std=74.13) delta: -4.30% (p=0.086)
07.sql duration: 18114.63 ms ± 2.23% (std=403.27) delta: -2.37% (p=0.467)
08.sql duration: 12945.07 ms ± 1.17% (std=151.50) delta: -3.43% (p=0.199)
09.sql duration: 31994.01 ms ± 0.49% (std=155.56) delta: -3.84% (p=0.123)
10.sql duration: 10331.48 ms ± 1.18% (std=122.01) delta: -3.25% (p=0.199)
11.sql duration: 3765.94 ms ± 3.92% (std=147.75) delta: 2.52% (p=0.644)
12.sql duration: 13707.02 ms ± 3.08% (std=422.02) delta: 0.89% (p=0.836)
13.sql duration: 9089.68 ms ± 1.88% (std=170.87) delta: 0.15% (p=0.955)
14.sql duration: 14037.29 ms ± 1.60% (std=224.35) delta: 2.24% (p=0.389)
15.sql duration: 23276.33 ms ± 0.69% (std=161.64) delta: -2.99% (p=0.142)
16.sql duration: 3486.62 ms ± 3.04% (std=106.01) delta: -7.43% (p=0.628)
17.sql duration: 37964.56 ms ± 1.16% (std=439.20) delta: 3.78% (p=0.181)
18.sql duration: 55353.45 ms ± 0.39% (std=216.71) delta: -0.73% (p=0.800)
19.sql duration: 17942.35 ms ± 0.99% (std=177.83) delta: 0.41% (p=0.837)
20.sql duration: 13066.79 ms ± 0.28% (std=37.24) delta: -5.57% (p=0.405)
21.sql duration: 36382.42 ms ± 0.45% (std=163.61) delta: 1.10% (p=0.207)
22.sql duration: 5004.16 ms ± 0.13% (std=6.65) delta: 0.81% (p=0.774) |
Is this expected? |
No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please resolve the conflicts
/merge |
/run-all-tests |
@sduzh merge failed. |
/merge |
/run-all-tests |
@sduzh merge failed. |
/run-unit-test |
Thank you, @sduzh. |
What problem does this PR solve?
Fix #12048
What is changed and how it works?
Check List
Tests
Code changes
Side effects
Related changes
Release note