
executor, util: vectorize hash calculation during probing #12669

Merged
merged 4 commits into pingcap:master on Nov 5, 2019
Conversation

@sduzh
Contributor

sduzh commented Oct 13, 2019

What problem does this PR solve?

Fix #12048

What is changed and how it works?

name                                                                       old time/op  new time/op  delta
HashJoinExec/(rows:100000,_concurency:4,_joinKeyIdx:_[0_1],_disk:false)-8   716ms ± 1%   717ms ± 3%    ~     (p=1.000 n=16+20)
HashJoinExec/(rows:100000,_concurency:4,_joinKeyIdx:_[0],_disk:false)-8     102ms ± 2%   100ms ± 1%  -1.16%  (p=0.000 n=20+18)
HashJoinExec/(rows:100000,_concurency:4,_joinKeyIdx:_[0],_disk:true)-8      708ms ±13%   700ms ±11%    ~     (p=0.640 n=20+20)
HashJoinExec/(rows:1000,_concurency:4,_joinKeyIdx:_[0],_disk:true)-8       17.8ms ±20%  17.9ms ±15%    ~     (p=0.968 n=20+20)
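
To make the change concrete without digging through the diff, here is a minimal, self-contained sketch of the idea using a toy chunk type and hash/fnv. The type and helper names are hypothetical and this is not the TiDB codec API; it only mirrors the shape of the change: the old probe path hashed one row at a time, while the new path (via codec.HashChunkSelected) hashes an entire key column of the probe chunk up front, so the per-row probe loop only reads the finished sums.

package main

import (
	"encoding/binary"
	"fmt"
	"hash"
	"hash/fnv"
)

// chunk is a toy stand-in for a columnar batch with one int64 key column.
type chunk struct {
	keys []int64
}

// hashRow mimics the old per-row path: build and fill a hasher for a single probe row.
func hashRow(c *chunk, i int) uint64 {
	h := fnv.New64()
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(c.keys[i]))
	h.Write(buf[:])
	return h.Sum64()
}

// hashChunk mimics the vectorized path: walk the key column once and feed
// every selected row's key bytes into its pre-allocated hasher.
func hashChunk(c *chunk, hashVals []hash.Hash64, selected []bool) {
	var buf [8]byte
	for i, k := range c.keys {
		if !selected[i] {
			continue
		}
		binary.LittleEndian.PutUint64(buf[:], uint64(k))
		hashVals[i].Write(buf[:])
	}
}

func main() {
	c := &chunk{keys: []int64{1, 2, 3, 4}}
	selected := []bool{true, false, true, true}

	// Vectorized: hash the whole key column up front...
	hashVals := make([]hash.Hash64, len(c.keys))
	for i := range hashVals {
		hashVals[i] = fnv.New64()
	}
	hashChunk(c, hashVals, selected)

	// ...then the probe loop only reads the precomputed sums.
	for i := range c.keys {
		if !selected[i] {
			continue
		}
		fmt.Printf("row %d: row-wise=%d vectorized=%d\n", i, hashRow(c, i), hashVals[i].Sum64())
	}
}

With more than one join key column, the PR's probe loop calls codec.HashChunkSelected once per key column (as shown in the review snippet below), so each row's hasher accumulates all of its key bytes before Sum64() is taken.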

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change

Side effects

  • Possible performance regression
  • Increased code complexity

Related changes

Release note

  • Write release note for bug-fix or new feature.

@sre-bot added the contribution label (This PR is from a community contributor.) on Oct 13, 2019
@codecov

codecov bot commented Oct 13, 2019

Codecov Report

Merging #12669 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #12669   +/-   ##
===========================================
  Coverage   80.4487%   80.4487%           
===========================================
  Files           468        468           
  Lines        112898     112898           
===========================================
  Hits          90825      90825           
  Misses        15167      15167           
  Partials       6906       6906

@qw4990
Contributor

qw4990 left a comment


Thanks for your contribution!
I wonder why there is no performance gain.
How many rows are in the outer table in your test?
PTAL @sduzh

@winoros changed the title from "expression: vectorize hash calculation during probing (#12048)" to "executor, util: vectorize hash calculation during probing (#12048)" on Oct 15, 2019
@SunRunAway changed the title from "executor, util: vectorize hash calculation during probing (#12048)" to "executor, util: vectorize hash calculation during probing" on Oct 17, 2019
executor/join.go Outdated

hCtx.initHash(outerChk.NumRows())
for _, i := range hCtx.keyColIdx {
err = codec.HashChunkSelected(e.rowContainer.sc, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected)
Contributor

Suggested change
err = codec.HashChunkSelected(e.rowContainer.sc, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected)
err = codec.HashChunkSelected(e.ctx, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected)

Contributor Author

The implementation of HashChunkColumns relies on HashChunkSelected.
It would be inconvenient to pass a sessionctx.Context down from HashChunkColumns.

@SunRunAway
Contributor

Hi, @sduzh
Thank you for your PR, your code looks great.

Could you help do a little more profiling?

Use go test -run=XX -bench="BenchmarkHashJoinExec" -benchmem -count 5 -memprofile memprofile.out -cpuprofile profile.out to collect profile.out

Then use go tool pprof -http=":8080" profile.out to analyze it with a flame graph, or use go tool pprof profile.out and the list command in the CLI to analyze the cumulative cost of hash calculation during probing. You may want to rename getJoinKeyFromChkRow on the probe side so you can see its cumulative cost during probing in the old code.

Also, you may target just this test, HashJoinExec/(rows:100000,_concurency:4,_joinKeyIdx:_[0],_disk:false)-8, and temporarily disable the others.

util/codec/codec.go Outdated (resolved)
@sduzh
Contributor Author

sduzh commented Oct 17, 2019

Thanks for your advice, I will try it tonight.

@sduzh
Contributor Author

sduzh commented Oct 17, 2019

@SunRunAway
Results on this new branch:

➜  executor git:(vec-hash-probe) ✗ go tool pprof profile.out
Type: cpu
Time: Oct 18, 2019 at 1:27am (CST)
Duration: 23.05s, Total samples = 38.50s (167.05%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) list join2Chunk
Total: 38.50s
ROUTINE ======================== github.com/pingcap/tidb/executor.(*HashJoinExec).join2Chunk in /Users/zhuming/go/src/github.com/sduzh/tidb/executor/join.go
     130ms     10.07s (flat, cum) 26.16% of Total
         .          .    421:		return false, joinResult
         .          .    422:	}
         .          .    423:
         .          .    424:	hCtx.initHash(outerChk.NumRows())
         .          .    425:	for _, i := range hCtx.keyColIdx {
         .       20ms    426:		err = codec.HashChunkSelected(e.rowContainer.sc, hCtx.hashVals, outerChk, hCtx.allTypes[i], i, hCtx.buf, hCtx.hasNull, selected)
         .          .    427:		if err != nil {
         .          .    428:			joinResult.err = err
         .          .    429:			return false, joinResult
         .          .    430:		}
         .          .    431:	}
         .          .    432:
      20ms       20ms    433:	for i := range selected {
      50ms       50ms    434:		if !selected[i] || hCtx.hasNull[i] { // process unmatched outer rows
         .          .    435:			e.joiners[workerID].onMissMatch(false, outerChk.GetRow(i), joinResult.chk)
         .          .    436:		} else { // process matched outer rows
      30ms       50ms    437:			probeKey, probeRow := hCtx.hashVals[i].Sum64(), outerChk.GetRow(i)
      30ms      9.93s    438:			ok, joinResult = e.joinMatchedOuterRow2Chunk(workerID, probeKey, probeRow, hCtx, joinResult)
         .          .    439:			if !ok {
         .          .    440:				return false, joinResult
         .          .    441:			}
         .          .    442:		}
         .          .    443:		if joinResult.chk.IsFull() {

Results on the master branch:

➜  executor git:(master) ✗ go tool pprof profile.out.old
Type: cpu
Time: Oct 18, 2019 at 1:34am (CST)
Duration: 20.49s, Total samples = 33.69s (164.45%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) list join2Chunk
Total: 33.69s
ROUTINE ======================== github.com/pingcap/tidb/executor.(*HashJoinExec).join2Chunk in /Users/zhuming/go/src/github.com/sduzh/tidb/executor/join.go
      30ms      9.85s (flat, cum) 29.24% of Total
         .          .    412:}
         .          .    413:
         .          .    414:func (e *HashJoinExec) join2Chunk(workerID uint, outerChk *chunk.Chunk, hCtx *hashContext, joinResult *hashjoinWorkerResult,
         .          .    415:	selected []bool) (ok bool, _ *hashjoinWorkerResult) {
         .          .    416:	var err error
      10ms       20ms    417:	selected, err = expression.VectorizedFilter(e.ctx, e.outerFilter, chunk.NewIterator4Chunk(outerChk), selected)
         .          .    418:	if err != nil {
         .          .    419:		joinResult.err = err
         .          .    420:		return false, joinResult
         .          .    421:	}
         .          .    422:	for i := range selected {
      10ms       10ms    423:		if !selected[i] { // process unmatched outer rows
         .          .    424:			e.joiners[workerID].onMissMatch(false, outerChk.GetRow(i), joinResult.chk)
         .          .    425:		} else { // process matched outer rows
      10ms      9.82s    426:			ok, joinResult = e.joinMatchedOuterRow2Chunk(workerID, outerChk.GetRow(i), hCtx, joinResult)
         .          .    427:			if !ok {
         .          .    428:				return false, joinResult
         .          .    429:			}
         .          .    430:		}
         .          .    431:		if joinResult.chk.IsFull() {
(pprof) list getJoinKeyFromChkRow
Total: 33.69s
ROUTINE ======================== github.com/pingcap/tidb/executor.(*hashRowContainer).getJoinKeyFromChkRowX in /Users/zhuming/go/src/github.com/sduzh/tidb/executor/hash_table.go
      30ms      290ms (flat, cum)  0.86% of Total
         .          .    239:}
         .          .    240:
         .          .    241:// getJoinKeyFromChkRow fetches join keys from row and calculate the hash value.
         .          .    242:func (*hashRowContainer) getJoinKeyFromChkRow(sc *stmtctx.StatementContext, row chunk.Row, hCtx *hashContext) (hasNull bool, key uint64, err error) {
         .          .    243:	for _, i := range hCtx.keyColIdx {
         .       10ms    244:		if row.IsNull(i) {
         .          .    245:			return true, 0, nil
         .          .    246:		}
         .          .    247:	}
      10ms       50ms    248:	hCtx.initHash(1)
      10ms      220ms    249:	err = codec.HashChunkRow(sc, hCtx.hashVals[0], row, hCtx.allTypes, hCtx.keyColIdx, hCtx.buf)
      10ms       10ms    250:	return false, hCtx.hashVals[0].Sum64(), err
         .          .    251:}
         .          .    252:
         .          .    253:// Len returns the length of the records in hashRowContainer.
         .          .    254:func (c hashRowContainer) Len() int {
         .          .    255:	return c.hashTable.Len()
(pprof)

@sduzh
Contributor Author

sduzh commented Oct 21, 2019

@SunRunAway

It looks like the cost of hash calculation takes only a small fraction of the total cost of the hash probing phase, less than 1% in both branches.

@SunRunAway
Contributor

SunRunAway commented Oct 23, 2019

Sorry for the late reply, I was really busy last week.

And thanks for your work, the results and conclusion you posted are really helpful.
I think the problem is with the test case.
The test case has to join two tables that both have wide columns (see

{Index: 1, RetType: types.NewFieldType(mysql.TypeVarString)},

the column at index 1 is a string type whose size is 5K).
Could you try to design a new test case?
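
As an illustration of why narrow keys make the difference visible, here is a toy benchmark sketch. It is purely hypothetical and is not the TiDB HashJoinExec benchmark harness; it only isolates the two hashing strategies on 100k int64 keys. With 5 KB string payloads in the loop, any gap between them would be lost in the copying cost.

package probe

import (
	"encoding/binary"
	"hash"
	"hash/fnv"
	"testing"
)

const numRows = 100000

var keys = func() []int64 {
	k := make([]int64, numRows)
	for i := range k {
		k[i] = int64(i * 7)
	}
	return k
}()

// BenchmarkHashPerRow builds a fresh hasher for every probe row, like the
// old row-at-a-time path.
func BenchmarkHashPerRow(b *testing.B) {
	var buf [8]byte
	for n := 0; n < b.N; n++ {
		for _, k := range keys {
			h := fnv.New64()
			binary.LittleEndian.PutUint64(buf[:], uint64(k))
			h.Write(buf[:])
			_ = h.Sum64()
		}
	}
}

// BenchmarkHashPerChunk reuses one slice of hashers per chunk and walks the
// key column in a single pass, like the vectorized path.
func BenchmarkHashPerChunk(b *testing.B) {
	hashVals := make([]hash.Hash64, numRows)
	for i := range hashVals {
		hashVals[i] = fnv.New64()
	}
	var buf [8]byte
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for i, k := range keys {
			hashVals[i].Reset()
			binary.LittleEndian.PutUint64(buf[:], uint64(k))
			hashVals[i].Write(buf[:])
			_ = hashVals[i].Sum64()
		}
	}
}

Running it with go test -bench=. compares the hashing strategies in isolation; the actual numbers will of course differ from the full hash-join benchmark.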

@sre-bot
Contributor

sre-bot commented Oct 23, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 5d5497bfeb4172ac97016ea632f0c36b1a6d8457
+++ tidb: b5187fd172df965f11c33d672e6e513dbad0fe39
tikv: 57aa7ba9edeb4132ba8bcef8ec916952ad343b0f
pd: 92a4c01b346e5190f4f884c4fe01af913a343a6c
================================================================================
oltp_update_non_index:
    * QPS: 4863.92 ± 0.24% (std=7.83) delta: -0.11% (p=0.373)
    * Latency p50: 26.31 ± 0.24% (std=0.04) delta: 0.11%
    * Latency p99: 42.64 ± 5.32% (std=1.62) delta: 3.72%
            
oltp_insert:
    * QPS: 3804.40 ± 0.89% (std=27.93) delta: -0.87% (p=0.171)
    * Latency p50: 33.64 ± 0.88% (std=0.25) delta: 0.88%
    * Latency p99: 78.13 ± 1.20% (std=0.66) delta: 0.30%
            
oltp_read_write:
    * QPS: 16263.35 ± 0.32% (std=42.76) delta: 0.04% (p=0.836)
    * Latency p50: 157.78 ± 0.35% (std=0.37) delta: 0.03%
    * Latency p99: 313.07 ± 2.27% (std=4.69) delta: -2.23%
            
oltp_update_index:
    * QPS: 4369.31 ± 0.23% (std=7.19) delta: 0.09% (p=0.807)
    * Latency p50: 29.29 ± 0.22% (std=0.04) delta: -0.11%
    * Latency p99: 55.84 ± 3.64% (std=1.24) delta: 1.23%
            
oltp_point_select:
    * QPS: 37668.72 ± 0.45% (std=129.26) delta: 0.16% (p=0.612)
    * Latency p50: 3.40 ± 0.44% (std=0.01) delta: -0.15%
    * Latency p99: 10.65 ± 0.00% (std=0.00) delta: 0.00%
            

@sduzh
Contributor Author

sduzh commented Oct 23, 2019

Sorry for the late reply, I was really busy last week.

And thanks for your work, the results and conclusion you posted are really helpful.
I think the problem is with the test case.
The test case has to join two tables that both have wide columns (see

{Index: 1, RetType: types.NewFieldType(mysql.TypeVarString)},

the column at index 1 is a string type whose size is 5K).
Could you try to design a new test case?

No problem.

@sre-bot
Contributor

sre-bot commented Oct 23, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: c6d284e1de82635e2822563ee59290ce2ccc32fc
+++ tidb: b5187fd172df965f11c33d672e6e513dbad0fe39
tikv: 57aa7ba9edeb4132ba8bcef8ec916952ad343b0f
pd: 92a4c01b346e5190f4f884c4fe01af913a343a6c
================================================================================
oltp_update_non_index:
    * QPS: 4866.90 ± 0.23% (std=7.34) delta: -0.03% (p=0.729)
    * Latency p50: 26.30 ± 0.22% (std=0.04) delta: 0.04%
    * Latency p99: 41.68 ± 4.11% (std=1.12) delta: 1.41%
            
oltp_insert:
    * QPS: 4679.99 ± 0.31% (std=10.46) delta: 0.23% (p=0.649)
    * Latency p50: 27.34 ± 0.32% (std=0.06) delta: -0.25%
    * Latency p99: 51.29 ± 4.99% (std=2.11) delta: 2.68%
            
oltp_read_write:
    * QPS: 15129.68 ± 0.15% (std=15.67) delta: -0.27% (p=0.919)
    * Latency p50: 169.56 ± 0.12% (std=0.15) delta: 0.31%
    * Latency p99: 318.26 ± 1.20% (std=2.70) delta: 0.00%
            
oltp_update_index:
    * QPS: 4373.52 ± 0.20% (std=6.79) delta: -0.01% (p=0.410)
    * Latency p50: 29.26 ± 0.19% (std=0.05) delta: -0.03%
    * Latency p99: 54.84 ± 3.56% (std=1.20) delta: 0.02%
            
oltp_point_select:
    * QPS: 43412.50 ± 0.79% (std=207.43) delta: -0.16% (p=0.726)
    * Latency p50: 2.95 ± 0.76% (std=0.01) delta: 0.08%
    * Latency p99: 9.50 ± 1.19% (std=0.08) delta: -1.03%
            

@XuHuaiyu
Contributor

Friendly ping, is this PR still a work in progress? @sduzh

@XuHuaiyu removed their request for review on October 29, 2019 02:32
@sduzh
Contributor Author

sduzh commented Oct 29, 2019

Sorry for the late reply.
As @SunRunAway mentioned above, the original test case has to join the wide string columns, which results in a lot of string copies.
So I designed a new test case as suggested by @SunRunAway, replacing the string columns with double columns, and the result is much better:

➜  executor git:(vec-hash-probe) ✗ benchstat old.txt new.txt
name                                                                                                        old time/op  new time/op  delta
HashJoinExec/(rows:100000,_cols:[[bigint(20)_double],_concurency:4,_joinKeyIdx:_[0_1],_disk:false)-8         28.7ms ± 1%  23.5ms ± 1%  -18.10%  (p=0.008 n=5+5)
HashJoinExec/(rows:100000,_cols:[bigint(20)_double],_concurency:4,_joinKeyIdx:_[0],_disk:false)-8           26.2ms ± 3%  22.8ms ± 4%  -12.77%  (p=0.008 n=5+5)

@XuHuaiyu

@SunRunAway
Contributor

SunRunAway left a comment

LGTM

@SunRunAway
Contributor

/run-all-tests

@SunRunAway added the status/LGT1 label (Indicates that a PR has LGTM 1.) on Oct 30, 2019
@SunRunAway
Contributor

Sorry for the late reply.
As @SunRunAway mentioned above, the original test case has to join the wide string columns, which results in a lot of string copies.
So I designed a new test case as suggested by @SunRunAway, replacing the string columns with double columns, and the result is much better:

➜  executor git:(vec-hash-probe) ✗ benchstat old.txt new.txt
name                                                                                                        old time/op  new time/op  delta
HashJoinExec/(rows:100000,_cols:[[bigint(20)_double],_concurency:4,_joinKeyIdx:_[0_1],_disk:false)-8         28.7ms ± 1%  23.5ms ± 1%  -18.10%  (p=0.008 n=5+5)
HashJoinExec/(rows:100000,_cols:[bigint(20)_double],_concurency:4,_joinKeyIdx:_[0],_disk:false)-8           26.2ms ± 3%  22.8ms ± 4%  -12.77%  (p=0.008 n=5+5)

@XuHuaiyu

Great job! You can post the newest benchmark result in the PR description.

@SunRunAway
Contributor

/bench +tpch

@sre-bot
Contributor

sre-bot commented Oct 30, 2019

Benchmark Report

Run TPC-H 10G Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 08d26a3be1b88ccfd277b134f9f1687acab16a11
+++ tidb: f6d0b4df3ed569fdbe3606b8f428a5cac6d7a99c
tikv: a9c108185f79e2fd7dfbfcaad762b9413c99ce8a
pd: 17383d9ffdfd47baec03c6166642e7b137d19840
================================================================================
01.sql duration: 19842.79 ms ± 1.37% (std=272.12) delta: -5.22% (p=0.553)
02.sql duration: 7831.95 ms ± 1.83% (std=143.13) delta: -1.67% (p=0.623)
03.sql duration: 18403.03 ms ± 0.24% (std=43.40) delta: 0.44% (p=0.867)
04.sql duration: 8190.27 ms ± 2.16% (std=177.18) delta: 4.82% (p=0.237)
06.sql duration: 11367.20 ms ± 0.65% (std=74.13) delta: -4.30% (p=0.086)
07.sql duration: 18114.63 ms ± 2.23% (std=403.27) delta: -2.37% (p=0.467)
08.sql duration: 12945.07 ms ± 1.17% (std=151.50) delta: -3.43% (p=0.199)
09.sql duration: 31994.01 ms ± 0.49% (std=155.56) delta: -3.84% (p=0.123)
10.sql duration: 10331.48 ms ± 1.18% (std=122.01) delta: -3.25% (p=0.199)
11.sql duration: 3765.94 ms ± 3.92% (std=147.75) delta: 2.52% (p=0.644)
12.sql duration: 13707.02 ms ± 3.08% (std=422.02) delta: 0.89% (p=0.836)
13.sql duration: 9089.68 ms ± 1.88% (std=170.87) delta: 0.15% (p=0.955)
14.sql duration: 14037.29 ms ± 1.60% (std=224.35) delta: 2.24% (p=0.389)
15.sql duration: 23276.33 ms ± 0.69% (std=161.64) delta: -2.99% (p=0.142)
16.sql duration: 3486.62 ms ± 3.04% (std=106.01) delta: -7.43% (p=0.628)
17.sql duration: 37964.56 ms ± 1.16% (std=439.20) delta: 3.78% (p=0.181)
18.sql duration: 55353.45 ms ± 0.39% (std=216.71) delta: -0.73% (p=0.800)
19.sql duration: 17942.35 ms ± 0.99% (std=177.83) delta: 0.41% (p=0.837)
20.sql duration: 13066.79 ms ± 0.28% (std=37.24) delta: -5.57% (p=0.405)
21.sql duration: 36382.42 ms ± 0.45% (std=163.61) delta: 1.10% (p=0.207)
22.sql duration: 5004.16 ms ± 0.13% (std=6.65) delta: 0.81% (p=0.774)

@XuHuaiyu
Contributor

04.sql duration: 8190.27 ms ± 2.16% (std=177.18) delta: 4.82% (p=0.237)

Is this expected?

@sduzh
Contributor Author

sduzh commented Oct 31, 2019

04.sql duration: 8190.27 ms ± 2.16% (std=177.18) delta: 4.82% (p=0.237)

Is this expected?

No

@XuHuaiyu
Contributor

XuHuaiyu left a comment

LGTM, please resolve the conflicts

@XuHuaiyu added the status/LGT2 label (Indicates that a PR has LGTM 2.) on Nov 1, 2019
@XuHuaiyu removed the request for review from qw4990 on November 1, 2019 07:35
@SunRunAway
Contributor

/merge

@sre-bot added the status/can-merge label (Indicates a PR has been approved by a committer.) on Nov 5, 2019
@SunRunAway removed the status/LGT1 label (Indicates that a PR has LGTM 1.) on Nov 5, 2019
@sre-bot
Contributor

sre-bot commented Nov 5, 2019

/run-all-tests

@sre-bot
Contributor

sre-bot commented Nov 5, 2019

@sduzh merge failed.

@SunRunAway
Contributor

/merge

@sre-bot
Contributor

sre-bot commented Nov 5, 2019

/run-all-tests

@sre-bot
Contributor

sre-bot commented Nov 5, 2019

@sduzh merge failed.

@SunRunAway
Contributor

/run-unit-test

@SunRunAway merged commit b697fac into pingcap:master on Nov 5, 2019
@SunRunAway
Contributor

Thank you, @sduzh.

Labels: contribution (This PR is from a community contributor.), sig/execution (SIG execution), status/can-merge (Indicates a PR has been approved by a committer.), status/LGT2 (Indicates that a PR has LGTM 2.)

Successfully merging this pull request may close these issues: Vectorize hash calculation in hashJoin.

6 participants