Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate continuous row id for single INSERT statement #13648

Merged
merged 11 commits into from
Nov 22, 2019

Conversation

Little-Wallace
Copy link
Contributor

@Little-Wallace Little-Wallace commented Nov 21, 2019

What problem does this PR solve?

If a table is created without primary key, TiDB will assign a unique RowID to every record and store it in TiKV. This PR will make the RowID of every statement be continuous.
The continuous keys are good for TiKV to improve write performance, by skip checking write version, see TiKV #5846 for more details.

For example, we run this sql in TiDB:

CREATE TABLE t1 (
  `k1` bigint(64) DEFAULT NULL,
  `k2` bigint(64) DEFAULT NULL,
  `v1` varchar(300) DEFAULT NULL,
  `v2` int(11) NOT NULL,
   KEY `k1`(`k1`)
);
thread1: INSERT INTO t1(k1,k2,v1,v2) VALUES(1,2,'a','a'),(2,2,'a','a');
thread2: INSERT INTO t1(k1,k2,v1,v2) VALUES(2,2,'a','a'),(3,2,'a','a');

Before the PR, they may get RowID as:

thread1:  1, 3;
thread2: 2, 4;

After the PR, they will get RowID as:

thread1:  1, 2;
thread2: 3, 4;

What is changed and how it works?

I will allocate a continuous numbers every time a statement try to get row id.

Check List

Tests

  • Unit test

Code changes

  • Has interface methods change

Side effects

  • Increased code complexity

Related changes

No

Release note

  • The record in the same statement would get continuous RowID. You could use SELECT _tidb_rowid for TABLE to ensure this feature.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
@CLAassistant
Copy link

CLAassistant commented Nov 21, 2019

CLA assistant check
All committers have signed the CLA.

@codecov
Copy link

codecov bot commented Nov 21, 2019

Codecov Report

Merging #13648 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #13648   +/-   ##
===========================================
  Coverage   80.0389%   80.0389%           
===========================================
  Files           473        473           
  Lines        116507     116507           
===========================================
  Hits          93251      93251           
  Misses        15926      15926           
  Partials       7330       7330

@Little-Wallace Little-Wallace changed the title Batch row Allocate continuous row id for INSERT statement Nov 21, 2019
@zhangjinpeng87
Copy link
Contributor

Please describe the benefits for continuous row id.

return 0, err
}
}
stmtCtx.BaseRowID += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should increase BaseRowID after L468?


// AllocHandle implements table.Table AllocHandle interface.
func (t *tableCommon) AllocHandleIDs(ctx sessionctx.Context, n uint64) (int64, int64, error) {
base, rowID, err := t.Allocator(ctx).Alloc(t.tableID, n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/rowID/maxID

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Copy link
Contributor

@zhangjinpeng87 zhangjinpeng87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhangjinpeng87 zhangjinpeng87 changed the title Allocate continuous row id for INSERT statement Allocate continuous row id for single INSERT statement Nov 21, 2019
@Little-Wallace
Copy link
Contributor Author

image

This PR could reduce more than 15% CPU of TiKV server when insert many keys.

The query insert 1 is sysbench prepare.
The query insert 2 is batch_insert of benchbot. See it at (https://github.com/pingcap/productivity-tools/blob/master/benchmark/batch_insert/batch_insert.lua#L75)

rows := stmtCtx.RecordRows()
if rows > 1 {
if stmtCtx.BaseRowID >= stmtCtx.MaxRowID {
stmtCtx.BaseRowID, stmtCtx.MaxRowID, err = t.AllocHandleIDs(ctx, rows)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MaxRowID is not used actually?

@sre-bot
Copy link
Contributor

sre-bot commented Nov 21, 2019

Benchmark Report

Run Sysbench Performance Test on VMs

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: ce87b6c6713d551b3700ea9732cd0955a3819952
+++ tidb: fed318f81094296d48db98ed86d2053b2670414a
tikv: ae6038ea22271b23f2290e61e45838281ede4e20
pd: c71ec9500e49c933c198bba1f901a5b42a72de86
================================================================================
oltp_update_non_index:
    * QPS: 4723.07 ± 0.06% (std=1.99) delta: 0.20% (p=0.008)
    * Latency p50: 27.10 ± 0.06% (std=0.01) delta: -0.26%
    * Latency p99: 41.48 ± 0.90% (std=0.38) delta: -4.84%
            
oltp_insert:
    * QPS: 4775.71 ± 0.03% (std=0.93) delta: 0.21% (p=0.184)
    * Latency p50: 26.80 ± 0.02% (std=0.00) delta: -0.21%
    * Latency p99: 47.70 ± 2.23% (std=0.71) delta: -3.55%
            
oltp_read_write:
    * QPS: 15658.36 ± 0.69% (std=66.24) delta: 0.81% (p=0.039)
    * Latency p50: 163.84 ± 0.65% (std=0.66) delta: -0.78%
    * Latency p99: 303.45 ± 3.62% (std=8.64) delta: -1.34%
            
oltp_update_index:
    * QPS: 4253.34 ± 0.40% (std=12.51) delta: -0.09% (p=0.393)
    * Latency p50: 30.09 ± 0.40% (std=0.09) delta: 0.09%
    * Latency p99: 53.21 ± 2.39% (std=0.90) delta: -3.39%
            
oltp_point_select:
    * QPS: 40310.62 ± 0.22% (std=67.89) delta: 0.19% (p=0.527)
    * Latency p50: 3.17 ± 0.16% (std=0.01) delta: -0.16%
    * Latency p99: 9.73 ± 0.00% (std=0.00) delta: 1.78%
            

@Little-Wallace Little-Wallace added the status/can-merge Indicates a PR has been approved by a committer. label Nov 22, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 22, 2019

/run-all-tests

@sre-bot sre-bot merged commit 71e19a7 into pingcap:master Nov 22, 2019
cfzjywxk added a commit to cfzjywxk/tidb that referenced this pull request Dec 8, 2019
XiaTianliang pushed a commit to XiaTianliang/tidb that referenced this pull request Dec 21, 2019
}
tk.MustExec(sql)
q := "select _tidb_rowid from t1 where a=" + k
fmt.Printf("query: %v\n", q)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.......

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err != nil {
return 0, err
stmtCtx := ctx.GetSessionVars().StmtCtx
rows := stmtCtx.RecordRows()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number is wrong...
It's not the row counts in this statement, it's a batch of chunk rows and accumulative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants