
ddl: pessimistic lock global id, alloc id & insert ddl job in one txn #54547

Merged (17 commits) on Jul 16, 2024

Conversation

@D3Hunter (Contributor) commented on Jul 10, 2024

What problem does this PR solve?

Issue Number: ref #54436

Problem Summary:

What changed and how does it work?

  • Pessimistically lock the global ID key before allocating from it, to avoid write conflicts.
  • Job ID allocation and job insertion happen in the same transaction, so that DDL jobs are inserted in ID order; the scheduler can then query from a minimum job ID, which mitigates "select very slow on an empty table from delete from xx" #52905.
  • 🟥 THIS PR hurts DDL performance on its own; it requires another 2 PRs to outperform current master: one to combine table ID allocation with job ID allocation, and one to query starting from the minimum job ID. They will be filed later. A hedged sketch of the submit flow follows this list.
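
As a rough illustration of the flow described above, here is a minimal, hypothetical Go sketch (not the merged implementation): the Session and Txn interfaces, GenGlobalIDs, Execute and the globalIDKey name are stand-ins for the corresponding TiDB internals; only the ordering (lock the global ID key pessimistically, allocate the IDs, insert the job rows, then commit) reflects this PR.

// Hypothetical sketch: not TiDB's real API. Session, Txn, GenGlobalIDs and
// globalIDKey are stand-ins for the internals touched by this PR.
package ddlsketch

import "context"

type Txn interface {
	// LockKeys pessimistically locks the given keys, so concurrent submitters
	// wait here instead of failing with a write conflict at commit time.
	LockKeys(ctx context.Context, keys ...[]byte) error
}

type Session interface {
	Begin(ctx context.Context) error
	Commit(ctx context.Context) error
	Rollback()
	Txn() (Txn, error)
	// GenGlobalIDs allocates n IDs from the (already locked) global ID counter.
	GenGlobalIDs(n int) ([]int64, error)
	// Execute runs an internal SQL statement, e.g. the insert into the job table.
	Execute(ctx context.Context, sql string) error
}

// globalIDKey is an assumed name for the meta key backing the global ID counter.
var globalIDKey = []byte("NextGlobalID")

// submitJobs shows the ordering introduced by this PR: one transaction that
// locks the global ID key, allocates the job IDs and inserts the job rows,
// so that jobs land in the job table in ID order.
func submitJobs(ctx context.Context, se Session, count int, jobInsertSQL func(ids []int64) string) (err error) {
	if err = se.Begin(ctx); err != nil {
		return err
	}
	defer func() {
		if err != nil {
			se.Rollback()
		}
	}()
	txn, err := se.Txn()
	if err != nil {
		return err
	}
	if err = txn.LockKeys(ctx, globalIDKey); err != nil {
		return err
	}
	ids, err := se.GenGlobalIDs(count)
	if err != nil {
		return err
	}
	if err = se.Execute(ctx, jobInsertSQL(ids)); err != nil {
		return err
	}
	return se.Commit(ctx)
}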

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

When len(job_meta) < 8K, running TestGenIDAndInsertJobsWithRetryQPS with 1 to 1000 goroutines on a 1 PD / 3 TiKV environment gives a QPS of roughly 300 to 400, which is much higher than general DDL execution QPS.

  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

ti-chi-bot added the release-note-none (denotes a PR that doesn't merit a release note) and size/XL (denotes a PR that changes 500-999 lines, ignoring generated files) labels on Jul 10, 2024
tiprow bot commented on Jul 10, 2024

Hi @D3Hunter. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented on Jul 10, 2024

Codecov Report

Attention: Patch coverage is 79.66102% with 24 lines in your changes missing coverage. Please review.

Project coverage is 55.9584%. Comparing base (0b9cd2f) to head (96ebfe6).
Report is 22 commits behind head on master.

Additional details and impacted files
@@                Coverage Diff                @@
##             master     #54547         +/-   ##
=================================================
- Coverage   74.7922%   55.9584%   -18.8339%     
=================================================
  Files          1549       1673        +124     
  Lines        362139     613652     +251513     
=================================================
+ Hits         270852     343390      +72538     
- Misses        71676     246719     +175043     
- Partials      19611      23543       +3932     
Flag Coverage Δ
integration 37.1709% <54.2372%> (?)
unit 71.7187% <79.6610%> (-2.0006%) ⬇️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
dumpling 52.9656% <ø> (-2.2339%) ⬇️
parser ∅ <ø> (∅)
br 52.5976% <ø> (+4.9234%) ⬆️

@D3Hunter (Contributor, Author)

/hold

ti-chi-bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on Jul 10, 2024
D3Hunter changed the title from "ddl: alloc id & insert ddl job in one txn" to "[WIP]ddl: alloc id & insert ddl job in one txn" on Jul 10, 2024
ti-chi-bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Jul 10, 2024
@D3Hunter (Contributor, Author)

/unhold

ti-chi-bot removed the do-not-merge/hold label on Jul 10, 2024
D3Hunter changed the title from "[WIP]ddl: alloc id & insert ddl job in one txn" to "ddl: alloc id & insert ddl job in one txn" on Jul 12, 2024
ti-chi-bot removed the do-not-merge/work-in-progress label on Jul 12, 2024
D3Hunter changed the title from "ddl: alloc id & insert ddl job in one txn" to "ddl: pessimistic lock global id, alloc id & insert ddl job in one txn" on Jul 12, 2024
D3Hunter mentioned this pull request on Jul 14, 2024 (54 tasks)
@D3Hunter (Contributor, Author)

/hold

wait for another review from the txn team

ti-chi-bot added the do-not-merge/hold label on Jul 15, 2024
@GMHDBJD (Contributor) left a comment:

LGTM

ti-chi-bot added the lgtm label and removed the needs-1-more-lgtm label (indicates a PR needs 1 more LGTM) on Jul 15, 2024
ti-chi-bot commented on Jul 15, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-07-12 09:42:14.273163825 +0000 UTC m=+1356.264105308: ☑️ agreed by lcwangchao.
  • 2024-07-15 06:29:52.167155162 +0000 UTC m=+249014.158096633: ☑️ agreed by GMHDBJD.

// table with retry. job id allocation and job insertion are in the same transaction,
// as we want to make sure DDL jobs are inserted in id order, then we can query from
// a min job ID when scheduling DDL jobs to mitigate https://github.com/pingcap/tidb/issues/52905.
// so this function has side effect, it will set the job id of 'tasks'.
Contributor:

Suggested change
// so this function has side effect, it will set the job id of 'tasks'.
// so this function has side effect, it will set the job id of 'jobs'.

// level in SQL executor, see doLockKeys.
// TODO maybe we can unify the lock mechanism with SQL executor in the future, or
// implement it inside TiKV client-go.
func lockGlobalIDKey(ctx context.Context, ddlSe *sess.Session, txn kv.Transaction) (uint64, error) {
Contributor:

not a big problem. We can get txn from ddlSe

Contributor (Author):

will keep it, don't want to get it again, this method is internal anyway.

ver kv.Version
err error
)
waitTime := ddlSe.GetSessionVars().LockWaitTimeout
Contributor:

This session (ddlSe) is taken from the session pool, not the user's session. Maybe get the lock wait value from the user's session.

Contributor (Author):

This is part of internal DDL execution; the internal setting, not the user's setting, should be used.

@lance6716 (Contributor) commented on Jul 15, 2024:

Not sure if it's more reasonable that the setting of DDL's SQL connection can affect the timeout of DDL.

@D3Hunter (Contributor, Author) commented on Jul 15, 2024:

In MySQL it seems to only affect transactions for DML; MySQL doesn't have a ddl_job table like us, so it shouldn't affect DDL, and a simple memory lock should be enough.
https://dev.mysql.com/doc/refman/8.4/en/innodb-parameters.html#sysvar_innodb_lock_wait_timeout
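
For readers following this thread, here is a hedged sketch of what a lockGlobalIDKey-style loop could look like conceptually. The PessimisticTxn interface, the errWriteConflict sentinel and CurrentVersion are assumptions for illustration only; the general idea (lock the key with a forUpdateTS and a lock wait timeout, and take a newer timestamp if the attempt hits a write conflict) is inferred from the comments above, and the merged code may differ.

// Hypothetical sketch, not the merged lockGlobalIDKey: PessimisticTxn,
// errWriteConflict and CurrentVersion are stand-ins for TiDB/client-go types.
package ddlsketch

import (
	"context"
	"errors"
	"time"
)

// errWriteConflict stands in for kv.ErrWriteConflict.
var errWriteConflict = errors.New("write conflict")

// LockCtx carries the forUpdateTS and how long to wait for the lock,
// mirroring the LockWaitTimeout discussed above.
type LockCtx struct {
	ForUpdateTS uint64
	WaitTimeout time.Duration
}

type PessimisticTxn interface {
	StartTS() uint64
	// LockKeys blocks up to lockCtx.WaitTimeout waiting to lock key.
	LockKeys(ctx context.Context, lockCtx LockCtx, key []byte) error
	// CurrentVersion returns a fresh timestamp to use as the new forUpdateTS.
	CurrentVersion(ctx context.Context) (uint64, error)
}

// lockGlobalIDKeySketch keeps trying to pessimistically lock the global ID
// key, refreshing forUpdateTS whenever the lock attempt hits a write conflict.
func lockGlobalIDKeySketch(ctx context.Context, txn PessimisticTxn, key []byte, waitTimeout time.Duration) (uint64, error) {
	forUpdateTS := txn.StartTS()
	for {
		err := txn.LockKeys(ctx, LockCtx{ForUpdateTS: forUpdateTS, WaitTimeout: waitTimeout}, key)
		if err == nil {
			return forUpdateTS, nil
		}
		if !errors.Is(err, errWriteConflict) {
			return 0, err
		}
		// Someone committed a newer version of the key; take a newer
		// forUpdateTS and try to lock again.
		ts, err := txn.CurrentVersion(ctx)
		if err != nil {
			return 0, err
		}
		forUpdateTS = ts
	}
}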

@lance6716 (Contributor) left a comment:

will review soon

return ddlSe.Commit(ctx)
}()

if resErr != nil && kv.IsTxnRetryableError(resErr) {
@lance6716 (Contributor) commented on Jul 15, 2024:

Not sure what ErrLockExpire means. I just want to cover the case where the pessimistic lock is cleaned by other readers of GlobalIDKey, so that it can be retried.

Contributor (Author):

See https://github.com/tikv/client-go/blob/d73cc1ed6503925dfc7226e8d5677ceb4c2fd6f1/txnkv/transaction/2pc.go#L1227-L1230; it's handled in TiDB as below, seemingly with no special handling:

tidb/pkg/session/session.go

Lines 1029 to 1040 in 06e0e17

func (s *session) checkTxnAborted(stmt sqlexec.Statement) error {
	if atomic.LoadUint32(&s.GetSessionVars().TxnCtx.LockExpire) == 0 {
		return nil
	}
	// If the transaction is aborted, the following statements do not need to execute, except `commit` and `rollback`,
	// because they are used to finish the aborted transaction.
	if ok, err := isEndTxnStmt(stmt.(*executor.ExecStmt).StmtNode, s.sessionVars); err == nil && ok {
		return nil
	} else if err != nil {
		return err
	}
	return kv.ErrLockExpire

I guess the reason is: T1 locks it so no one can change it before its forUpdateTS. After the lock expires and T1 hasn't committed, if T2 locks it later, then T1's commit TS > T2's forUpdateTS > T1's forUpdateTS, and one of them will report a write conflict on commit.

Contributor:

I found this type of error: ERROR 1105 (HY000): tikv aborts txn: Error(Txn(Error(Mvcc(Error(PessimisticLockNotFound, which likely stands for the case where the pessimistic lock is cleaned by other transactions. But I don't know whether it's handled by kv.IsTxnRetryableError. Transactions are very complex 😂

Contributor (Author):

But I don't know if it's handled by kv.IsTxnRetryableError

tidb/pkg/kv/error.go

Lines 83 to 85 in 9044acb

	if ErrTxnRetryable.Equal(err) || ErrWriteConflict.Equal(err) || ErrWriteConflictInTiDB.Equal(err) {
		return true
	}

So only ErrTxnRetryable is covered, apart from the write conflict errors.

Transaction is very complex 😂

Yes, that's why I asked @lcwangchao and @cfzjywxk to review this part.

Contributor (Author):

The default MaxTxnTTL is 60 * 60 * 1000 (1 hour); it seems OK to let the user retry it themselves in this case, since even DML transactions will abort.

@lance6716 (Contributor) commented on Jul 15, 2024:

I see ManagedLockTTL in client-go is 20s. Does that mean that if a pessimistic lock is left in TiKV when the node crashes, other nodes must wait up to 20s to clean up the lock? If the lock owner crashes, submitting DDL will be paused for 20s, which is not friendly.

@D3Hunter (Contributor, Author) commented on Jul 15, 2024:

It seems we cannot avoid this on crash for pessimistic transactions now:

  • create table t(id int primary key, v int); insert into t values(1,1);
  • run this
mysql -uroot -h 127.0.0.1 -P4000 test -e "select now(); begin; update t set v=2 where id=1; select sleep(100);";
+---------------------+
| now()               |
+---------------------+
| 2024-07-15 22:26:12 |
+---------------------+
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
  • kill -9 tidb, immediately after previous step
  • restart
  • run immediately after restart
mysql -uroot -h 127.0.0.1 -P4000 test -e "select now(); begin; update t set v=3 where id=1; commit; select now();"
+---------------------+
| now()               |
+---------------------+
| 2024-07-15 22:26:16 |
+---------------------+
+---------------------+
| now()               |
+---------------------+
| 2024-07-15 22:26:32 |
+---------------------+

ddlSe := sess.NewSession(se)
localMode := tasks[0].job.LocalMode
if localMode {
if err = fillJobIDs(ctx, ddlSe, tasks); err != nil {
Contributor:

Local mode jobs are not written to the ddl_job table, so no need to lock the global ID? I'm afraid the performance of local jobs is more important.

Contributor (Author):

no need to lock the global ID?

The memory lock is used to reduce write conflicts; after the later PR to "combine table id allocation with job id", there will be no write conflicts within one node.

With the 2 PRs mentioned in this PR, we can create 100k tables in about 13 minutes with very little speed degradation. I will test 1M tables later; if everything goes OK, I still suggest deprecating fast-create in a later version.

@lance6716 (Contributor) left a comment:

rest lgtm

pkg/ddl/ddl_worker.go
// generate ID and call function runs in the same transaction.
func genIDAndCallWithRetry(ctx context.Context, ddlSe *sess.Session, count int, fn func(ids []int64) error) error {
var resErr error
for i := uint(0); i < kv.MaxRetryCnt; i++ {
Contributor:

Not that much transaction retry is needed, as the conflict error is already handled by the in-transaction statements or operations.

Contributor (Author):

It's reusing the retry count of kv.RunInNewTxn, in case of other retryable errors.

Contributor:

kv.RunInNewTxn executes in optimistic mode, so the write conflict error needs to be handled when committing. The transaction commit should succeed in most cases when a pessimistic lock is used. Not a big problem to retry more times.
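
To make the retry discussion above concrete, here is a hedged sketch of the genIDAndCallWithRetry skeleton, reusing the Session, globalIDKey and errWriteConflict stand-ins from the earlier sketches; maxRetryCnt and isTxnRetryableError are simplified stand-ins for kv.MaxRetryCnt and kv.IsTxnRetryableError, and the body is illustrative rather than the merged code.

// Hypothetical sketch of the retry wrapper; stand-ins, not the merged code.
package ddlsketch

import (
	"context"
	"errors"
)

const maxRetryCnt = 100 // stand-in for kv.MaxRetryCnt

// isTxnRetryableError stands in for kv.IsTxnRetryableError, which only treats
// ErrTxnRetryable and the write-conflict errors as retryable.
func isTxnRetryableError(err error) bool {
	return errors.Is(err, errWriteConflict)
}

// genIDAndCallWithRetrySketch retries the whole begin/lock/alloc/call/commit
// closure, but only for retryable transaction errors.
func genIDAndCallWithRetrySketch(ctx context.Context, se Session, count int, fn func(ids []int64) error) error {
	var resErr error
	for i := 0; i < maxRetryCnt; i++ {
		resErr = func() (err error) {
			if err = se.Begin(ctx); err != nil {
				return err
			}
			// The defer runs when this anonymous function returns, i.e. once
			// per attempt, rolling back anything left from a failed attempt.
			defer func() {
				if err != nil {
					se.Rollback()
				}
			}()
			txn, err := se.Txn()
			if err != nil {
				return err
			}
			if err = txn.LockKeys(ctx, globalIDKey); err != nil {
				return err
			}
			ids, err := se.GenGlobalIDs(count)
			if err != nil {
				return err
			}
			if err = fn(ids); err != nil {
				return err
			}
			return se.Commit(ctx)
		}()
		if resErr == nil || !isTxnRetryableError(resErr) {
			break
		}
	}
	return resErr
}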

return errors.Trace(err)
}
defer func() {
if err != nil {
Contributor:

Maybe it's better to extract L549 to L576 into a standalone function; using a defer statement inside a for loop increases the risk of mistakes.

Contributor (Author):

It's inside a lambda function already; the defer will run after it returns.
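
As a small, self-contained illustration of that point (plain Go, unrelated to TiDB internals): a defer declared inside an anonymous function fires when that function returns, which here is at the end of each loop iteration rather than at the end of the enclosing function.

package main

import "fmt"

func main() {
	for i := 0; i < 3; i++ {
		func() {
			// This defer runs when the anonymous function returns,
			// i.e. at the end of each iteration.
			defer fmt.Println("cleanup after attempt", i)
			fmt.Println("attempt", i)
		}()
	}
	// Output:
	// attempt 0
	// cleanup after attempt 0
	// attempt 1
	// cleanup after attempt 1
	// attempt 2
	// cleanup after attempt 2
}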

@cfzjywxk (Contributor) left a comment:

LGTM

@D3Hunter (Contributor, Author)

/unhold

ti-chi-bot removed the do-not-merge/hold label on Jul 16, 2024
var resErr error
for i := uint(0); i < kv.MaxRetryCnt; i++ {
resErr = func() (err error) {
if err := ddlSe.Begin(ctx); err != nil {
Contributor:

If we Begin() here, previous statements executed by ddlSe will be committed implicitly. For example, the flashback jobs check may be outdated.

Contributor (Author):

If we Begin() here, previous statements executed by ddlSe will be committed implicitly.

What do you mean? We don't have a transaction before this one.

flashback jobs checking maybe outdated

It's the same behavior as our previous implementation. If such a job is submitted, the job scheduler will stop it from running if there is a flashback job before it.

Contributor (Author):

I once thought about removing this check, as the job scheduler will calculate dependencies between jobs, but that would be a little different from the current behavior.

Contributor:

it's the same behavior as our previous impl.

OK.

job scheduler will stop it from running

I guess the purpose of checking flashback job before submitting is to prevent having potential wrong info in job.Args.

Contributor (Author):

Previously we could not make sure there was no flashback job before this job, since they might be submitted on different nodes.

Now we could do it by checking inside the submit transaction, but I don't want to change this part now, and it might hurt performance.

ti-chi-bot commented on Jul 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfzjywxk, GMHDBJD, lcwangchao, tangenta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot added the approved label on Jul 16, 2024
@D3Hunter (Contributor, Author)

/retest

tiprow bot commented on Jul 16, 2024

@D3Hunter: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter (Contributor, Author)

/retest

tiprow bot commented on Jul 16, 2024

@D3Hunter: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@D3Hunter (Contributor, Author)

/retest

tiprow bot commented on Jul 16, 2024

@D3Hunter: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot merged commit de54620 into pingcap:master on Jul 16, 2024
21 checks passed
D3Hunter deleted the ensure-job-id-order branch on July 16, 2024 at 12:21