ddl: pessimistic lock global id, alloc id & insert ddl job in one txn #54547
Conversation
Codecov Report — Attention: Patch coverage is

Additional details and impacted files

@@              Coverage Diff               @@
##             master     #54547        +/- ##
=================================================
- Coverage   74.7922%   55.9584%   -18.8339%
=================================================
  Files          1549       1673        +124
  Lines        362139     613652     +251513
=================================================
+ Hits         270852     343390      +72538
- Misses        71676     246719     +175043
- Partials      19611      23543       +3932

Flags with carried forward coverage won't be shown.
/hold
/unhold
/hold wait for another review from the txn team
LGTM
pkg/ddl/ddl_worker.go
Outdated
// table with retry. job id allocation and job insertion are in the same transaction,
// as we want to make sure DDL jobs are inserted in id order, then we can query from
// a min job ID when scheduling DDL jobs to mitigate https://github.com/pingcap/tidb/issues/52905.
// so this function has side effect, it will set the job id of 'tasks'.
Suggested change:
- // so this function has side effect, it will set the job id of 'tasks'.
+ // so this function has side effect, it will set the job id of 'jobs'.
// level in SQL executor, see doLockKeys.
// TODO maybe we can unify the lock mechanism with SQL executor in the future, or
// implement it inside TiKV client-go.
func lockGlobalIDKey(ctx context.Context, ddlSe *sess.Session, txn kv.Transaction) (uint64, error) {
not a big problem. We can get `txn` from `ddlSe`.
will keep it, I don't want to get it again; this method is internal anyway.
	ver kv.Version
	err error
)
waitTime := ddlSe.GetSessionVars().LockWaitTimeout
this session (`ddlSe`) is taken from the session pool, not the user's session. Maybe get the lock wait value from the user's session.
this is part of internal DDL execution; the internal setting, not the user setting, should be used
Not sure if it's more reasonable that the setting of the DDL's SQL connection can affect the timeout of the DDL.
it seems to only affect DML transactions in MySQL; MySQL doesn't have a ddl_job table like us, so it shouldn't affect DDL, and a simple memory lock should be enough
https://dev.mysql.com/doc/refman/8.4/en/innodb-parameters.html#sysvar_innodb_lock_wait_timeout
will review soon
	return ddlSe.Commit(ctx)
}()

if resErr != nil && kv.IsTxnRetryableError(resErr) {
not sure what's the meaning of `ErrLockExpire`. I just want to cover the case where the pessimistic lock is cleaned by other readers of GlobalIDKey, which should be retryable.
see https://github.com/tikv/client-go/blob/d73cc1ed6503925dfc7226e8d5677ceb4c2fd6f1/txnkv/transaction/2pc.go#L1227-L1230; it's handled in tidb as below, seems there's no special handling
Lines 1029 to 1040 in 06e0e17
func (s *session) checkTxnAborted(stmt sqlexec.Statement) error {
	if atomic.LoadUint32(&s.GetSessionVars().TxnCtx.LockExpire) == 0 {
		return nil
	}
	// If the transaction is aborted, the following statements do not need to execute, except `commit` and `rollback`,
	// because they are used to finish the aborted transaction.
	if ok, err := isEndTxnStmt(stmt.(*executor.ExecStmt).StmtNode, s.sessionVars); err == nil && ok {
		return nil
	} else if err != nil {
		return err
	}
	return kv.ErrLockExpire
I guess the reason is: T1 locks it so no one can change it before T1's forUpdateTS. After the lock expires and T1 hasn't committed, if a later T2 locks it, then T1's commitTS > T2's forUpdateTS > T1's forUpdateTS, so one of them will report a write conflict on commit.
I found this type of error: `ERROR 1105 (HY000): tikv aborts txn: Error(Txn(Error(Mvcc(Error(PessimisticLockNotFound`, which likely stands for the case where the pessimistic lock is cleaned by other transactions. But I don't know if it's handled by `kv.IsTxnRetryableError`. Transactions are very complex 😂
But I don't know if it's handled by kv.IsTxnRetryableError
Lines 83 to 85 in 9044acb
if ErrTxnRetryable.Equal(err) || ErrWriteConflict.Equal(err) || ErrWriteConflictInTiDB.Equal(err) {
	return true
}
only ErrTxnRetryable, besides the write-conflict errors
Transaction is very complex 😂
yes, that's why I asked @lcwangchao @cfzjywxk to review this part
the default is `MaxTxnTTL: 60 * 60 * 1000, // 1hour`; it seems OK to let the user retry it themselves in this case, since even a DML transaction would abort
I see `ManagedLockTTL` in client-go is 20s. Does it mean that if a pessimistic lock is left in TiKV when the node crashes, other nodes must wait at most 20s to clean up the lock? If the lock owner crashes, submitting DDL will be paused for up to 20s, which is not friendly.
seems we cannot avoid this on crash for pessimistic txn now
- create table t(id int primary key, v int); insert into t values(1,1);
- run this
mysql -uroot -h 127.0.0.1 -P4000 test -e "select now(); begin; update t set v=2 where id=1; select sleep(100);";
+---------------------+
| now() |
+---------------------+
| 2024-07-15 22:26:12 |
+---------------------+
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
- kill -9 tidb, immediately after the previous step
- restart
- run immediately after restart
mysql -uroot -h 127.0.0.1 -P4000 test -e "select now(); begin; update t set v=3 where id=1; commit; select now();"
+---------------------+
| now() |
+---------------------+
| 2024-07-15 22:26:16 |
+---------------------+
+---------------------+
| now() |
+---------------------+
| 2024-07-15 22:26:32 |
+---------------------+
ddlSe := sess.NewSession(se)
localMode := tasks[0].job.LocalMode
if localMode {
	if err = fillJobIDs(ctx, ddlSe, tasks); err != nil {
local mode jobs are not written to the ddl_job table, so is there no need to lock the global ID? I'm afraid the performance of local jobs is more important.
no need to lock the global ID?
the memory lock is used to reduce write conflicts; after a later PR to "combine table id allocation with job id", there will be no write conflict within one node.
with the 2 PRs mentioned in this PR, we can create 100k tables in about 13 minutes, with very little speed degradation. I will test 1M tables later; if everything goes OK, I still suggest deprecating fast-create in a later version
rest lgtm
// generate ID and call function runs in the same transaction.
func genIDAndCallWithRetry(ctx context.Context, ddlSe *sess.Session, count int, fn func(ids []int64) error) error {
	var resErr error
	for i := uint(0); i < kv.MaxRetryCnt; i++ {
That many transaction retries shouldn't be needed, as conflict errors are already handled by the in-transaction statements or operations.
it's reusing the retry count of `kv.RunInNewTxn`, in case of other retryable errors
`kv.RunInNewTxn` is executed in optimistic mode, so the write conflict error needs to be handled when doing commit. The transaction commit should succeed in most cases when the pessimistic lock is used. Not a big problem to try more times.
	return errors.Trace(err)
}
defer func() {
	if err != nil {
Maybe it's better to abstract L549 to L576 into a standalone function; using a defer statement inside a for loop increases the risk of mistakes.
it's inside a lambda function already; the defer will run after it returns, not at the end of the enclosing function.
LGTM
/unhold
var resErr error
for i := uint(0); i < kv.MaxRetryCnt; i++ {
	resErr = func() (err error) {
		if err := ddlSe.Begin(ctx); err != nil {
If we `Begin()` here, previous statements executed by `ddlSe` will be committed implicitly. For example, the flashback jobs check may be outdated.
If we Begin() here, previous statements executed by ddlSe will be committed implicitly.
what do you mean? we don't have a transaction before this one
flashback jobs checking maybe outdated
it's the same behavior as our previous impl. If such a job is submitted, the job scheduler will stop it from running if there is a flashback job before it
I once thought of removing this check, as the job scheduler will calculate dependencies between jobs, but that's a little different from the current behavior
it's the same behavior as our previous impl.
OK.
job scheduler will stop it from running
I guess the purpose of checking for a flashback job before submitting is to prevent having potentially wrong info in `job.Args`.
previously we could not make sure there was no flashback job before this job, since they might be submitted on different nodes
now we can do it by checking inside the submit transaction, but I don't want to change this part now, and it might hurt performance
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cfzjywxk, GMHDBJD, lcwangchao, tangenta. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest
@D3Hunter: Cannot trigger testing until a trusted user reviews the PR and leaves an
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
What problem does this PR solve?
Issue Number: ref #54436
Problem Summary:
What changed and how does it work?
select very slow on an empty table from delete from xx, see #52905.
Check List
Tests
when len(job_meta) < 8k, with [1, 1000] goroutines running TestGenIDAndInsertJobsWithRetryQPS on a 1 PD / 3 TiKV environment, QPS is about [300, 400], much larger than the general DDL execution QPS
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.