transaction: add hook for async commit to track the life cycle of the async-commit goroutine and secondary lock cleanup goroutine #1432
What about non-async-commit 2PC? The goroutines that commit the secondaries may also need to be tracked in TiDB.
TiDB has a mechanism to wait for all connections to finish the currently executing statements/transactions (ref (Or did I misunderstand something about the transactions?)
I took a look at In common 2PC transactions, commit success is returned to the user once the primary key is committed, and the secondaries are committed in the background (note that the backoffer is created from the store's context; it is not tied to the user connection's context). client-go/txnkv/transaction/2pc.go Lines 1012 to 1022 in 41d133b
If the store is killed before the secondaries are committed, the prewrite locks of the secondaries are left behind and need to be cleaned up by other read transactions or by GC.
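The flow described above can be sketched roughly as follows. This is a simplified illustration and not the client-go code: `fakeStore`, `commit`, and `simulate` are hypothetical names, and "committing" a key is just a map write. It only shows that the caller returns after the primary commit while the secondaries depend on the store's context.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// fakeStore is a hypothetical stand-in for the KV store. Background work is
// tied to its context, not to the user connection's context.
type fakeStore struct {
	ctx context.Context
	wg  sync.WaitGroup
}

// commit marks the primary key committed synchronously, then commits the
// secondaries in a background goroutine and returns without waiting for it.
func commit(s *fakeStore, primary string, secondaries []string, committed *sync.Map) {
	committed.Store(primary, true) // the user sees success from this point on

	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		for _, key := range secondaries {
			select {
			case <-s.ctx.Done():
				return // store closed: the remaining prewrite locks are left behind
			default:
				committed.Store(key, true)
			}
		}
	}()
}

// simulate returns how many keys end up committed, with or without the store
// being closed before the background goroutine runs.
func simulate(storeClosed bool) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if storeClosed {
		cancel()
	}
	s := &fakeStore{ctx: ctx}
	var committed sync.Map
	commit(s, "pk", []string{"s1", "s2"}, &committed)
	s.wg.Wait()

	n := 0
	committed.Range(func(_, _ interface{}) bool { n++; return true })
	return n
}

func main() {
	fmt.Println(simulate(false)) // primary and both secondaries are committed
	fmt.Println(simulate(true))  // only the primary is committed
}
```

In the second call the secondaries are never committed, which corresponds to the leaked locks that later readers or GC would have to resolve.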
@you06 Oh! I got it. They'll also need to be tracked. I'll try to update this PR.
/cc @you06 PTAL
TiDB side PR: pingcap/tidb#55608
/retest
One more thing: the max backoff duration is 40s, so I think the default graceful shutdown wait time should be larger than that.
client-go/txnkv/transaction/2pc.go, line 91 in 41d133b:
CommitMaxBackoff = uint64(40000)
@you06 Currently, TiDB waits for at most 15s; the value is hard-coded and not configurable. It's a trade-off between rolling-restart speed and the regression caused by leaked locks. Some people would rather restart faster and can accept a small regression, while others don't mind a longer rolling restart as long as it doesn't affect any metrics. At least, it's still not a big issue if some locks leak when these keys take too long to commit 🤔.
Agree. My point is that the default configuration is easier to understand (and makes more sense) if the max graceful wait time is larger than the commit backoff time. But it's not a big issue, because if a commit takes more than 15s, there must be a more serious problem than the unresolved locks.
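To make the trade-off concrete, here is a minimal sketch of a bounded graceful wait. It is not TiDB's shutdown code: `waitWithTimeout` and `shutdown` are hypothetical helpers, and the durations in `main` are the 15s and 40s from the discussion scaled down 100x so the example runs quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitWithTimeout waits for wg for at most grace and reports whether every
// tracked goroutine finished in time.
func waitWithTimeout(wg *sync.WaitGroup, grace time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(grace):
		return false // give up: whatever is still running may leak locks
	}
}

// shutdown starts one background "commit" that takes work to finish and then
// waits for it for at most grace.
func shutdown(work, grace time.Duration) bool {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		time.Sleep(work) // stands in for committing secondaries with backoff
	}()
	return waitWithTimeout(&wg, grace)
}

func main() {
	grace := 150 * time.Millisecond // 15s scaled down 100x
	fmt.Println(shutdown(1*time.Millisecond, grace))   // a normal commit finishes in time
	fmt.Println(shutdown(400*time.Millisecond, grace)) // a commit using the full backoff does not
}
```

With a grace period shorter than the maximum backoff, the second case is always possible, which is the point raised above; whether it matters in practice is the judgment call being discussed.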
This implementation LGTM.
@cfzjywxk what's your opinion of the newly added hooks?
/cc @cfzjywxk
txnkv/transaction/2pc.go
err = c.store.Go(func() {
if c.txn.secondaryLockCleanupLifecycleHooks.Post != nil {
To avoid adding hooks to every goroutine, another approach is to hook at the beginning of the execution function in 2pc.go and in its defer block, using c.store.Waitgroup().wait() to ensure the background goroutines have finished in the defer hook.
In the current implementation, the cleanup goroutine needs to be hooked too.
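One possible reading of this suggestion is sketched below. The types `lifecycleHooks` and `fakeStore` and the functions `execute` and `demoExecute` are hypothetical and only mimic the shape of the idea: the hooks run once around the whole execution, and the deferred block waits on the store's wait group before calling the post hook.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lifecycleHooks is a hypothetical pair of callbacks.
type lifecycleHooks struct {
	Pre  func()
	Post func()
}

// fakeStore is a hypothetical store that tracks background goroutines.
type fakeStore struct{ wg sync.WaitGroup }

// Go runs f in a goroutine tracked by the store's wait group.
func (s *fakeStore) Go(f func()) {
	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		f()
	}()
}

// execute calls Pre once at the start and Post once in the deferred block,
// after every background goroutine tracked by the store has finished.
func execute(s *fakeStore, hooks lifecycleHooks, background []func()) {
	if hooks.Pre != nil {
		hooks.Pre()
	}
	defer func() {
		s.wg.Wait() // note: this also waits for goroutines of other transactions
		if hooks.Post != nil {
			hooks.Post()
		}
	}()
	for _, f := range background {
		s.Go(f)
	}
}

// demoExecute runs n background tasks and reports how often each hook ran and
// how many tasks had finished by the time Post was called.
func demoExecute(n int) (pre, post, doneAtPost int64) {
	var finished int64
	hooks := lifecycleHooks{
		Pre:  func() { pre++ },
		Post: func() { post++; doneAtPost = atomic.LoadInt64(&finished) },
	}
	tasks := make([]func(), n)
	for i := range tasks {
		tasks[i] = func() { atomic.AddInt64(&finished, 1) }
	}
	execute(&fakeStore{}, hooks, tasks)
	return pre, post, doneAtPost
}

func main() {
	fmt.Println(demoExecute(3))
}
```

In this sketch `execute` does not return until the background work is done, so the caller would no longer get an early response; a real implementation would have to decide where the waiting happens.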
Done. I've added a new spawn method for the txn as a helper function to track all goroutines. It maintains the lifecycle hooks and the inc/dec of txn.store.WaitGroup().
I also provide a spawnWithStorePool to call txn.store.Go internally. I didn't modify the existing logic about whether to use the store pool.
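A helper of this kind might look roughly like the sketch below. It is an illustration under assumptions and not the code in this PR: `fakeTxn`, `lifecycleHooks`, and `demoSpawn` are hypothetical, and the pool-based variant is left out.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lifecycleHooks is a hypothetical pair of callbacks around a goroutine.
type lifecycleHooks struct {
	Pre  func()
	Post func()
}

// fakeTxn is a hypothetical transaction that shares the store's wait group.
type fakeTxn struct {
	storeWG *sync.WaitGroup
	hooks   lifecycleHooks
}

// spawn starts f in a new goroutine. Pre runs before the goroutine starts;
// Post runs when f returns, before the wait group is decremented.
func (t *fakeTxn) spawn(f func()) {
	if t.hooks.Pre != nil {
		t.hooks.Pre()
	}
	t.storeWG.Add(1)
	go func() {
		defer t.storeWG.Done() // deferred first, so it runs last
		defer func() {
			if t.hooks.Post != nil {
				t.hooks.Post()
			}
		}()
		f()
	}()
}

// demoSpawn spawns n no-op goroutines and reports how many times each hook ran.
func demoSpawn(n int) (started, finished int64) {
	var wg sync.WaitGroup
	t := &fakeTxn{storeWG: &wg, hooks: lifecycleHooks{
		Pre:  func() { atomic.AddInt64(&started, 1) },
		Post: func() { atomic.AddInt64(&finished, 1) },
	}}
	for i := 0; i < n; i++ {
		t.spawn(func() {})
	}
	wg.Wait()
	return atomic.LoadInt64(&started), atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println(demoSpawn(4))
}
```

Routing every background goroutine through one helper keeps the hook calls and the wait-group bookkeeping in a single place, so a newly added goroutine cannot forget them.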
(It's called spawn rather than go because I want it to be a private function, but go is a keyword, so I cannot use it as a function name.)
Signed-off-by: Yang Keao <[email protected]>
/hold Let me do some refactoring 😃
/unhold
Signed-off-by: Yang Keao <[email protected]>
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cfzjywxk, you06
This PR adds two hooks: one runs before the async-commit goroutine starts, and the other runs after it finishes.
They can be used to add external logic that tracks the lifecycle of the async-commit goroutines. For example, TiDB needs to wait for the background goroutines to finish before shutdown.
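On the consumer side, such hooks could be wired to a wait group that the shutdown path waits on. The sketch below is an assumption about how a caller might use them, not TiDB's code; `server`, `lifecycleHooks`, and `demoShutdown` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// lifecycleHooks is a hypothetical pair of callbacks passed to the client.
type lifecycleHooks struct {
	Pre  func()
	Post func()
}

// server is a hypothetical process that must not exit while background
// commit goroutines are still running.
type server struct {
	background sync.WaitGroup
}

// hooks returns callbacks that register and unregister one goroutine.
func (s *server) hooks() lifecycleHooks {
	return lifecycleHooks{
		Pre:  func() { s.background.Add(1) },
		Post: func() { s.background.Done() },
	}
}

// Close blocks until every registered goroutine has called Post.
func (s *server) Close() {
	s.background.Wait()
}

// demoShutdown starts n hooked goroutines and returns how many had finished
// by the time Close returned.
func demoShutdown(n int) int {
	s := &server{}
	h := s.hooks()
	var mu sync.Mutex
	finished := 0
	for i := 0; i < n; i++ {
		h.Pre() // called before the goroutine starts, so Close cannot miss it
		go func() {
			defer h.Post()
			mu.Lock()
			finished++
			mu.Unlock()
		}()
	}
	s.Close()
	mu.Lock()
	defer mu.Unlock()
	return finished
}

func main() {
	fmt.Println(demoShutdown(5))
}
```

Calling the pre hook before the goroutine is started, rather than inside it, is what guarantees that the shutdown path cannot observe an empty wait group while work is about to begin.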