-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage/concurrency: release reqs from same txn as discovered lock #45601
storage/concurrency: release reqs from same txn as discovered lock #45601
Conversation
4f28cb6
to
2c7a4be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/concurrency/lock_table.go, line 1211 at r1 (raw file):
// If there are waiting requests from the same txn, they no longer need to wait. if l.releaseWritersFromTxn(txn) { informWaiters = true
I think we can tighten up the semantics of informWaiters
and remove the return value from releaseWritersFromTxn
.
The only correctness need for informWaiters=false
is when the lock is already held. Currently the code also sets it to false when l.reservation == nil
but that is an premature performance optimization that I think we should remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)
pkg/storage/concurrency/lock_table.go, line 1211 at r1 (raw file):
Previously, sumeerbhola wrote…
I think we can tighten up the semantics of
informWaiters
and remove the return value fromreleaseWritersFromTxn
.
The only correctness need forinformWaiters=false
is when the lock is already held. Currently the code also sets it to false whenl.reservation == nil
but that is an premature performance optimization that I think we should remove.
I'm not sure I understand why it's ever incorrect to call informActiveWaiters
. You said that there's a correctness need for informWaiters=false
when the lock is already held. Could you expand on that bit more?
Also, in place of the return value on releaseWritersFromTxn
, are you suggesting that it call tryMakeNewDistinguished
directly when it detects that doing so it necessary.
2c7a4be
to
ad50d9c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @sumeerbhola)
pkg/storage/concurrency/lock_table.go, line 1211 at r1 (raw file):
I'm not sure I understand why it's ever incorrect to call informActiveWaiters. You said that there's a correctness need for informWaiters=false when the lock is already held. Could you expand on that bit more?
I agree that from a correctness perspective it is ok to call informActiveWaiters
even if nothing has changed. The use of informWaiters
is trying to be a performance optimization. When l.reservation==nil
and the lock was not held this performance optimization is limited to not calling informActiveWaiters
since that function won't have any work to do -- there can't be any waiters. So we can remove it. Regarding setting informWaiters
to false when the lock was already held -- now that there are more fields in waitingState
, I don't think that optimization is correct since the waiters want to know whether they are waiting on a replicated or unreplicated lock. So that would mean removing informWaiters
altogether.
Also, in place of the return value on releaseWritersFromTxn, are you suggesting that it call tryMakeNewDistinguished directly when it detects that doing so it necessary.
No, we'd still call informActiveWaiters
here. I should have said that the return value from releaseWritersFromTxn
should not be needed to decide whether to call informActiveWaiters
.
ad50d9c
to
e8faf7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)
pkg/storage/concurrency/lock_table.go, line 1211 at r1 (raw file):
Previously, sumeerbhola wrote…
I'm not sure I understand why it's ever incorrect to call informActiveWaiters. You said that there's a correctness need for informWaiters=false when the lock is already held. Could you expand on that bit more?
I agree that from a correctness perspective it is ok to call
informActiveWaiters
even if nothing has changed. The use ofinformWaiters
is trying to be a performance optimization. Whenl.reservation==nil
and the lock was not held this performance optimization is limited to not callinginformActiveWaiters
since that function won't have any work to do -- there can't be any waiters. So we can remove it. Regarding settinginformWaiters
to false when the lock was already held -- now that there are more fields inwaitingState
, I don't think that optimization is correct since the waiters want to know whether they are waiting on a replicated or unreplicated lock. So that would mean removinginformWaiters
altogether.Also, in place of the return value on releaseWritersFromTxn, are you suggesting that it call tryMakeNewDistinguished directly when it detects that doing so it necessary.
No, we'd still call
informActiveWaiters
here. I should have said that the return value fromreleaseWritersFromTxn
should not be needed to decide whether to callinformActiveWaiters
.
Done, I think. Mind double checking that that's the change you were thinking of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)
This commit updates the AddDiscoveredLock state transition in lockTableImpl to release all waiting writers that are part of the same transaction as a discovered lock. This caused issues in cockroachdb#45482, where we saw txn deadlocks in the `pkg/sql/tests.Bank` benchmark. This issue was triggered because we were forgetting to inform the lockTable of a lock acquisition in a subtle case. That is now fixed and it's not clear that we can actually hit the bug here anymore given the current policy on how wait-queues form in the lock-table. Regardless, this is worth handling correctly in case things ever change.
e8faf7c
to
7381a72
Compare
bors r+ |
Build failed (retrying...) |
bors r+ |
Build failed (retrying...) |
Build succeeded |
This testdata file was added in cockroachdb#45601. That skewed with the package rename. Release justification: testing only
46631: kv/kvserver: mv testdata/lock_table/add_discovered out of pkg/storage r=nvanbenschoten a=nvanbenschoten This testdata file was added in #45601. That skewed with the mass package rename. Release justification: testing only Co-authored-by: Nathan VanBenschoten <[email protected]>
This commit updates the AddDiscoveredLock state transition in lockTableImpl to release all waiting writers that are part of the same transaction as a discovered lock.
This caused issues in #45482, where we saw txn deadlocks in the
pkg/sql/tests.Bank
benchmark. This issue was triggered because we were forgetting to inform the lockTable of a lock acquisition in a subtle case. That is now fixed and it's not clear that we can actually hit the bug here anymore given the current policy on how wait-queues form in the lock-table. Regardless, this is worth handling correctly in case things ever change.