store/tikv,executor: redesign the latch scheduler #7711
Conversation
Check maxCommitTS on each key, instead of each slot, so hash collision will not lead to transaction retry.
store/tikv/latch/latch.go (outdated diff)

		find = n
		break
	}
	// TODO: Invalidate old data.
We cannot merge this PR before fixing this TODO, because without it we will get an OOM.
	func (l *latch) isEmpty() bool {
		return l.waitingQueueHead == 0 && !l.hasMoreWaiting

	next *node
Why not use list.List?
list.List will make an unnecessary allocation.

Using

	type Element struct {
		// The value stored with this element.
		Value interface{}
		// contains filtered or unexported fields
	}

is similar to

	type node struct {
		Value *nodeValue
	}

	type nodeValue struct {
		slotID      int
		key         []byte
		maxCommitTS uint64
		value       *Lock
	}

list.List is a doubly linked list, while a singly linked list is sufficient here. Besides, a linked list is a simple and common enough data structure to implement ourselves.
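For illustration, here is a minimal, self-contained sketch of the single-allocation layout being argued for. The field names mirror the nodeValue example above, but the push helper and the rest are assumptions, not the PR's actual code.

```go
package main

import "fmt"

// node is a hand-rolled singly linked entry. The payload fields live directly
// in the node, so appending an entry costs one allocation. With container/list,
// the payload would be boxed into the interface{} Value of list.Element,
// adding a second allocation per element.
type node struct {
	slotID      int
	key         []byte
	maxCommitTS uint64
	next        *node
}

// push prepends an entry and returns the new head (hypothetical helper).
func push(head *node, slotID int, key []byte, ts uint64) *node {
	return &node{slotID: slotID, key: key, maxCommitTS: ts, next: head}
}

func main() {
	var head *node
	head = push(head, 1, []byte("a"), 100)
	head = push(head, 2, []byte("b"), 200)
	for n := head; n != nil; n = n.next {
		fmt.Printf("slot=%d key=%s maxCommitTS=%d\n", n.slotID, n.key, n.maxCommitTS)
	}
}
```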
store/tikv/latch/latch.go (outdated diff)

	// Try to limit the memory usage.
	if latch.count > latchListCount {
		latch.recycle(lock.startTS)
Should recycle be moved into if find == nil {}?
store/tikv/latch/latch.go (outdated diff)

	for i := 0; i < len(latch.waiting); i++ {
		waiting := latch.waiting[i]
		if bytes.Compare(waiting.keys[waiting.acquiredCount], key) == 0 {
			nextLock = waiting
Is it possible that more than one Lock in the waiting list has the same key?
Possible! You found a bug.
I should only wake up the first one.
Waking up the first one is FIFO; there is still room for improvement here in choosing which one to wake up.
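A minimal sketch of the agreed behavior, waking only the first waiter blocked on a key in FIFO order. The waiter type and wakeFirst helper are illustrative stand-ins for the PR's *Lock and its waiting-list code, not the actual implementation.

```go
package main

import (
	"bytes"
	"fmt"
)

// waiter stands in for the PR's *Lock: it records the key it is blocked on.
type waiter struct {
	id         int
	blockedKey []byte
}

// wakeFirst scans the waiting list in FIFO order and removes only the first
// waiter blocked on key; later waiters on the same key stay queued until the
// woken transaction finishes.
func wakeFirst(waiting []*waiter, key []byte) (*waiter, []*waiter) {
	for i, w := range waiting {
		if bytes.Equal(w.blockedKey, key) {
			next := w
			// Delete element i while preserving FIFO order.
			waiting = append(waiting[:i], waiting[i+1:]...)
			return next, waiting
		}
	}
	return nil, waiting
}

func main() {
	waiting := []*waiter{
		{id: 1, blockedKey: []byte("k1")},
		{id: 2, blockedKey: []byte("k2")},
		{id: 3, blockedKey: []byte("k1")}, // same key as waiter 1, must not be woken yet
	}
	next, rest := wakeFirst(waiting, []byte("k1"))
	fmt.Println("woken:", next.id, "still waiting:", len(rest))
}
```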
Do we have any test/benchmark results?
PTAL @zhangjinpeng1987
I use a key range of
When the latch scheduler is disabled:
When the latch scheduler is enabled:
There will be nearly a 100x performance improvement.
store/tikv/latch/scheduler.go (outdated diff)

@@ -40,13 +44,30 @@ func NewScheduler(size uint) *LatchesScheduler {
	return scheduler
}

const checkInterval = 10 * time.Minute
const expireDuration = 2 * time.Hour
Too long. Be aware that a transaction can now last for at most 10 minutes, so it should be safe to clean up commit history older than 10 minutes.
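To make the expiry check concrete, here is a small sketch of comparing two TSO timestamps against a 10-minute window. It assumes the usual PD-style TSO layout with the physical milliseconds above an 18-bit logical counter; tsoSub here is a simplified stand-in, not the PR's function.

```go
package main

import (
	"fmt"
	"time"
)

const physicalShiftBits = 18 // assumption: PD-style TSO, physical ms in the high bits

// tsoSub returns the wall-clock duration between two TSO timestamps by
// comparing their physical (millisecond) components.
func tsoSub(ts1, ts2 uint64) time.Duration {
	ms1 := int64(ts1 >> physicalShiftBits)
	ms2 := int64(ts2 >> physicalShiftBits)
	return time.Duration(ms1-ms2) * time.Millisecond
}

func main() {
	// Window suggested by the review: anything older than the maximum
	// transaction lifetime (10 minutes) is safe to recycle.
	const expireDuration = 10 * time.Minute

	now := uint64(time.Now().UnixMilli()) << physicalShiftBits
	old := uint64(time.Now().Add(-30*time.Minute).UnixMilli()) << physicalShiftBits

	fmt.Println("expired:", tsoSub(now, old) >= expireDuration)
}
```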
store/tikv/latch/latch.go (outdated diff)

	// Handle the head node.
	if tsoSub(currentTS, l.queue.maxCommitTS) >= expireDuration && l.queue.value == nil {
		l.queue = nil
No count--?
store/tikv/latch/latch.go
Outdated
if tsoSub(currentTS, l.queue.maxCommitTS) >= expireDuration && l.queue.value == nil { | ||
l.queue = nil | ||
} | ||
return |
No need to return here.
store/tikv/latch/latch.go (outdated diff)

	}

	prev := l.queue
	curr := l.queue.next
Better to initialize as prev = nil, curr = l.queue, so you don't have to handle the head node separately. Or you can consider using the indirect pointer hack suggested by Linus Torvalds: https://medium.com/@bartobri/applying-the-linus-tarvolds-good-taste-coding-requirement-99749f37684a
Actually, I found that trick myself before I saw any article about it. (I remember I wrote a blog post http://www.zenlife.tk/fake-list-head.md around that time)
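Both links describe the same idea: walk the list with a pointer to the link field itself, so unlinking the head is the same operation as unlinking any interior node. A self-contained sketch of that trick applied to recycling expired entries follows; the node fields loosely mirror the quoted diff, but the recycle helper here is illustrative, not the PR's code.

```go
package main

import "fmt"

type node struct {
	maxCommitTS uint64
	value       *int // stand-in for the diff's *Lock; nil means no live lock attached
	next        *node
}

// recycle removes every node whose commit record is expired and that has no
// attached lock. Walking with a **node means the head case needs no special
// branch: *pp is either the head pointer or some node's next field.
func recycle(head *node, currentTS, expire uint64) *node {
	pp := &head
	for *pp != nil {
		n := *pp
		if currentTS-n.maxCommitTS >= expire && n.value == nil {
			*pp = n.next // unlink; works for head and interior nodes alike
			continue
		}
		pp = &n.next
	}
	return head
}

func main() {
	head := &node{maxCommitTS: 5, next: &node{maxCommitTS: 90, next: &node{maxCommitTS: 10}}}
	head = recycle(head, 100, 50)
	for n := head; n != nil; n = n.next {
		fmt.Println("kept maxCommitTS =", n.maxCommitTS)
	}
}
```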
Admire you.
LGTM.
	type latch struct {
		queue   *node
		count   int
		waiting []*Lock
Why doesn't each node have a waiting queue?
The waiting queue is moved from each node to the latch for these reasons:
- Nodes are inserted into the queue every now and then; if each node had its own waiting queue, queues would be created and destroyed constantly, which means more allocations and worse memory efficiency.
- I assume the waiting queue will not be large, and when a list is small enough, an array is very efficient.
- You may remember the "first waiting one automatically becomes running" problem; the old code is complex enough just handling those states. If each node doesn't have a waiting queue, that problem can be avoided.
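A rough sketch of the shape this gives, with the waiting slice owned by the latch rather than by each node. The field names follow the quoted diff, but the payload types are simplified stand-ins (waiter keys instead of *Lock), so this is illustrative only.

```go
package main

import "fmt"

// Rejected alternative: every key node owns its own waiting queue, which is
// allocated when the first waiter arrives and dropped when it drains.
type nodeWithQueue struct {
	key     []byte
	waiting [][]byte // hypothetical per-node queue of waiter keys
	next    *nodeWithQueue
}

// Shape described above: nodes stay minimal, and the latch owns one small
// shared waiting slice, so all waiters in this latch reuse the same backing
// array no matter which key they block on.
type node struct {
	key  []byte
	next *node
}

type latch struct {
	queue   *node
	count   int
	waiting [][]byte // one array for every key hashed into this latch
}

func main() {
	l := &latch{}
	l.waiting = append(l.waiting, []byte("k1"), []byte("k2"))
	fmt.Println("waiters held by the latch:", len(l.waiting))
}
```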
@zhangjinpeng1987
	if idx < len(latch.waiting) {
		nextLock = latch.waiting[idx]
		// Delete element latch.waiting[idx] from the array.
		copy(latch.waiting[idx:], latch.waiting[idx+1:])
It's better to use a list for waiting locks.
@coocood What's your advice?
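For reference, the copy-and-truncate removal quoted above behaves like this small standalone sketch (int elements stand in for *Lock). It shows the trade-off under discussion: an O(n) tail shift with no allocation, versus a list's O(1) unlink at the cost of one allocation per element.

```go
package main

import "fmt"

// removeAt deletes s[idx] while keeping the remaining elements in order,
// using the same copy-and-truncate pattern as the quoted diff. The tail
// shift is O(n), but it reuses the backing array and allocates nothing.
func removeAt(s []int, idx int) []int {
	copy(s[idx:], s[idx+1:])
	return s[:len(s)-1]
}

func main() {
	waiting := []int{10, 20, 30, 40}
	waiting = removeAt(waiting, 1)
	fmt.Println(waiting) // [10 30 40]
}
```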
@tiancaiamao Have you tested the sysbench OLTP_RW-pareto scenario?
LGTM
/run-all-tests
What problem does this PR solve?
The old latch scheduler checks maxCommitTS on each slot. Different keys can hash to the same slot, and the latch scheduler would then report a false transaction conflict.
What is changed and how it works?
Each key now has its own maxCommitTS, so hash collisions will not result in false transaction conflicts.
Check List
Tests
Code changes
RefreshCommitTS
PTAL @coocood
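To make the before/after concrete, here is a minimal sketch of the difference described above: one maxCommitTS per hash slot versus one per key, so two keys that collide in a slot no longer trigger a false conflict for each other. All names and the toy hash are illustrative, not the PR's actual code.

```go
package main

import "fmt"

const slots = 4

// Old scheme (per slot): any commit to a colliding key raises the slot's
// maxCommitTS and can force an unrelated transaction to retry.
var slotMaxCommitTS [slots]uint64

// New scheme (per key): each key carries its own maxCommitTS.
var keyMaxCommitTS = map[string]uint64{}

func slotOf(key string) int { return len(key) % slots } // toy hash, collides on purpose

// conflicts reports whether a transaction that started at startTS would be
// forced to retry when committing a write to key.
func conflicts(key string, startTS uint64, perKey bool) bool {
	if perKey {
		return keyMaxCommitTS[key] > startTS
	}
	return slotMaxCommitTS[slotOf(key)] > startTS
}

func main() {
	// Another transaction commits key "ab" at ts=100; "cd" hashes to the same slot.
	slotMaxCommitTS[slotOf("ab")] = 100
	keyMaxCommitTS["ab"] = 100

	// A transaction with startTS=50 now writes the unrelated key "cd".
	fmt.Println("per-slot check forces retry:", conflicts("cd", 50, false)) // true: false conflict
	fmt.Println("per-key check forces retry:", conflicts("cd", 50, true))   // false: no conflict
}
```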