raftstore: retry pending read index requests #6348
Signed-off-by: qupeng <[email protected]>
/run-all-tests
Signed-off-by: qupeng <[email protected]>
/run-all-tests
/run-all-tests
Signed-off-by: qupeng <[email protected]>
/run-all-tests
Signed-off-by: qupeng <[email protected]>
e156a04 to 41ea574
/run-all-tests
LGTM
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
/run-all-tests
/release
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
src/raftstore/store/peer.rs
Outdated
@@ -1871,6 +1871,25 @@ impl Peer {
        Ok(())
    }

    /// `ReadIndex` requests could be lost in network, so on followers commands could queue in
    /// `pending_reads` forever. Sending a new `ReadIndex` periodically can resolve this.
    pub(super) fn retry_pending_reads(&mut self) {
How about putting the `if` check here? Also, I think `pub` is enough.
src/raftstore/store/read_queue.rs
Outdated
        }
    }

    pub fn check_needs_retry(&mut self) -> bool {
Why not pass the configuration here instead, so that only one field is necessary?
Then `push_back` would also need the configuration, which makes it more complex.
src/raftstore/store/read_queue.rs
Outdated
            return false;
        }

        self.last_retried = total + self.handled_cnt - 1;
Why minus one?
Because `last_retried` is an index, not a length.
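The off-by-one in this thread can be illustrated with a small sketch. It assumes `handled_cnt` counts reads that have already left the queue and `total` counts reads still queued; the function name `last_index` and the empty-queue handling are hypothetical and not taken from the PR.

```rust
/// Hypothetical helper, not the PR's code: returns the zero-based global
/// index of the newest queued read, or `None` when nothing is queued.
///
/// Assumption: `handled_cnt` reads occupied global indices
/// `0..handled_cnt`, so the queued reads occupy
/// `handled_cnt..handled_cnt + total`.
pub fn last_index(handled_cnt: usize, total: usize) -> Option<usize> {
    if total == 0 {
        return None;
    }
    // `handled_cnt + total` is a length (one past the end);
    // subtracting one turns it into the index of the last element.
    Some(handled_cnt + total - 1)
}
```

With three reads handled and two still queued, the queued reads have indices 3 and 4, so the last index is 4 rather than 5.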
src/raftstore/store/read_queue.rs
Outdated
        }
    }

    pub fn check_needs_retry(&mut self) -> bool {
We should add a comment explaining the retry strategy.
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
src/raftstore/store/peer.rs
Outdated
@@ -191,7 +191,7 @@ pub struct Peer {

     leader_missing_time: Option<Instant>,
     leader_lease: Lease,
-    pending_reads: ReadIndexQueue,
+    pub pending_reads: ReadIndexQueue,
Unnecessary public.
src/raftstore/store/peer.rs
Outdated
        }

        let read = self.pending_reads.back_mut().unwrap();
        if read.read_index.is_none() {
It must be `None`.
src/raftstore/store/read_queue.rs
Outdated
        }

        if self.retry_countdown == usize::MAX {
            self.retry_countdown = cfg.raft_election_timeout_ticks.checked_sub(1).unwrap();
The election timeout tick count has to be greater than 0. The countdown can be zero after subtracting one.
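To make the point about `checked_sub(1)` concrete, here is a minimal sketch of how it behaves on `usize`. The function name `initial_countdown` is hypothetical; only `usize::checked_sub` is a real standard-library call.

```rust
/// Hypothetical helper, not the PR's code: derives an initial retry
/// countdown from an election timeout measured in ticks.
///
/// `checked_sub` returns `None` instead of wrapping on underflow, so
/// calling `.unwrap()` on the result would panic only for an input of 0.
pub fn initial_countdown(election_timeout_ticks: usize) -> Option<usize> {
    election_timeout_ticks.checked_sub(1)
}
```

An input of 1 yields `Some(0)`, which is why a countdown of zero is a legitimate state as long as the configured tick count is validated to be greater than 0 elsewhere.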
Signed-off-by: qupeng <[email protected]>
@5kbpers PTAL, I think it's completely different from what you have reviewed.
LGTM
/run-all-tests
Signed-off-by: qupeng <[email protected]>
cherry pick to release-3.1 failed
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng <[email protected]>
raftstore: retry pending read index requests (tikv#6348) (tikv#6543)
Signed-off-by: qupeng <[email protected]>

raft_client: limit batch size (tikv#6993) (tikv#7076) (tikv#7087)
Signed-off-by: qupeng <[email protected]>

backup raw kv (tikv#6308) (tikv#7051)
Co-authored-by: xinhua5 <[email protected]>
Co-authored-by: MyonKeminta <[email protected]>

tikv_util: fix s3 writer always creating zeroes (tikv#6675) (tikv#6967)
Signed-off-by: kennytm <[email protected]>
Co-authored-by: Lei Zhao <[email protected]>

backup: do fill in the file size of each SST file in the response (tikv#6664) (tikv#6983)
Signed-off-by: kennytm <[email protected]>

*: update raft to include aggressive flow control (tikv#7078)
Signed-off-by: Jay Lee <[email protected]>
Signed-off-by: qupeng <[email protected]>
Signed-off-by: qupeng [email protected]
What have you changed?
On follower replicas, pending read index requests will now be retried, so that read commands are not blocked forever in raftstore.
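The idea can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `PendingRead`, `PendingReads`, `resolve_front`, `on_tick`, and `retry_ticks` are hypothetical names, and the sketch assumes the retry targets the newest queued read, loosely mirroring the `back_mut()` call in the reviewed diff.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a queued read; not TiKV's actual type.
#[derive(Debug)]
pub struct PendingRead {
    pub id: u64,
    /// `None` until a `ReadIndex` response arrives.
    pub read_index: Option<u64>,
}

/// Hypothetical queue that signals a retry after `retry_ticks` ticks
/// during which reads are still waiting.
pub struct PendingReads {
    reads: VecDeque<PendingRead>,
    retry_ticks: usize,
    countdown: usize,
}

impl PendingReads {
    pub fn new(retry_ticks: usize) -> Self {
        assert!(retry_ticks > 0, "retry_ticks must be greater than 0");
        PendingReads {
            reads: VecDeque::new(),
            retry_ticks,
            countdown: retry_ticks,
        }
    }

    pub fn push_back(&mut self, id: u64) {
        self.reads.push_back(PendingRead { id, read_index: None });
    }

    /// Resolves the oldest read when a response arrives and resets the countdown.
    pub fn resolve_front(&mut self, index: u64) -> Option<PendingRead> {
        let mut read = self.reads.pop_front()?;
        read.read_index = Some(index);
        self.countdown = self.retry_ticks;
        Some(read)
    }

    /// Called once per tick. Returns the id of the newest unanswered read
    /// when a retry is due, so the caller can send a new `ReadIndex` for it.
    pub fn on_tick(&mut self) -> Option<u64> {
        if self.reads.is_empty() {
            self.countdown = self.retry_ticks;
            return None;
        }
        self.countdown -= 1;
        if self.countdown > 0 {
            return None;
        }
        self.countdown = self.retry_ticks;
        self.reads.back().map(|r| r.id)
    }
}
```

Because a lost request is indistinguishable from a slow one on the follower side, the sketch simply resends after a fixed number of ticks instead of trying to detect the loss.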
What is the type of the changes?
Bugfix.
How is the PR tested?
Does this PR affect documentation (docs) or should it be mentioned in the release notes?
No.
Does this PR affect `tidb-ansible`?
No.
Refer to a related PR or issue link (optional)
Benchmark result if necessary (optional)
Any examples? (optional)