Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raftstore: retry pending read index requests #6348

Merged
merged 49 commits into from
Jan 10, 2020
Merged

Conversation

hicqu
Copy link
Contributor

@hicqu hicqu commented Dec 27, 2019

Signed-off-by: qupeng [email protected]

What have you changed?

On follwer replicas pending read index requests will be retried. So that read commands won't be blocked forever in raftstore.

What is the type of the changes?

Bugfix.

How is the PR tested?

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Does this PR affect documentation (docs) or should it be mentioned in the release notes?

No.

Does this PR affect tidb-ansible?

No.

Refer to a related PR or issue link (optional)

Benchmark result if necessary (optional)

Any examples? (optional)

@hicqu hicqu added sig/raft Component: Raft, RaftStore, etc. type/bugfix This PR fixes a bug. needs-cherry-pick-release-3.1 Type: Need cherry pick to release 3.1 labels Dec 27, 2019
@hicqu
Copy link
Contributor Author

hicqu commented Dec 27, 2019

/run-all-tests

Signed-off-by: qupeng <[email protected]>
@hicqu
Copy link
Contributor Author

hicqu commented Dec 27, 2019

/run-all-tests

1 similar comment
@hicqu
Copy link
Contributor Author

hicqu commented Dec 27, 2019

/run-all-tests

@hicqu
Copy link
Contributor Author

hicqu commented Dec 30, 2019

/run-all-tests

Signed-off-by: qupeng <[email protected]>
@hicqu
Copy link
Contributor Author

hicqu commented Dec 30, 2019

/run-all-tests

5kbpers
5kbpers previously approved these changes Dec 30, 2019
Copy link
Member

@5kbpers 5kbpers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hicqu
Copy link
Contributor Author

hicqu commented Dec 31, 2019

/run-all-tests

@hicqu
Copy link
Contributor Author

hicqu commented Jan 2, 2020

/release

@@ -1871,6 +1871,25 @@ impl Peer {
Ok(())
}

/// `ReadIndex` requests could be lost in network, so on followers commands could queue in
/// `pending_reads` forever. Sending a new `ReadIndex` periodically can resolve this.
pub(super) fn retry_pending_reads(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about put the if check here? And I think pub is enough.

}
}

pub fn check_needs_retry(&mut self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not pass configuration here instead? So that only one field is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So push_back will also needs the configuration, which makes it more complex.

return false;
}

self.last_retried = total + self.handled_cnt - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why minus one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So last_retried is an index, not a length.

}
}

pub fn check_needs_retry(&mut self) -> bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add some comment explaining the retry strategy.

@@ -191,7 +191,7 @@ pub struct Peer {

leader_missing_time: Option<Instant>,
leader_lease: Lease,
pending_reads: ReadIndexQueue,
pub pending_reads: ReadIndexQueue,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary public.

}

let read = self.pending_reads.back_mut().unwrap();
if read.read_index.is_none() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be none.

}

if self.retry_countdown == usize::MAX {
self.retry_countdown = cfg.raft_election_timeout_ticks.checked_sub(1).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Election timeout tick has to be greater than 0. count down can be zero after minusing one.

@BusyJay
Copy link
Member

BusyJay commented Jan 9, 2020

@5kbpers PTAL, I think it's completely different from what you has reviewed.

@BusyJay BusyJay dismissed 5kbpers’s stale review January 9, 2020 13:02

I think it's completely different from what you has reviewed.

Copy link
Member

@5kbpers 5kbpers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hicqu
Copy link
Contributor Author

hicqu commented Jan 10, 2020

/run-all-tests

Signed-off-by: qupeng <[email protected]>
@hicqu hicqu merged commit 76866ae into tikv:master Jan 10, 2020
@hicqu hicqu deleted the read-index-retry branch January 10, 2020 05:40
@sre-bot
Copy link
Contributor

sre-bot commented Jan 10, 2020

cherry pick to release-3.1 failed

hicqu added a commit to hicqu/tikv that referenced this pull request Feb 7, 2020
hicqu added a commit to hicqu/tikv that referenced this pull request Feb 7, 2020
youjiali1995 pushed a commit that referenced this pull request Mar 12, 2020
solotzg pushed a commit to pingcap/tidb-engine-ext that referenced this pull request Mar 13, 2020
solotzg pushed a commit to pingcap/tidb-engine-ext that referenced this pull request Mar 13, 2020
raftstore: retry pending read index requests (tikv#6348) (tikv#6543)

Signed-off-by: qupeng <[email protected]>

raft_client: limit batch size (tikv#6993) (tikv#7076) (tikv#7087)

Signed-off-by: qupeng <[email protected]>

backup raw kv (tikv#6308) (tikv#7051)

Co-authored-by: xinhua5 <[email protected]>
Co-authored-by: MyonKeminta <[email protected]>

tikv_util: fix s3 writer always creating zeroes (tikv#6675) (tikv#6967)

Signed-off-by: kennytm <[email protected]>

Co-authored-by: Lei Zhao <[email protected]>

 backup: do fill in the file size of each SST file in the response (tikv#6664) (tikv#6983)

Signed-off-by: kennytm <[email protected]>

*: update raft to include aggressive flow control (tikv#7078)

Signed-off-by: Jay Lee <[email protected]>
solotzg pushed a commit to pingcap/tidb-engine-ext that referenced this pull request Mar 18, 2020
raftstore: retry pending read index requests (tikv#6348) (tikv#6543)

Signed-off-by: qupeng <[email protected]>

raft_client: limit batch size (tikv#6993) (tikv#7076) (tikv#7087)

Signed-off-by: qupeng <[email protected]>

backup raw kv (tikv#6308) (tikv#7051)

Co-authored-by: xinhua5 <[email protected]>
Co-authored-by: MyonKeminta <[email protected]>

tikv_util: fix s3 writer always creating zeroes (tikv#6675) (tikv#6967)

Signed-off-by: kennytm <[email protected]>

Co-authored-by: Lei Zhao <[email protected]>

 backup: do fill in the file size of each SST file in the response (tikv#6664) (tikv#6983)

Signed-off-by: kennytm <[email protected]>

*: update raft to include aggressive flow control (tikv#7078)

Signed-off-by: Jay Lee <[email protected]>
c1ay pushed a commit to c1ay/tikv that referenced this pull request May 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-cherry-pick-release-3.1 Type: Need cherry pick to release 3.1 sig/raft Component: Raft, RaftStore, etc. type/bugfix This PR fixes a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants