schedule: fix panic during hot schedule #3483

Merged 2 commits on Mar 17, 2021

Yisaer (Contributor) commented Mar 16, 2021

Signed-off-by: Song Gao [email protected]

What problem does this PR solve?

The hot-region scheduler can cause PD to panic when the following steps happen in this order:

  1. The scheduler loads storesLoads before hot-region scheduling:
	storesLoads := cluster.GetStoresLoads()
  2. PD receives a new PutStore request:
	// put Store in grpc_service.go
	if err := rc.PutStore(store); err != nil {
		return nil, status.Errorf(codes.Unknown, err.Error())
	}
  3. The scheduler selects candidates from raft.cluster:
	candidates = bs.cluster.GetStores()

pickDstStores then panics in the following code, because stLoadDetail has no entry for the store ID that was just added, although candidates includes that store:

detail := bs.stLoadDetail[store.GetID()]
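
The failure mode can be sketched in a few lines of standalone Go. This is only an illustration under assumptions: it assumes stLoadDetail is a map from store ID to a pointer (the PR text does not show its type), and the storeLoadDetail struct, its ByteRate field, and lookupUnsafe are hypothetical names, not PD's actual code. Indexing a Go map with a missing key yields the zero value, which is nil for a pointer, so using the result directly panics.

```go
package main

import "fmt"

// storeLoadDetail is a hypothetical stand-in for the per-store load summary.
type storeLoadDetail struct {
	ByteRate float64
}

// lookupUnsafe indexes the snapshot and uses the result without a presence
// check. It recovers from the resulting panic only so the demo can report it.
func lookupUnsafe(snapshot map[uint64]*storeLoadDetail, storeID uint64) (rate float64, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	detail := snapshot[storeID]   // nil when storeID is not in the snapshot
	return detail.ByteRate, false // nil pointer dereference for a missing store
}

func main() {
	// Snapshot taken before store 3 was added (assumed data).
	snapshot := map[uint64]*storeLoadDetail{1: {ByteRate: 10}, 2: {ByteRate: 20}}
	// Candidates read later from the live store list now include store 3.
	candidates := []uint64{1, 2, 3}
	for _, id := range candidates {
		rate, panicked := lookupUnsafe(snapshot, id)
		fmt.Printf("store %d: rate=%v panicked=%v\n", id, rate, panicked)
	}
}
```
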

What is changed and how it works?

During hot-region scheduling, select the candidates from stLoadDetail, so that every candidate store has a load detail entry.
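
A minimal sketch of this idea, not the actual patch: candidates are derived from the keys of the load snapshot itself, so a store added after the snapshot was taken cannot appear without a matching entry. The storeLoadDetail type, its ByteRate field, and candidateIDs are hypothetical names used for illustration; the sort is there only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// storeLoadDetail is a hypothetical stand-in for the per-store load summary.
type storeLoadDetail struct {
	ByteRate float64
}

// candidateIDs returns the IDs of all stores present in the snapshot.
// Every returned ID is guaranteed to have a non-missing map entry.
func candidateIDs(snapshot map[uint64]*storeLoadDetail) []uint64 {
	ids := make([]uint64, 0, len(snapshot))
	for id := range snapshot {
		ids = append(ids, id)
	}
	// Map iteration order is random in Go; sort for stable results.
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	snapshot := map[uint64]*storeLoadDetail{2: {ByteRate: 20}, 1: {ByteRate: 10}}
	for _, id := range candidateIDs(snapshot) {
		detail := snapshot[id] // always present, since id came from the snapshot
		fmt.Printf("store %d: rate=%v\n", id, detail.ByteRate)
	}
}
```

The trade-off in this sketch is that a newly added store is simply ignored until the next scheduling round takes a fresh snapshot.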

Check List

Tests

  • Unit test

Release note

  • No release note

@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2021
Signed-off-by: Song Gao <[email protected]>
Signed-off-by: Song Gao <[email protected]>
@Yisaer Yisaer marked this pull request as ready for review March 16, 2021 11:55
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2021
@Yisaer Yisaer added needs-cherry-pick-release-4.0 The PR needs to cherry pick to release-4.0 branch. needs-cherry-pick-release-5.0 The PR needs to cherry pick to release-5.0 branch. labels Mar 16, 2021
codecov bot commented Mar 16, 2021

Codecov Report

Merging #3483 (ccbf4c4) into master (92ddb62) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #3483      +/-   ##
==========================================
+ Coverage   74.57%   74.59%   +0.01%     
==========================================
  Files         244      244              
  Lines       23981    23982       +1     
==========================================
+ Hits        17885    17889       +4     
+ Misses       4481     4480       -1     
+ Partials     1615     1613       -2     
Flag Coverage Δ
unittests 74.59% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
server/schedulers/hot_region.go 80.91% <100.00%> (+0.03%) ⬆️
pkg/errs/errs.go 75.00% <0.00%> (-25.00%) ⬇️
server/region_syncer/server.go 83.33% <0.00%> (-6.07%) ⬇️
server/tso/global_allocator.go 72.26% <0.00%> (-5.11%) ⬇️
server/encryptionkm/key_manager.go 71.78% <0.00%> (-1.66%) ⬇️
server/tso/tso.go 69.93% <0.00%> (-1.23%) ⬇️
client/base_client.go 83.24% <0.00%> (-0.55%) ⬇️
client/client.go 71.69% <0.00%> (-0.29%) ⬇️
server/grpc_service.go 49.13% <0.00%> (ø)
server/config/persist_options.go 92.12% <0.00%> (+0.78%) ⬆️
... and 8 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 92ddb62...ccbf4c4.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 16, 2021
ti-chi-bot (Member) commented

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • HunDunDM
  • lhy1024

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 17, 2021
Yisaer (Contributor, Author) commented Mar 17, 2021

/merge

ti-chi-bot (Member) commented

@Yisaer: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot (Member) commented

This pull request has been accepted and is ready to merge.

Commit hash: ccbf4c4

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2021
@ti-chi-bot ti-chi-bot merged commit f82f2d8 into tikv:master Mar 17, 2021
ti-srebot pushed a commit to ti-srebot/pd that referenced this pull request Mar 17, 2021
ti-srebot (Contributor) commented

cherry pick to release-4.0 in PR #3485

ti-srebot pushed a commit to ti-srebot/pd that referenced this pull request Mar 17, 2021
ti-srebot (Contributor) commented

cherry pick to release-5.0 in PR #3486

ti-chi-bot pushed a commit that referenced this pull request Mar 18, 2021
ti-chi-bot added a commit that referenced this pull request Mar 25, 2021
* cherry pick #3483 to release-4.0

Signed-off-by: ti-srebot <[email protected]>

* fix conflict

Signed-off-by: Song Gao <[email protected]>

Co-authored-by: Song Gao <[email protected]>
Co-authored-by: Ti Chi Robot <[email protected]>