gcworker: fix gc miss locks when region merged during scanning & resolving locks #22252

Merged
merged 5 commits into from
Jan 7, 2021

@lysu lysu commented Jan 7, 2021

Signed-off-by: lysu [email protected]

What problem does this PR solve?

Issue Number: close #22245

Problem Summary:

The scenario is the one covered by TestResolveLockRangeMeetRegionEnlargeCausedByRegionMerge:

  1. region1 holds lock1 and lock2; region2 holds lock3 and lock4.
  2. GC scans region1 and gets lock1 and lock2.
  3. region2 is merged into region1, so region1 now holds lock1, lock2, lock3, and lock4.
  4. GC retries resolving lock1 and lock2 after meeting an EpochNotMatchError caused by the region merge.
  5. Because of a bug introduced by #18385 (gc_worker: reduce GC scan locks when meeting region cache miss), GC does not re-scan region1; after resolving lock1 and lock2 it assumes every lock in region1 has been resolved.
  6. GC continues resolving locks from the new region1's EndKey onward and misses lock3 and lock4, which came from the old region2.
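
To make the sequence above concrete, here is a minimal, self-contained simulation. It is only a sketch under stated assumptions: `region`, `cluster`, `resolveRange`, and `run` are hypothetical names invented for illustration and are not the gc_worker code; regions are modelled as string key ranges where an empty end key means "unbounded", and the lock keys a, b, o, p mirror the ones used in the unit test.

```go
package main

import (
	"fmt"
	"sort"
)

// region is a simplified key range [start, end); end == "" means unbounded.
type region struct{ start, end string }

// cluster is a toy stand-in for the store: a region layout plus unresolved lock keys.
type cluster struct {
	regions []region
	locks   map[string]bool
}

// locate returns the region that currently contains key.
func (c *cluster) locate(key string) region {
	for _, r := range c.regions {
		if key >= r.start && (r.end == "" || key < r.end) {
			return r
		}
	}
	panic("key not covered by any region")
}

// scan returns the unresolved lock keys inside r that are >= from, sorted.
func (c *cluster) scan(r region, from string) []string {
	var out []string
	for k := range c.locks {
		if k >= from && k >= r.start && (r.end == "" || k < r.end) {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// endGrew reports whether the current region ends after the scanned one.
func endGrew(scanned, current region) bool {
	if scanned.end == "" {
		return false
	}
	return current.end == "" || current.end > scanned.end
}

// resolveRange walks the key space region by region. mergeAfterFirstScan runs
// once between the first scan and the resolve step to simulate the merge.
// rescanOnMerge toggles the behaviour this PR describes.
func resolveRange(c *cluster, rescanOnMerge bool, mergeAfterFirstScan func()) {
	key := ""
	for {
		scanned := c.locate(key)
		locks := c.scan(scanned, key)
		if mergeAfterFirstScan != nil {
			mergeAfterFirstScan()
			mergeAfterFirstScan = nil
		}
		current := c.locate(key) // what the retry sees after the epoch error
		if rescanOnMerge && endGrew(scanned, current) {
			continue // region grew: drop the stale lock list and scan again
		}
		for _, k := range locks {
			delete(c.locks, k)
		}
		if current.end == "" {
			return
		}
		key = current.end // skip to the end of the *current* region
	}
}

// run builds the two-region scenario, merges during resolving, and returns leftover locks.
func run(rescanOnMerge bool) []string {
	c := &cluster{
		regions: []region{{"", "m"}, {"m", ""}},
		locks:   map[string]bool{"a": true, "b": true, "o": true, "p": true},
	}
	resolveRange(c, rescanOnMerge, func() { c.regions = []region{{"", ""}} })
	return c.scan(region{"", ""}, "")
}

func main() {
	fmt.Println("without re-scan, leftover:", run(false))
	fmt.Println("with re-scan, leftover:", run(true))
}
```

In this toy model, the run without re-scanning leaves the locks o and p behind, because the loop jumps to the merged region's end key after resolving only a and b; the run with re-scanning leaves none.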

What is changed and how it works?

What's Changed, How it Works:

We need to re-scan the region's locks when the merged region's EndKey is larger than the original region's EndKey.

We do not need to re-scan when the region has been split or its EndKey has not changed.
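
The rule above can be sketched as a small predicate. This is an illustrative sketch, not the code in this PR: `needRescan` and `endKeyGreater` are hypothetical names, and the handling of an empty end key as "unbounded" is an assumption about how region ranges are represented.

```go
package main

import (
	"bytes"
	"fmt"
)

// endKeyGreater reports whether end key a lies strictly after end key b.
// Assumption: an empty end key means the range is unbounded on the right.
func endKeyGreater(a, b []byte) bool {
	if len(b) == 0 {
		return false // b is already unbounded; nothing lies after it
	}
	if len(a) == 0 {
		return true // a is unbounded and b is not
	}
	return bytes.Compare(a, b) > 0
}

// needRescan (hypothetical helper) decides whether the lock list obtained when
// the region ended at scannedEndKey is still complete now that the region ends
// at currentEndKey. Only a larger end key (a merge) requires a re-scan; a
// split or an unchanged end key does not.
func needRescan(scannedEndKey, currentEndKey []byte) bool {
	return endKeyGreater(currentEndKey, scannedEndKey)
}

func main() {
	fmt.Println("merged:   ", needRescan([]byte("m"), []byte("z")))
	fmt.Println("split:    ", needRescan([]byte("m"), []byte("g")))
	fmt.Println("unchanged:", needRescan([]byte("m"), []byte("m")))
}
```

A split only shrinks the range, so the locks already scanned still cover everything up to the new, smaller EndKey; that is why only the "larger EndKey" case needs another scan.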

Related changes

  • Need to cherry-pick to the release branch 3.0, 4.0, 5.1

Check List

Tests

  • Unit test

Side effects

  • n/a

Release note

  • Fix the issue that GC may miss locks when a region is merged while locks are being scanned and resolved


@ichn-hu ichn-hu mentioned this pull request Jan 7, 2021
Signed-off-by: lysu <[email protected]>
  if w.testingKnobs.resolveLocks != nil {
- 	ok, err1 = w.testingKnobs.resolveLocks(loc.Region)
+ 	ok, err1 = w.testingKnobs.resolveLocks(locks, loc.Region)

Suggested change
- ok, err1 = w.testingKnobs.resolveLocks(locks, loc.Region)
+ ok, err1 = w.testingKnobs.resolveLocks(locks, locForResolve.Region)


Looks like the log at line 1104 is no longer very meaningful 🤔


Perhaps it's OK to simply change loc to locForResolve in that log too

Comment on lines 920 to 926
s.gcWorker.testingKnobs.scanLocks = func(key []byte, regionID uint64) []*tikv.Lock {
	if regionID == s.initRegion.regionID {
		return []*tikv.Lock{
			{Key: []byte("a")},
			{Key: []byte("b")},
			{Key: []byte("o")},
			{Key: []byte("p")},

You can compare the key parameter with these 4 locks, so that there should be 4 locks at line 942.
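
One way to read this suggestion is that the stub should honour the scan start key instead of returning a fixed list. The following is a rough, self-contained sketch of that idea only: `lock` is a hypothetical stand-in for tikv.Lock that keeps just the key, and `scanFrom` is an invented helper, not part of the test in this PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// lock is a hypothetical stand-in for tikv.Lock, keeping only the key.
type lock struct{ Key []byte }

// scanFrom returns the locks in all whose key is not smaller than startKey,
// mimicking a scan that begins at startKey.
func scanFrom(all []lock, startKey []byte) []lock {
	var out []lock
	for _, l := range all {
		if bytes.Compare(l.Key, startKey) >= 0 {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	all := []lock{
		{Key: []byte("a")},
		{Key: []byte("b")},
		{Key: []byte("o")},
		{Key: []byte("p")},
	}
	// A scan from the beginning sees all four locks; a scan from "m" sees only o and p.
	fmt.Println(len(scanFrom(all, []byte(""))), len(scanFrom(all, []byte("m"))))
}
```

With a stub shaped like this, a re-scan of the merged region from its start key would report all four locks, while a scan that starts past b would report only the later ones.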

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Jan 7, 2021
@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Jan 7, 2021
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Jan 7, 2021
@youjiali1995 youjiali1995 added this to the v5.0.0-rc milestone Jan 7, 2021
@youjiali1995

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 7, 2021
@ti-srebot

/run-all-tests

@ti-srebot ti-srebot merged commit bedd599 into pingcap:master Jan 7, 2021
ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jan 7, 2021
@ti-srebot

cherry pick to release-3.0 in PR #22266

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jan 7, 2021
@ti-srebot

cherry pick to release-4.0 in PR #22267

@ti-srebot

cherry pick to release-5.0-rc in PR #22268

ti-srebot added a commit that referenced this pull request Jan 7, 2021
ti-srebot added a commit that referenced this pull request Jan 7, 2021
@tangenta tangenta added the sig/transaction SIG:Transaction label Jan 8, 2021
ti-srebot added a commit that referenced this pull request Jan 11, 2021
Labels
component/GC, sig/transaction, status/can-merge, status/LGT2, type/bugfix
Development

Successfully merging this pull request may close these issues.

GC may miss locks when the region is merged between scanning locks and resolving locks
5 participants