Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mvcc: fix TestHashKVWhenCompacting hash mismatch #8314

Closed
wants to merge 2 commits into from

Conversation

fanminshi
Copy link
Member

@fanminshi fanminshi commented Jul 26, 2017

This pr adds a check in compaction to ensure that setting keep to nil only if this is the latest compaction in queue.

Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and after
completion of compact(A) but before the completion of compact(B).

Before this PR:
Completion of compact(A) sets keep to nil to indicate that there are no ongoing compactions.
This assumption causes HashKV to hash all mvcc keys including keys that
hasn't been deleted by compact(B) up to a rev. Hence, HashKV again after completion of compact(B) returns
a different Hash that during compact(B).

After this PR:
Completion of compact(A) sets keep to nil if current compactRev == compactRev(compact(A)).
Since calling compact(B) changes current compactRev, then current compactRev != compactRev(compact(A)).
Then HashKV will base on the keep set during compact(B) instead of nil which return the correct hash
during compact(B).

fixes #8311

@@ -522,7 +522,7 @@ func TestHashKVWhenCompacting(t *testing.T) {
s := NewStore(b, &lease.FakeLessor{}, nil)
defer os.Remove(tmpPath)

rev := 1000
rev := 10000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change will trigger the test failure every time before the patch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on my machine 100000 triggers everytime. I lower it to 10000 since ci machine might be slow which could trigger timeout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can confirm 100k repro

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how long would it take to run 100K?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it takes an extra second or so with my machine; the txn buffering keeps it cheap

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just change to 100k then? @fanminshi

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do.

mvcc/kvstore.go Outdated
s.revMu.RLock()
if rev == s.compactMainRev {
s.keepMu.Lock()
s.keep = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will there be a similar issue with s.keep = keep assigning a keep that's ahead of the current scheduled compaction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also maybe have a comment about why this is important...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand how keep works, the keep from this compaction only contains revisions between compact rev of last compaction to this compaction rev. If last compaction hasn't finished then, then the current keep does not include the keep from last compaction. Hence, HashKV will skip revisions from the keep of last compaction which is not what we want.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about appending keep of this compaction to the keep of last compaction if last compaction is still in progress, and only the completion of the very last compaction sets keep to nil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems too delicate. It might be easier to extend the index interface to have a non-destructive Compact which computes the maximum revisions less than a given rev (e.g., the compact rev) and just run that every time HashKV is called.

@@ -558,7 +558,7 @@ func TestHashKVWhenCompacting(t *testing.T) {
revHash[r.compactRev] = r.hash
}
if r.hash != revHash[r.compactRev] {
t.Fatalf("Hashes differ (current %v) != (saved %v)", r.hash, revHash[r.compactRev])
t.Fatalf("Hashes differ (current %v) != (saved %v) at %v", r.hash, revHash[r.compactRev], r.compactRev)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Hashes/hashes while you're changing this; all lower case is idiomatic

@fanminshi
Copy link
Member Author

fanminshi commented Jul 27, 2017

via https://semaphoreci.com/coreos/etcd/branches/pull-request-8314/builds/1

--- FAIL: TestV3HashKV (0.44s)
	v3_grpc_test.go:186: prevHash 3992618000 != Hash 2585771259

edit: should be fixed by #8323.

This pr adds a check in compaction to ensure that setting keep to nil only if this is the latest compaction in queue and
merges the keep from prev ongoing compaction to this scheduled compaction allows HashKV to computes Hash from previous keep.

Scenario #1

Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and after
completion of compact(A) but before the completion of compact(B).

Before this PR:
Completion of compact(A) sets keep to nil to indicate that there are no ongoing compactions.
This assumption causes HashKV to hash all mvcc keys including keys that
hasn't been deleted by compact(B) up to a rev. Hence, HashKV again after compact(B) returns
a different Hash that during compact(B).

After this PR:
Completion of compact(A) sets keep to nil if current compactRev == compactRev(compact(A)).
Since calling compact(B) changes current compactRev, then current compactRev != compactRev(compact(A)).
Then HashKV will base on the keep set during compact(B) instead of nil which return the correct hash
during compact(B).

Scenario etcd-io#2

Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and before the
completion of compact(A) and compact(B).

Before this PR:
The start of compact(B) sets keep to compact(B) and overrides the keep from compact(A).
Since compact(A) hasn't finished yet, HashKV needs to hash revision from keep of compact(A) but it
is overridden. Hence HashKV misses revision from keep of compact(A) would result hash difference.

After this PR:
The start of compact(B) merges keep from compact(B) and compact(A). Even though compact(A) hasn't finished,
HashKV can retrieve from keep of compact(A). Hence HashKV doesn't miss any revisions and will compute Hash correctly.
@fanminshi fanminshi changed the title mvcc: fix hash mismatch mvcc: fix TestHashKVWhenCompacting hash mismatch Jul 28, 2017
@heyitsanthony
Copy link
Contributor

@fanminshi can this be closed?

@fanminshi
Copy link
Member Author

close in favor of #8333.

@fanminshi fanminshi closed this Jul 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

TestHashKVWhenCompacting: hashes differ
3 participants