-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mvcc: fix TestHashKVWhenCompacting hash mismatch #8314
Conversation
@@ -522,7 +522,7 @@ func TestHashKVWhenCompacting(t *testing.T) { | |||
s := NewStore(b, &lease.FakeLessor{}, nil) | |||
defer os.Remove(tmpPath) | |||
|
|||
rev := 1000 | |||
rev := 10000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change will trigger the test failure every time before the patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on my machine 100000 triggers everytime. I lower it to 10000 since ci machine might be slow which could trigger timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can confirm 100k repro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how long would it take to run 100K?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it takes an extra second or so with my machine; the txn buffering keeps it cheap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just change to 100k then? @fanminshi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do.
mvcc/kvstore.go
Outdated
s.revMu.RLock() | ||
if rev == s.compactMainRev { | ||
s.keepMu.Lock() | ||
s.keep = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will there be a similar issue with s.keep = keep
assigning a keep
that's ahead of the current scheduled compaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also maybe have a comment about why this is important...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand how keep works, the keep from this compaction only contains revisions between compact rev of last compaction to this compaction rev. If last compaction hasn't finished then, then the current keep does not include the keep from last compaction. Hence, HashKV will skip revisions from the keep of last compaction which is not what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking about appending keep of this compaction to the keep of last compaction if last compaction is still in progress, and only the completion of the very last compaction sets keep to nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems too delicate. It might be easier to extend the index
interface to have a non-destructive Compact which computes the maximum revisions less than a given rev (e.g., the compact rev) and just run that every time HashKV is called.
mvcc/kvstore_test.go
Outdated
@@ -558,7 +558,7 @@ func TestHashKVWhenCompacting(t *testing.T) { | |||
revHash[r.compactRev] = r.hash | |||
} | |||
if r.hash != revHash[r.compactRev] { | |||
t.Fatalf("Hashes differ (current %v) != (saved %v)", r.hash, revHash[r.compactRev]) | |||
t.Fatalf("Hashes differ (current %v) != (saved %v) at %v", r.hash, revHash[r.compactRev], r.compactRev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Hashes/hashes while you're changing this; all lower case is idiomatic
via https://semaphoreci.com/coreos/etcd/branches/pull-request-8314/builds/1 --- FAIL: TestV3HashKV (0.44s)
v3_grpc_test.go:186: prevHash 3992618000 != Hash 2585771259 edit: should be fixed by #8323. |
08ba6b6
to
0f37ae3
Compare
This pr adds a check in compaction to ensure that setting keep to nil only if this is the latest compaction in queue and merges the keep from prev ongoing compaction to this scheduled compaction allows HashKV to computes Hash from previous keep. Scenario #1 Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and after completion of compact(A) but before the completion of compact(B). Before this PR: Completion of compact(A) sets keep to nil to indicate that there are no ongoing compactions. This assumption causes HashKV to hash all mvcc keys including keys that hasn't been deleted by compact(B) up to a rev. Hence, HashKV again after compact(B) returns a different Hash that during compact(B). After this PR: Completion of compact(A) sets keep to nil if current compactRev == compactRev(compact(A)). Since calling compact(B) changes current compactRev, then current compactRev != compactRev(compact(A)). Then HashKV will base on the keep set during compact(B) instead of nil which return the correct hash during compact(B). Scenario etcd-io#2 Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and before the completion of compact(A) and compact(B). Before this PR: The start of compact(B) sets keep to compact(B) and overrides the keep from compact(A). Since compact(A) hasn't finished yet, HashKV needs to hash revision from keep of compact(A) but it is overridden. Hence HashKV misses revision from keep of compact(A) would result hash difference. After this PR: The start of compact(B) merges keep from compact(B) and compact(A). Even though compact(A) hasn't finished, HashKV can retrieve from keep of compact(A). Hence HashKV doesn't miss any revisions and will compute Hash correctly.
0f37ae3
to
89f0276
Compare
89f0276
to
f402b8b
Compare
@fanminshi can this be closed? |
close in favor of #8333. |
This pr adds a check in compaction to ensure that setting keep to nil only if this is the latest compaction in queue.
Suppose that HashKV() starts after compact(A) and compact(B) has scheduled and after
completion of compact(A) but before the completion of compact(B).
Before this PR:
Completion of compact(A) sets keep to nil to indicate that there are no ongoing compactions.
This assumption causes HashKV to hash all mvcc keys including keys that
hasn't been deleted by compact(B) up to a rev. Hence, HashKV again after completion of compact(B) returns
a different Hash that during compact(B).
After this PR:
Completion of compact(A) sets keep to nil if current compactRev == compactRev(compact(A)).
Since calling compact(B) changes current compactRev, then current compactRev != compactRev(compact(A)).
Then HashKV will base on the keep set during compact(B) instead of nil which return the correct hash
during compact(B).
fixes #8311