roachperf: regression in kv on 2023/03/10 - multiple benchmarks #98571
Comments
cc @cockroachdb/test-eng
Bisection results using
Thanks @smg260 for determining the cause of the regression. I can see it in microbenchmarks as well:
This looks to be due to a misunderstanding of how cockroach/pkg/kv/kvserver/kvserverbase/base.go Lines 189 to 196 in a79338a
As a result, 1PC txns were launching async intent resolution tasks for local point lock spans. The following diff resolves the regression:

```diff
diff --git a/pkg/kv/kvserver/kvserverbase/base.go b/pkg/kv/kvserver/kvserverbase/base.go
index 09ed15547ae..338b12d1b6c 100644
--- a/pkg/kv/kvserver/kvserverbase/base.go
+++ b/pkg/kv/kvserver/kvserverbase/base.go
@@ -164,6 +164,9 @@ func IntersectSpan(
 ) (middle *roachpb.Span, outside []roachpb.Span) {
 	start, end := desc.StartKey.AsRawKey(), desc.EndKey.AsRawKey()
 	if len(span.EndKey) == 0 {
+		if ContainsKey(desc, span.Key) {
+			return &span, nil
+		}
 		outside = append(outside, span)
 		return
 	}
```

The non-1PC EndTxn path also uses this utility function, but only for ranged intent spans.
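To make the effect of that diff concrete, here is a minimal, self-contained sketch of just the point-span branch. It does not use the real `roachpb` or `kvserverbase` packages: `Span`, `RangeDesc`, `containsKey`, `intersectPointOld`, and `intersectPointNew` are simplified, hypothetical stand-ins with plain byte-slice keys, so treat it as an illustration of the idea rather than the actual CockroachDB code.

```go
package main

import (
	"bytes"
	"fmt"
)

// Span is a simplified stand-in for roachpb.Span (an assumption for this
// sketch): an empty EndKey marks a point span covering only Key.
type Span struct {
	Key, EndKey []byte
}

// RangeDesc is a simplified stand-in for a range descriptor with bounds
// [StartKey, EndKey).
type RangeDesc struct {
	StartKey, EndKey []byte
}

// containsKey reports whether key falls inside [StartKey, EndKey).
func containsKey(desc RangeDesc, key []byte) bool {
	return bytes.Compare(key, desc.StartKey) >= 0 &&
		bytes.Compare(key, desc.EndKey) < 0
}

// intersectPointOld mirrors the point-span branch before the diff: every
// point span is reported as outside, even if the range contains its key.
func intersectPointOld(span Span, desc RangeDesc) (middle *Span, outside []Span) {
	return nil, append(outside, span)
}

// intersectPointNew mirrors the branch after the diff: a point span whose
// key lies inside the range is returned as the middle (local) span.
func intersectPointNew(span Span, desc RangeDesc) (middle *Span, outside []Span) {
	if containsKey(desc, span.Key) {
		return &span, nil
	}
	return nil, append(outside, span)
}

func main() {
	desc := RangeDesc{StartKey: []byte("a"), EndKey: []byte("m")}
	local := Span{Key: []byte("c")} // point span inside [a, m)

	_, outOld := intersectPointOld(local, desc)
	midNew, outNew := intersectPointNew(local, desc)
	fmt.Println("old: outside spans =", len(outOld))
	fmt.Println("new: local =", midNew != nil, "outside spans =", len(outNew))
}
```

With the old branch, the local point span at key `c` is classified as external, which is what would trigger the unnecessary async resolution work; with the new branch it is classified as local and nothing is left over.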
Hi @nvanbenschoten, please add branch-* labels to identify which branch(es) this release-blocker affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Fixes cockroachdb#98571.

This commit fixes the regression detected in cockroachdb#98571. In that issue, we saw that the bug fix in 86a5852 (cockroachdb#98044) caused a regression in kv0 (and other benchmarks). This was due to a bug in `kvserverbase.IntersectSpan`, which was considering local point spans to be external to the provided range span.

This commit fixes the bug by not calling `kvserverbase.IntersectSpan` for point lock spans.

The commit also makes the utility panic instead of silently returning incorrect results. There's an existing TODO on the utility to generalize it. For now, we just make it harder to misuse.

Finally, we add a test that asserts against the presence of async intent resolution after one-phase commits when external intent resolution is not needed.

```
name                            old time/op    new time/op    delta
KV/Insert/Native/rows=1-10        61.2µs ± 3%    48.9µs ± 3%  -20.10%  (p=0.000 n=8+9)
KV/Insert/Native/rows=10-10       93.3µs ±15%    76.2µs ± 3%  -18.34%  (p=0.000 n=9+9)
KV/Insert/Native/rows=1000-10     2.84ms ±12%    2.42ms ± 4%  -14.97%  (p=0.000 n=9+9)
KV/Insert/Native/rows=100-10       365µs ± 5%     320µs ± 8%  -12.40%  (p=0.000 n=10+9)
KV/Insert/Native/rows=10000-10    27.6ms ± 6%    24.4ms ± 3%  -11.53%  (p=0.000 n=9+9)

name                            old alloc/op   new alloc/op   delta
KV/Insert/Native/rows=1000-10     4.66MB ± 1%    2.76MB ± 1%  -40.89%  (p=0.000 n=9+9)
KV/Insert/Native/rows=100-10       478kB ± 1%     287kB ± 1%  -39.90%  (p=0.000 n=10+10)
KV/Insert/Native/rows=10000-10    54.2MB ± 2%    34.3MB ± 3%  -36.73%  (p=0.000 n=10+10)
KV/Insert/Native/rows=10-10       64.2kB ± 1%    42.1kB ± 1%  -34.39%  (p=0.000 n=10+9)
KV/Insert/Native/rows=1-10        22.1kB ± 1%    17.3kB ± 1%  -21.56%  (p=0.000 n=9+10)

name                            old allocs/op  new allocs/op  delta
KV/Insert/Native/rows=1000-10      21.5k ± 0%     14.7k ± 0%  -31.70%  (p=0.000 n=8+9)
KV/Insert/Native/rows=10000-10      212k ± 0%      146k ± 0%  -31.31%  (p=0.000 n=9+10)
KV/Insert/Native/rows=100-10       2.34k ± 1%     1.61k ± 0%  -31.31%  (p=0.000 n=10+10)
KV/Insert/Native/rows=10-10          392 ± 1%       276 ± 0%  -29.59%  (p=0.000 n=8+8)
KV/Insert/Native/rows=1-10           173 ± 1%       123 ± 0%  -29.04%  (p=0.000 n=9+8)
```

Release note: None
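The approach described in the commit message (handle point lock spans directly, and make the span-splitting utility refuse point spans) could look roughly like the sketch below. This is not the code from the commit: `Span`, `RangeDesc`, `intersectRanged`, and `externalLockSpans` are hypothetical, simplified names using plain byte-slice keys, and the real implementation may be structured quite differently.

```go
package main

import (
	"bytes"
	"fmt"
)

// Span and RangeDesc are simplified stand-ins (assumptions for this sketch).
// An empty EndKey marks a point span; a range covers [StartKey, EndKey).
type Span struct{ Key, EndKey []byte }
type RangeDesc struct{ StartKey, EndKey []byte }

func (s Span) isPoint() bool { return len(s.EndKey) == 0 }

func containsKey(d RangeDesc, key []byte) bool {
	return bytes.Compare(key, d.StartKey) >= 0 && bytes.Compare(key, d.EndKey) < 0
}

// intersectRanged splits a ranged span into the part inside desc (middle)
// and the parts outside it. It panics on point spans so that misuse is
// loud instead of silently producing a wrong classification.
func intersectRanged(span Span, desc RangeDesc) (middle *Span, outside []Span) {
	if span.isPoint() {
		panic("intersectRanged called with a point span")
	}
	start, end := desc.StartKey, desc.EndKey
	if bytes.Compare(span.EndKey, start) <= 0 || bytes.Compare(span.Key, end) >= 0 {
		return nil, []Span{span} // no overlap with the range at all
	}
	mid := span
	if bytes.Compare(span.Key, start) < 0 {
		outside = append(outside, Span{Key: span.Key, EndKey: start})
		mid.Key = start
	}
	if bytes.Compare(span.EndKey, end) > 0 {
		outside = append(outside, Span{Key: end, EndKey: span.EndKey})
		mid.EndKey = end
	}
	return &mid, outside
}

// externalLockSpans returns the lock spans (or portions of them) that fall
// outside desc. Point spans are checked with containsKey and never passed
// to intersectRanged.
func externalLockSpans(locks []Span, desc RangeDesc) []Span {
	var external []Span
	for _, l := range locks {
		if l.isPoint() {
			if !containsKey(desc, l.Key) {
				external = append(external, l)
			}
			continue
		}
		_, outside := intersectRanged(l, desc)
		external = append(external, outside...)
	}
	return external
}

func main() {
	desc := RangeDesc{StartKey: []byte("a"), EndKey: []byte("m")}
	locks := []Span{{Key: []byte("b")}, {Key: []byte("c"), EndKey: []byte("f")}}
	// All locks are local, so nothing should need external resolution.
	fmt.Println("external spans:", len(externalLockSpans(locks, desc)))
}
```

A check in the spirit of the test mentioned above would assert that `externalLockSpans` returns an empty slice whenever every lock span is local to the range, so no async intent resolution would be scheduled in that case.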
98544: colmem: allow Allocator max batch size to be customized r=cucaroach a=cucaroach

Previously this was hardcoded to coldata.BatchSize or 1024; now it can be increased or decreased.

Epic: CRDB-18892
Informs: #91831
Release note: None

98630: kv: don't perform async intent resolution on 1PC with point lock spans r=arulajmani a=nvanbenschoten

Fixes #98571.

This commit fixes the regression detected in #98571. In that issue, we saw that the bug fix in 86a5852 (#98044) caused a regression in kv0 (and other benchmarks). This was due to a bug in `kvserverbase.IntersectSpan`, which was considering local point spans to be external to the provided range span.

This commit fixes the bug by not calling `kvserverbase.IntersectSpan` for point lock spans.

The commit also makes the utility panic instead of silently returning incorrect results. There's an existing TODO on the utility to generalize it. For now, we just make it harder to misuse.

Finally, we add a test that asserts against the presence of async intent resolution after one-phase commits when external intent resolution is not needed.

Release note: None

Co-authored-by: Tommy Reilly
Co-authored-by: Nathan VanBenschoten
Significant drop on or around March 10th.
Affects multiple kv0 and kv95 benchmarks.
Jira issue: CRDB-25340