Check bucket series time range when decoding chunks from index #3367

yeya24 · 2020-10-27T03:51:58Z

I added CHANGELOG entry for this change.
Change is not relevant to the end user.

Changes

This idea behind this pr is to check the chunk time range when decoding the series from the index. This change can benefit a lot the skipChunk queries because it can immediately return if it finds a valid chunk instead of decoding all the chunks in this series.

Verification

Uses the benchmarks from #3366. Benchmark and compare between this pr and current master.

SkipChunk query.

It is expected that this type of query should be much faster than before.

➜  thanos git:(improve-index-valid) ✗ benchcmp old new9
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                                                           old ns/op      new ns/op      delta
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000-8              374105769      304828968      -18.52%
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000-8             366452232      287316200      -21.60%
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000-8        1467578057     1077410930     -26.59%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000-8            31472869       22883196       -27.29%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000-8          31160010       23195988       -25.56%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000-8     134128144      108503863      -19.10%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000-8            2564966        99784          -96.11%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000-8          2799881        125497         -95.52%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000-8     29997992       373188         -98.76%

Normal series query

The expected performance of this should be almost the same or a little bit slower than before. However, this pr is about 10% - 20% slower when there are a lot of samples in one series. Not sure how to improve this case. I think there is no extra overhead except we are using more if-else statements than before when decoding the index.

➜  thanos git:(improve-index-valid) ✗ benchcmp b-old c-13
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                                                 old ns/op      new ns/op      delta
BenchmarkBucketSeries/1000000SeriesWith1Samples/1of1000000-8              317048216      274136169      -13.53%
BenchmarkBucketSeries/1000000SeriesWith1Samples/10of1000000-8             316798190      278900447      -11.96%
BenchmarkBucketSeries/1000000SeriesWith1Samples/1000000of1000000-8        1535423045     1519882954     -1.01%
BenchmarkBucketSeries/100000SeriesWith100Samples/1of10000000-8            25975122       23109618       -11.03%
BenchmarkBucketSeries/100000SeriesWith100Samples/100of10000000-8          25753028       23400939       -9.13%
BenchmarkBucketSeries/100000SeriesWith100Samples/10000000of10000000-8     138522971      137316413      -0.87%
BenchmarkBucketSeries/1SeriesWith10000000Samples/1of10000000-8            2516395        129818         -94.84%
BenchmarkBucketSeries/1SeriesWith10000000Samples/100of10000000-8          2500083        152185         -93.91%
BenchmarkBucketSeries/1SeriesWith10000000Samples/10000000of10000000-8     33321908       36184470       +8.59%

bwplotka · 2020-10-28T13:28:28Z

Running benchmarks is tricky but there are ways to make it reliable and more trustworthy.

If you run Goland ID you have benchtime only 1s, so this one thing to improve is to increase this time. You can to that by edit config and setting extra program -test.benchtime=10s
The other must have flag is to either set benchmem or specify in the code b.ReportAllocs()
Useful thing is to limit to 1 or 2 CPUs, because that's the usual value we give for Store GW.

bwplotka · 2020-10-28T13:38:54Z

Anyway good work I think it makes sense, wonder if we can clean decodeWithReq a bit more 🤗

Some reference for benchmarks: https://gist.github.com/bwplotka/3b853c31ed11e77c975b9df45d105d74

yeya24 · 2020-10-28T19:47:35Z

I finished the benchmarks and the new results between this pr and the current master are below. Seems it is not very good when there are not many samples per series.

SkipChunks

name                                                                  old time/op    new time/op    delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000              423ms ± 0%     442ms ± 0%   +4.32%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000             496ms ± 0%     458ms ± 0%   -7.56%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        3.39s ± 0%     3.31s ± 0%   -2.29%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           38.6ms ± 0%    36.6ms ± 0%   -5.18%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    42.3ms ± 0%  +17.31%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000     276ms ± 0%     278ms ± 0%   +0.76%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.53ms ± 0%    0.13ms ± 0%  -97.06%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.13ms ± 0%    0.13ms ± 0%  -96.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    16.5ms ± 0%     0.4ms ± 0%  -97.48%

name                                                                  old alloc/op   new alloc/op   delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000             98.2MB ± 0%    98.2MB ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000            98.2MB ± 0%    98.2MB ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000       1.00GB ± 0%    1.00GB ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           8.43MB ± 0%    8.43MB ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         8.43MB ± 0%    8.43MB ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000    95.9MB ± 0%    95.9MB ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    17.3MB ± 0%     0.7MB ± 0%  -96.06%

name                                                                  old allocs/op  new allocs/op  delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000               760k ± 0%      760k ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000              760k ± 0%      760k ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        8.04M ± 0%     8.04M ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000            76.1k ± 0%     76.1k ± 0%    0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000          76.1k ± 0%     76.1k ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000      804k ± 0%      804k ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000              195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000            195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000       543 ± 0%       445 ± 0%  -18.05%

Normal query

name                                                        old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              451ms ± 0%     432ms ± 0%   -4.07%
BucketSeries/1000000SeriesWith1Samples/10of1000000             450ms ± 0%     450ms ± 0%   -0.09%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        3.82s ± 0%     4.27s ± 0%  +11.66%
BucketSeries/100000SeriesWith100Samples/1of10000000           36.7ms ± 0%    36.2ms ± 0%   -1.34%
BucketSeries/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    38.0ms ± 0%   +5.44%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     321ms ± 0%     359ms ± 0%  +12.04%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.07ms ± 0%    0.17ms ± 0%  -96.56%
BucketSeries/1SeriesWith10000000Samples/100of10000000         4.80ms ± 0%    0.17ms ± 0%  -96.51%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    77.6ms ± 0%    81.6ms ± 0%   +5.08%

name                                                        old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              117MB ± 0%      98MB ± 0%  -16.35%
BucketSeries/1000000SeriesWith1Samples/10of1000000             117MB ± 0%      98MB ± 0%  -16.35%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000       1.31GB ± 0%    1.31GB ± 0%   +0.18%
BucketSeries/100000SeriesWith100Samples/1of10000000           10.4MB ± 0%     8.5MB ± 0%  -18.46%
BucketSeries/100000SeriesWith100Samples/100of10000000         10.4MB ± 0%     8.4MB ± 0%  -18.55%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     126MB ± 0%     126MB ± 0%   +0.38%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/100of10000000         5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    38.0MB ± 0%    38.0MB ± 0%   -0.00%

name                                                        old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              1.26M ± 0%     0.76M ± 0%  -39.69%
BucketSeries/1000000SeriesWith1Samples/10of1000000             1.26M ± 0%     0.76M ± 0%  -39.69%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        12.1M ± 0%     12.1M ± 0%   -0.00%
BucketSeries/100000SeriesWith100Samples/1of10000000             126k ± 0%       76k ± 0%  -39.65%
BucketSeries/100000SeriesWith100Samples/100of10000000           126k ± 0%       76k ± 0%  -39.64%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     1.21M ± 0%     1.21M ± 0%   -0.00%
BucketSeries/1SeriesWith10000000Samples/1of10000000              232 ± 0%       205 ± 0%  -11.64%
BucketSeries/1SeriesWith10000000Samples/100of10000000            231 ± 0%       205 ± 0%  -11.26%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000      170k ± 0%      170k ± 0%   -0.00%

yeya24 · 2020-10-28T23:00:40Z

After one more optimization, I think the performance is good now. That's the result:

SkipChunks

name                                                                  old time/op    new time/op    delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000              423ms ± 0%     343ms ± 0%  -19.11%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000             496ms ± 0%     345ms ± 0%  -30.38%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        3.39s ± 0%     2.85s ± 0%  -15.84%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           38.6ms ± 0%    26.6ms ± 0%  -31.15%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    26.8ms ± 0%  -25.80%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000     276ms ± 0%     257ms ± 0%   -6.70%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.53ms ± 0%    0.12ms ± 0%  -97.27%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.13ms ± 0%    0.12ms ± 0%  -96.98%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    16.5ms ± 0%     0.4ms ± 0%  -97.48%

name                                                                  old alloc/op   new alloc/op   delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000             98.2MB ± 0%    74.2MB ± 0%  -24.44%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000            98.2MB ± 0%    74.2MB ± 0%  -24.44%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000       1.00GB ± 0%    1.00GB ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           8.43MB ± 0%    6.03MB ± 0%  -28.48%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         8.43MB ± 0%    6.03MB ± 0%  -28.47%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000    95.9MB ± 0%    95.9MB ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    17.3MB ± 0%     0.7MB ± 0%  -96.06%

name                                                                  old allocs/op  new allocs/op  delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000               760k ± 0%      510k ± 0%  -32.91%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000              760k ± 0%      510k ± 0%  -32.91%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        8.04M ± 0%     8.04M ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000            76.1k ± 0%     51.1k ± 0%  -32.86%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000          76.1k ± 0%     51.1k ± 0%  -32.85%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000      804k ± 0%      804k ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000              195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000            195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000       543 ± 0%       445 ± 0%  -18.05%

Normal series query

name                                                        old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              451ms ± 0%     340ms ± 0%  -24.52%
BucketSeries/1000000SeriesWith1Samples/10of1000000             450ms ± 0%     345ms ± 0%  -23.46%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        3.82s ± 0%     3.85s ± 0%   +0.60%
BucketSeries/100000SeriesWith100Samples/1of10000000           36.7ms ± 0%    26.8ms ± 0%  -26.94%
BucketSeries/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    26.4ms ± 0%  -26.70%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     321ms ± 0%     314ms ± 0%   -2.23%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.07ms ± 0%    0.15ms ± 0%  -97.03%
BucketSeries/1SeriesWith10000000Samples/100of10000000         4.80ms ± 0%    0.15ms ± 0%  -96.87%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    77.6ms ± 0%    70.5ms ± 0%   -9.23%

name                                                        old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              117MB ± 0%      74MB ± 0%  -36.79%
BucketSeries/1000000SeriesWith1Samples/10of1000000             117MB ± 0%      74MB ± 0%  -36.80%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000       1.31GB ± 0%    1.31GB ± 0%   +0.53%
BucketSeries/100000SeriesWith100Samples/1of10000000           10.4MB ± 0%     6.1MB ± 0%  -41.54%
BucketSeries/100000SeriesWith100Samples/100of10000000         10.4MB ± 0%     6.0MB ± 0%  -41.73%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     126MB ± 0%     126MB ± 0%   +0.07%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/100of10000000         5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    38.0MB ± 0%    38.0MB ± 0%   +0.00%

name                                                        old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              1.26M ± 0%     0.51M ± 0%  -59.54%
BucketSeries/1000000SeriesWith1Samples/10of1000000             1.26M ± 0%     0.51M ± 0%  -59.53%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        12.1M ± 0%     12.1M ± 0%   +0.00%
BucketSeries/100000SeriesWith100Samples/1of10000000             126k ± 0%       51k ± 0%  -59.48%
BucketSeries/100000SeriesWith100Samples/100of10000000           126k ± 0%       51k ± 0%  -59.46%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     1.21M ± 0%     1.21M ± 0%   +0.00%
BucketSeries/1SeriesWith10000000Samples/1of10000000              232 ± 0%       205 ± 0%  -11.64%
BucketSeries/1SeriesWith10000000Samples/100of10000000            231 ± 0%       205 ± 0%  -11.26%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000      170k ± 0%      170k ± 0%   +0.00%

Signed-off-by: Ben Ye <[email protected]>

bwplotka

Amazing work! LGTM 💪

pracucci · 2020-11-12T15:47:27Z

pkg/store/bucket.go

+		ref0 += d.Varint64()
+		t0 = maxt


This should be done before if maxt < req.MinTime, otherwise the next for iteration will read invalid data. This PR #3440 fixes it.

…s-io#3367) * check bucket series time range when decoding chunks from index Signed-off-by: Ben Ye <[email protected]> * reduce alloc Signed-off-by: Ben Ye <[email protected]> Signed-off-by: Oghenebrume50 <[email protected]>

yeya24 requested review from kakkoyun and bwplotka October 27, 2020 03:53

yeya24 marked this pull request as ready for review October 28, 2020 23:03

yeya24 force-pushed the improve-index-valid branch from c0b6a99 to 2f0ee7d Compare October 29, 2020 15:36

yeya24 added 2 commits October 29, 2020 19:28

check bucket series time range when decoding chunks from index

242f40e

Signed-off-by: Ben Ye <[email protected]>

reduce alloc

79f0eff

Signed-off-by: Ben Ye <[email protected]>

yeya24 force-pushed the improve-index-valid branch from 2f0ee7d to 79f0eff Compare October 29, 2020 23:28

yeya24 requested a review from pracucci October 30, 2020 13:38

bwplotka approved these changes Nov 4, 2020

View reviewed changes

bwplotka merged commit 262efad into thanos-io:master Nov 4, 2020

pracucci mentioned this pull request Nov 12, 2020

Fix partial query results #3440

Merged

2 tasks

pracucci reviewed Nov 12, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Check bucket series time range when decoding chunks from index #3367

Check bucket series time range when decoding chunks from index #3367

yeya24 commented Oct 27, 2020 •

edited

Loading

bwplotka commented Oct 28, 2020

bwplotka commented Oct 28, 2020

yeya24 commented Oct 28, 2020 •

edited

Loading

yeya24 commented Oct 28, 2020

bwplotka left a comment

pracucci Nov 12, 2020

Check bucket series time range when decoding chunks from index #3367

Check bucket series time range when decoding chunks from index #3367

Conversation

yeya24 commented Oct 27, 2020 • edited Loading

Changes

bwplotka commented Oct 28, 2020

bwplotka commented Oct 28, 2020

yeya24 commented Oct 28, 2020 • edited Loading

yeya24 commented Oct 28, 2020

bwplotka left a comment

Choose a reason for hiding this comment

pracucci Nov 12, 2020

Choose a reason for hiding this comment

yeya24 commented Oct 27, 2020 •

edited

Loading

yeya24 commented Oct 28, 2020 •

edited

Loading