Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check bucket series time range when decoding chunks from index #3367

Merged
merged 2 commits into from
Nov 4, 2020

Conversation

yeya24
Copy link
Contributor

@yeya24 yeya24 commented Oct 27, 2020

Signed-off-by: Ben Ye [email protected]

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

This idea behind this pr is to check the chunk time range when decoding the series from the index. This change can benefit a lot the skipChunk queries because it can immediately return if it finds a valid chunk instead of decoding all the chunks in this series.

Verification

Uses the benchmarks from #3366. Benchmark and compare between this pr and current master.

  1. SkipChunk query.

It is expected that this type of query should be much faster than before.

➜  thanos git:(improve-index-valid) ✗ benchcmp old new9
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                                                           old ns/op      new ns/op      delta
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000-8              374105769      304828968      -18.52%
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000-8             366452232      287316200      -21.60%
BenchmarkBucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000-8        1467578057     1077410930     -26.59%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000-8            31472869       22883196       -27.29%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000-8          31160010       23195988       -25.56%
BenchmarkBucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000-8     134128144      108503863      -19.10%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000-8            2564966        99784          -96.11%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000-8          2799881        125497         -95.52%
BenchmarkBucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000-8     29997992       373188         -98.76%
  1. Normal series query

The expected performance of this should be almost the same or a little bit slower than before. However, this pr is about 10% - 20% slower when there are a lot of samples in one series. Not sure how to improve this case. I think there is no extra overhead except we are using more if-else statements than before when decoding the index.

➜  thanos git:(improve-index-valid) ✗ benchcmp b-old c-13
benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
benchmark                                                                 old ns/op      new ns/op      delta
BenchmarkBucketSeries/1000000SeriesWith1Samples/1of1000000-8              317048216      274136169      -13.53%
BenchmarkBucketSeries/1000000SeriesWith1Samples/10of1000000-8             316798190      278900447      -11.96%
BenchmarkBucketSeries/1000000SeriesWith1Samples/1000000of1000000-8        1535423045     1519882954     -1.01%
BenchmarkBucketSeries/100000SeriesWith100Samples/1of10000000-8            25975122       23109618       -11.03%
BenchmarkBucketSeries/100000SeriesWith100Samples/100of10000000-8          25753028       23400939       -9.13%
BenchmarkBucketSeries/100000SeriesWith100Samples/10000000of10000000-8     138522971      137316413      -0.87%
BenchmarkBucketSeries/1SeriesWith10000000Samples/1of10000000-8            2516395        129818         -94.84%
BenchmarkBucketSeries/1SeriesWith10000000Samples/100of10000000-8          2500083        152185         -93.91%
BenchmarkBucketSeries/1SeriesWith10000000Samples/10000000of10000000-8     33321908       36184470       +8.59%

@bwplotka
Copy link
Member

Running benchmarks is tricky but there are ways to make it reliable and more trustworthy.

  • If you run Goland ID you have benchtime only 1s, so this one thing to improve is to increase this time. You can to that by edit config and setting extra program -test.benchtime=10s
  • The other must have flag is to either set benchmem or specify in the code b.ReportAllocs()
  • Useful thing is to limit to 1 or 2 CPUs, because that's the usual value we give for Store GW.

@bwplotka
Copy link
Member

Anyway good work I think it makes sense, wonder if we can clean decodeWithReq a bit more 🤗

Some reference for benchmarks: https://gist.github.com/bwplotka/3b853c31ed11e77c975b9df45d105d74

@yeya24
Copy link
Contributor Author

yeya24 commented Oct 28, 2020

I finished the benchmarks and the new results between this pr and the current master are below. Seems it is not very good when there are not many samples per series.

  1. SkipChunks
name                                                                  old time/op    new time/op    delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000              423ms ± 0%     442ms ± 0%   +4.32%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000             496ms ± 0%     458ms ± 0%   -7.56%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        3.39s ± 0%     3.31s ± 0%   -2.29%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           38.6ms ± 0%    36.6ms ± 0%   -5.18%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    42.3ms ± 0%  +17.31%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000     276ms ± 0%     278ms ± 0%   +0.76%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.53ms ± 0%    0.13ms ± 0%  -97.06%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.13ms ± 0%    0.13ms ± 0%  -96.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    16.5ms ± 0%     0.4ms ± 0%  -97.48%

name                                                                  old alloc/op   new alloc/op   delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000             98.2MB ± 0%    98.2MB ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000            98.2MB ± 0%    98.2MB ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000       1.00GB ± 0%    1.00GB ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           8.43MB ± 0%    8.43MB ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         8.43MB ± 0%    8.43MB ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000    95.9MB ± 0%    95.9MB ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    17.3MB ± 0%     0.7MB ± 0%  -96.06%

name                                                                  old allocs/op  new allocs/op  delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000               760k ± 0%      760k ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000              760k ± 0%      760k ± 0%   -0.00%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        8.04M ± 0%     8.04M ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000            76.1k ± 0%     76.1k ± 0%    0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000          76.1k ± 0%     76.1k ± 0%   -0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000      804k ± 0%      804k ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000              195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000            195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000       543 ± 0%       445 ± 0%  -18.05%
  1. Normal query
name                                                        old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              451ms ± 0%     432ms ± 0%   -4.07%
BucketSeries/1000000SeriesWith1Samples/10of1000000             450ms ± 0%     450ms ± 0%   -0.09%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        3.82s ± 0%     4.27s ± 0%  +11.66%
BucketSeries/100000SeriesWith100Samples/1of10000000           36.7ms ± 0%    36.2ms ± 0%   -1.34%
BucketSeries/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    38.0ms ± 0%   +5.44%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     321ms ± 0%     359ms ± 0%  +12.04%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.07ms ± 0%    0.17ms ± 0%  -96.56%
BucketSeries/1SeriesWith10000000Samples/100of10000000         4.80ms ± 0%    0.17ms ± 0%  -96.51%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    77.6ms ± 0%    81.6ms ± 0%   +5.08%

name                                                        old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              117MB ± 0%      98MB ± 0%  -16.35%
BucketSeries/1000000SeriesWith1Samples/10of1000000             117MB ± 0%      98MB ± 0%  -16.35%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000       1.31GB ± 0%    1.31GB ± 0%   +0.18%
BucketSeries/100000SeriesWith100Samples/1of10000000           10.4MB ± 0%     8.5MB ± 0%  -18.46%
BucketSeries/100000SeriesWith100Samples/100of10000000         10.4MB ± 0%     8.4MB ± 0%  -18.55%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     126MB ± 0%     126MB ± 0%   +0.38%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/100of10000000         5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    38.0MB ± 0%    38.0MB ± 0%   -0.00%

name                                                        old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              1.26M ± 0%     0.76M ± 0%  -39.69%
BucketSeries/1000000SeriesWith1Samples/10of1000000             1.26M ± 0%     0.76M ± 0%  -39.69%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        12.1M ± 0%     12.1M ± 0%   -0.00%
BucketSeries/100000SeriesWith100Samples/1of10000000             126k ± 0%       76k ± 0%  -39.65%
BucketSeries/100000SeriesWith100Samples/100of10000000           126k ± 0%       76k ± 0%  -39.64%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     1.21M ± 0%     1.21M ± 0%   -0.00%
BucketSeries/1SeriesWith10000000Samples/1of10000000              232 ± 0%       205 ± 0%  -11.64%
BucketSeries/1SeriesWith10000000Samples/100of10000000            231 ± 0%       205 ± 0%  -11.26%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000      170k ± 0%      170k ± 0%   -0.00%

@yeya24
Copy link
Contributor Author

yeya24 commented Oct 28, 2020

After one more optimization, I think the performance is good now. That's the result:

  1. SkipChunks
name                                                                  old time/op    new time/op    delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000              423ms ± 0%     343ms ± 0%  -19.11%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000             496ms ± 0%     345ms ± 0%  -30.38%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        3.39s ± 0%     2.85s ± 0%  -15.84%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           38.6ms ± 0%    26.6ms ± 0%  -31.15%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    26.8ms ± 0%  -25.80%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000     276ms ± 0%     257ms ± 0%   -6.70%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.53ms ± 0%    0.12ms ± 0%  -97.27%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.13ms ± 0%    0.12ms ± 0%  -96.98%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    16.5ms ± 0%     0.4ms ± 0%  -97.48%

name                                                                  old alloc/op   new alloc/op   delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000             98.2MB ± 0%    74.2MB ± 0%  -24.44%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000            98.2MB ± 0%    74.2MB ± 0%  -24.44%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000       1.00GB ± 0%    1.00GB ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000           8.43MB ± 0%    6.03MB ± 0%  -28.48%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000         8.43MB ± 0%    6.03MB ± 0%  -28.47%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000    95.9MB ± 0%    95.9MB ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000           4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000         4.34MB ± 0%    0.17MB ± 0%  -96.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000    17.3MB ± 0%     0.7MB ± 0%  -96.06%

name                                                                  old allocs/op  new allocs/op  delta
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1of1000000               760k ± 0%      510k ± 0%  -32.91%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/10of1000000              760k ± 0%      510k ± 0%  -32.91%
BucketSeriesSkipChunks/1000000SeriesWith1Samples/1000000of1000000        8.04M ± 0%     8.04M ± 0%   +0.00%
BucketSeriesSkipChunks/100000SeriesWith100Samples/1of10000000            76.1k ± 0%     51.1k ± 0%  -32.86%
BucketSeriesSkipChunks/100000SeriesWith100Samples/100of10000000          76.1k ± 0%     51.1k ± 0%  -32.85%
BucketSeriesSkipChunks/100000SeriesWith100Samples/10000000of10000000      804k ± 0%      804k ± 0%   -0.00%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/1of10000000              195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/100of10000000            195 ± 0%       172 ± 0%  -11.79%
BucketSeriesSkipChunks/1SeriesWith10000000Samples/10000000of10000000       543 ± 0%       445 ± 0%  -18.05%
  1. Normal series query
name                                                        old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              451ms ± 0%     340ms ± 0%  -24.52%
BucketSeries/1000000SeriesWith1Samples/10of1000000             450ms ± 0%     345ms ± 0%  -23.46%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        3.82s ± 0%     3.85s ± 0%   +0.60%
BucketSeries/100000SeriesWith100Samples/1of10000000           36.7ms ± 0%    26.8ms ± 0%  -26.94%
BucketSeries/100000SeriesWith100Samples/100of10000000         36.1ms ± 0%    26.4ms ± 0%  -26.70%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     321ms ± 0%     314ms ± 0%   -2.23%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.07ms ± 0%    0.15ms ± 0%  -97.03%
BucketSeries/1SeriesWith10000000Samples/100of10000000         4.80ms ± 0%    0.15ms ± 0%  -96.87%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    77.6ms ± 0%    70.5ms ± 0%   -9.23%

name                                                        old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              117MB ± 0%      74MB ± 0%  -36.79%
BucketSeries/1000000SeriesWith1Samples/10of1000000             117MB ± 0%      74MB ± 0%  -36.80%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000       1.31GB ± 0%    1.31GB ± 0%   +0.53%
BucketSeries/100000SeriesWith100Samples/1of10000000           10.4MB ± 0%     6.1MB ± 0%  -41.54%
BucketSeries/100000SeriesWith100Samples/100of10000000         10.4MB ± 0%     6.0MB ± 0%  -41.73%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     126MB ± 0%     126MB ± 0%   +0.07%
BucketSeries/1SeriesWith10000000Samples/1of10000000           5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/100of10000000         5.90MB ± 0%    0.22MB ± 0%  -96.19%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    38.0MB ± 0%    38.0MB ± 0%   +0.00%

name                                                        old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              1.26M ± 0%     0.51M ± 0%  -59.54%
BucketSeries/1000000SeriesWith1Samples/10of1000000             1.26M ± 0%     0.51M ± 0%  -59.53%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        12.1M ± 0%     12.1M ± 0%   +0.00%
BucketSeries/100000SeriesWith100Samples/1of10000000             126k ± 0%       51k ± 0%  -59.48%
BucketSeries/100000SeriesWith100Samples/100of10000000           126k ± 0%       51k ± 0%  -59.46%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     1.21M ± 0%     1.21M ± 0%   +0.00%
BucketSeries/1SeriesWith10000000Samples/1of10000000              232 ± 0%       205 ± 0%  -11.64%
BucketSeries/1SeriesWith10000000Samples/100of10000000            231 ± 0%       205 ± 0%  -11.26%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000      170k ± 0%      170k ± 0%   +0.00%

@yeya24 yeya24 marked this pull request as ready for review October 28, 2020 23:03
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work! LGTM 💪

@bwplotka bwplotka merged commit 262efad into thanos-io:master Nov 4, 2020
@pracucci pracucci mentioned this pull request Nov 12, 2020
2 tasks
Comment on lines +2082 to +2083
ref0 += d.Varint64()
t0 = maxt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done before if maxt < req.MinTime, otherwise the next for iteration will read invalid data. This PR #3440 fixes it.

Oghenebrume50 pushed a commit to Oghenebrume50/thanos that referenced this pull request Dec 7, 2020
…s-io#3367)

* check bucket series time range when decoding chunks from index

Signed-off-by: Ben Ye <[email protected]>

* reduce alloc

Signed-off-by: Ben Ye <[email protected]>
Signed-off-by: Oghenebrume50 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants