Synchronous iterator #2165

Merged: 24 commits, Apr 25, 2023

Contributor

@mdisibio mdisibio commented Mar 3, 2023

What this PR does:
This PR creates a new synchronous version of the ColumnIterator. Current tests pass and performance looks better, but it needs a bit more polish and testing, including a run against the enhanced set of tests in #2119.
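For readers unfamiliar with the distinction, here is a minimal Go sketch of the two iterator styles. It is not the code in this PR: `asyncIter`, `syncIter`, `drain`, and the `[]int` data source are hypothetical stand-ins, and it assumes the existing iterator hands values to the caller through a channel fed by a goroutine, while the synchronous one does its work directly inside `Next`.

```go
package main

import "fmt"

// asyncIter is a hypothetical goroutine-backed iterator: a producer
// goroutine pushes values into a channel and Next receives them.
type asyncIter struct {
	ch   chan int
	quit chan struct{}
}

func newAsyncIter(values []int) *asyncIter {
	it := &asyncIter{ch: make(chan int, 4), quit: make(chan struct{})}
	go func() {
		defer close(it.ch)
		for _, v := range values {
			select {
			case it.ch <- v:
			case <-it.quit:
				return
			}
		}
	}()
	return it
}

// Next blocks on the channel; ok is false once the producer is done.
func (it *asyncIter) Next() (int, bool) {
	v, ok := <-it.ch
	return v, ok
}

// Close stops the producer goroutine. Call it exactly once.
func (it *asyncIter) Close() { close(it.quit) }

// syncIter is a hypothetical pull-based iterator: all work happens
// on the caller's goroutine, with no channel hand-off.
type syncIter struct {
	values []int
	pos    int
}

func newSyncIter(values []int) *syncIter { return &syncIter{values: values} }

func (it *syncIter) Next() (int, bool) {
	if it.pos >= len(it.values) {
		return 0, false
	}
	v := it.values[it.pos]
	it.pos++
	return v, true
}

// drain pulls from next until it reports no more values.
func drain(next func() (int, bool)) []int {
	var out []int
	for {
		v, ok := next()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	data := []int{1, 2, 3}
	a := newAsyncIter(data)
	defer a.Close()
	s := newSyncIter(data)
	fmt.Println(drain(a.Next), drain(s.Next))
}
```

Both produce the same values; the difference is that the synchronous version has no goroutine scheduling or channel synchronization between the reader and the caller.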

name                                 old time/op    new time/op    delta
BackendBlockTraceQL/noMatch-12         33.8ms ± 2%    31.6ms ± 2%   -6.37%  (p=0.000 n=9+10)
BackendBlockTraceQL/partialMatch-12    53.4ms ±14%    17.0ms ± 1%  -68.19%  (p=0.000 n=10+9)
BackendBlockTraceQL/service.name-12    2.48ms ± 1%    2.43ms ± 3%   -1.94%  (p=0.001 n=10+8)

name                                 old speed      new speed      delta
BackendBlockTraceQL/noMatch-12        505MB/s ± 2%   539MB/s ± 2%   +6.81%  (p=0.000 n=9+10)
BackendBlockTraceQL/partialMatch-12   462MB/s ±13%   667MB/s ± 1%  +44.39%  (p=0.000 n=10+9)
BackendBlockTraceQL/service.name-12   293MB/s ± 1%   299MB/s ± 3%   +1.99%  (p=0.001 n=10+8)

name                                 old MB_io/op   new MB_io/op   delta
BackendBlockTraceQL/noMatch-12           17.0 ± 0%      17.0 ± 0%     ~     (all equal)
BackendBlockTraceQL/partialMatch-12      24.5 ± 0%      11.3 ± 0%  -53.85%  (p=0.000 n=10+10)
BackendBlockTraceQL/service.name-12      0.73 ± 0%      0.73 ± 0%     ~     (all equal)

name                                 old alloc/op   new alloc/op   delta
BackendBlockTraceQL/noMatch-12         2.25MB ± 4%    2.23MB ± 6%     ~     (p=0.400 n=9+10)
BackendBlockTraceQL/partialMatch-12    46.4MB ± 0%    23.4MB ± 1%  -49.52%  (p=0.000 n=10+10)
BackendBlockTraceQL/service.name-12     234kB ± 1%     221kB ± 2%   -5.78%  (p=0.000 n=10+10)

name                                 old allocs/op  new allocs/op  delta
BackendBlockTraceQL/noMatch-12          31.3k ± 0%     31.2k ± 0%   -0.12%  (p=0.000 n=10+8)
BackendBlockTraceQL/partialMatch-12     72.0k ± 0%     49.7k ± 0%  -30.94%  (p=0.000 n=10+10)
BackendBlockTraceQL/service.name-12     3.36k ± 0%     3.32k ± 0%   -1.01%  (p=0.000 n=10+10)
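As a reading aid for the delta column, the following Go sketch recomputes the relative change from the rounded values shown in the tables. `percentDelta` is a hypothetical helper, not part of benchstat. The results land within a few hundredths of a percent of the reported deltas; the small gap is presumably because benchstat works from the unrounded samples rather than the displayed means.

```go
package main

import "fmt"

// percentDelta returns the relative change from oldVal to newVal, in percent.
func percentDelta(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

func main() {
	// partialMatch time/op as displayed above: 53.4ms -> 17.0ms.
	fmt.Printf("time/op delta: %+.2f%%\n", percentDelta(53.4, 17.0))
	// partialMatch speed as displayed above: 462MB/s -> 667MB/s.
	fmt.Printf("speed delta:   %+.2f%%\n", percentDelta(462, 667))
}
```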

Which issue(s) this PR fixes:
n/a

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@knylander-grafana
Contributor

@mdisibio Will we need doc for this?

@joe-elliott
Member

Can we resurrect this PR? I keep seeing traces like this:

image

Issue here: #2336

This PR would eliminate the possibility that the large gaps seen here are due to synchronization. It would also hopefully allow for easier performance analysis by simplifying this bit of code.

@mdisibio
Contributor Author

> Can we resurrect this PR?

Yes. A couple updates:

  1. What do you think about putting this behind an environment variable like PARQUET_SYNC_ITERATOR=1? So we can safely merge and test, and also compare performance between pods.
  2. There is still a bug or two that surfaced when using this for the new metrics stuff on real blocks, which isn't caught by the current test suite. Hopefully a simple fix.
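The environment-variable gate proposed in item 1 could look roughly like the Go sketch below. This is an illustration only: the variable name is the one floated in this comment, and `useSyncIterator` and `iteratorKind` are hypothetical helpers, not functions from the PR. The lookup function is passed in so the decision can be tested without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// syncIteratorEnv is the variable name suggested in the thread; the name
// that was finally merged may differ.
const syncIteratorEnv = "PARQUET_SYNC_ITERATOR"

// useSyncIterator reports whether the experimental path is enabled.
// getenv is injected (normally os.Getenv) to keep the check testable.
func useSyncIterator(getenv func(string) string) bool {
	return getenv(syncIteratorEnv) == "1"
}

// iteratorKind is a hypothetical helper that names the chosen path.
func iteratorKind(getenv func(string) string) string {
	if useSyncIterator(getenv) {
		return "sync"
	}
	return "async"
}

func main() {
	fmt.Println("using", iteratorKind(os.Getenv), "iterator")
}
```

Because the default stays on the existing path, pods can opt in one at a time, which is what makes the side-by-side comparison possible.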

@joe-elliott
Member

> What do you think about putting this behind an environment variable like PARQUET_SYNC_ITERATOR=1? So we can safely merge and test, and also compare performance between pods.

no concerns. do what makes sense to you. is the PR that risky? or are the gains not very clear?

> There is still a bug or two that surfaced when using this for the new metrics stuff on real blocks, which isn't caught by the current test suite. Hopefully a simple fix.

👍

@mdisibio
Contributor Author

mdisibio commented Apr 19, 2023

> @mdisibio Will we need doc for this?

I don't think so. This is a performance experiment and if it pans out we will make it permanent. No user-facing changes.

> is the PR that risky? or are the gains not very clear?

A little of both but mostly the former. Concerned about correctness because although our test suite is fairly comprehensive, there are execution paths that are hard to cover synthetically (like jumping between pages, etc). The gains are shaping up better after the last batch of changes, but still not 100% beneficial.

Here are some benchmarks on the wider traceql test suite. Executed by VPARQUET_SYNC_ITERATOR=1 go test ...:

name                                                old time/op    new time/op     delta
BackendBlockTraceQL/spanAttNameNoMatch-12             13.5ms ± 4%     11.3ms ± 4%   -16.29%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValNoMatch-12              41.9ms ± 3%     38.9ms ± 1%    -7.05%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValMatch-12                 125ms ±11%       86ms ± 4%   -31.05%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-12        11.2ms ± 7%     10.0ms ±11%      ~     (p=0.056 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicMatch-12           566ms ± 8%      314ms ± 8%   -44.61%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttNameNoMatch-12         2.97ms ±15%     2.52ms ± 6%   -14.97%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValNoMatch-12          2.84ms ± 2%     2.42ms ± 2%   -14.88%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValMatch-12            2.86ms ± 2%     2.77ms ±20%      ~     (p=0.548 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicNoMatch-12    2.44ms ± 4%     2.38ms ± 4%      ~     (p=0.310 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicMatch-12       685ms ±33%      227ms ± 7%   -66.83%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedNameNoMatch-12                2.70s ± 6%      1.72s ± 1%   -36.21%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValNoMatch-12                 1.67s ±10%      1.32s ± 1%   -21.01%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchAnd-12          2.90ms ±35%     3.32ms ±42%      ~     (p=1.000 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchOr-12            1.97s ±11%      1.53s ± 3%   -22.55%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValBothMatch-12              2.38ms ±11%     2.44ms ± 8%      ~     (p=0.421 n=5+5)

name                                                old speed      new speed       delta
BackendBlockTraceQL/spanAttNameNoMatch-12            479MB/s ± 4%    572MB/s ± 3%   +19.43%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValNoMatch-12             297MB/s ± 3%    320MB/s ± 1%    +7.56%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValMatch-12              59.9MB/s ±10%   86.7MB/s ± 4%   +44.64%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-12       544MB/s ± 7%    606MB/s ±10%      ~     (p=0.056 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicMatch-12         147MB/s ± 8%    266MB/s ± 8%   +80.77%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttNameNoMatch-12        461MB/s ±14%    540MB/s ± 5%   +17.05%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValNoMatch-12         479MB/s ± 2%    562MB/s ± 2%   +17.47%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValMatch-12           476MB/s ± 2%    497MB/s ±18%      ~     (p=0.548 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicNoMatch-12   128MB/s ± 4%    131MB/s ± 4%      ~     (p=0.310 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicMatch-12     115MB/s ±27%    335MB/s ± 6%  +192.53%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedNameNoMatch-12             9.58MB/s ± 6%  14.99MB/s ± 1%   +56.48%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValNoMatch-12              15.5MB/s ± 9%   19.6MB/s ± 1%   +26.18%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchAnd-12         481MB/s ±28%    436MB/s ±34%      ~     (p=1.000 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchOr-12         47.0MB/s ±10%   60.5MB/s ± 3%   +28.75%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValBothMatch-12             132MB/s ±10%    128MB/s ± 8%      ~     (p=0.421 n=5+5)

name                                                old MB_io/op   new MB_io/op    delta
BackendBlockTraceQL/spanAttNameNoMatch-12               6.44 ± 0%       6.44 ± 0%      ~     (all equal)
BackendBlockTraceQL/spanAttValNoMatch-12                12.4 ± 0%       12.4 ± 0%      ~     (all equal)
BackendBlockTraceQL/spanAttValMatch-12                  7.45 ± 0%       7.45 ± 0%      ~     (all equal)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-12          6.06 ± 0%       6.06 ± 0%      ~     (all equal)
BackendBlockTraceQL/spanAttIntrinsicMatch-12            83.1 ± 0%       83.1 ± 0%      ~     (all equal)
BackendBlockTraceQL/resourceAttNameNoMatch-12           1.36 ± 0%       1.36 ± 0%      ~     (all equal)
BackendBlockTraceQL/resourceAttValNoMatch-12            1.36 ± 0%       1.36 ± 0%      ~     (all equal)
BackendBlockTraceQL/resourceAttValMatch-12              1.36 ± 0%       1.36 ± 0%      ~     (all equal)
BackendBlockTraceQL/resourceAttIntrinsicNoMatch-12      0.31 ± 0%       0.31 ± 0%      ~     (all equal)
BackendBlockTraceQL/resourceAttIntrinsicMatch-12        76.1 ± 0%       76.1 ± 0%      ~     (all equal)
BackendBlockTraceQL/mixedNameNoMatch-12                 25.8 ± 0%       25.8 ± 0%      ~     (all equal)
BackendBlockTraceQL/mixedValNoMatch-12                  25.8 ± 0%       25.8 ± 0%      ~     (all equal)
BackendBlockTraceQL/mixedValMixedMatchAnd-12            1.36 ± 0%       1.36 ± 0%      ~     (all equal)
BackendBlockTraceQL/mixedValMixedMatchOr-12             92.3 ± 0%       92.3 ± 0%      ~     (all equal)
BackendBlockTraceQL/mixedValBothMatch-12                0.31 ± 0%       0.31 ± 0%      ~     (all equal)

name                                                old alloc/op   new alloc/op    delta
BackendBlockTraceQL/spanAttNameNoMatch-12             1.27MB ±14%     1.27MB ±23%      ~     (p=0.690 n=5+5)
BackendBlockTraceQL/spanAttValNoMatch-12              16.1MB ± 1%     17.6MB ± 5%    +9.84%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValMatch-12                7.10MB ± 4%     5.86MB ±15%   -17.45%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-12        1.13MB ± 7%     1.36MB ±33%      ~     (p=0.151 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicMatch-12          86.3MB ± 8%     30.3MB ±19%   -64.85%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttNameNoMatch-12          850kB ± 1%     1053kB ±16%   +23.89%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValNoMatch-12           848kB ± 1%      979kB ± 3%   +15.39%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValMatch-12             852kB ± 0%     1071kB ± 9%   +25.76%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicNoMatch-12     852kB ± 1%      983kB ± 7%   +15.32%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicMatch-12       117MB ±10%       83MB ± 7%   -29.16%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedNameNoMatch-12               31.1MB ±37%     29.9MB ±80%      ~     (p=0.381 n=5+5)
BackendBlockTraceQL/mixedValNoMatch-12                40.7MB ±36%     36.4MB ±51%      ~     (p=1.000 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchAnd-12           844kB ± 1%     1039kB ±10%   +23.13%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchOr-12            125MB ±49%      57MB ±128%      ~     (p=0.056 n=5+5)
BackendBlockTraceQL/mixedValBothMatch-12               850kB ± 0%      985kB ± 7%   +15.85%  (p=0.008 n=5+5)

name                                                old allocs/op  new allocs/op   delta
BackendBlockTraceQL/spanAttNameNoMatch-12              10.6k ± 0%      10.6k ± 0%    -0.34%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValNoMatch-12               11.1k ± 0%      10.9k ± 0%    -1.75%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttValMatch-12                 12.0k ± 0%      11.9k ± 0%    -1.13%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-12         10.6k ± 0%      10.6k ± 0%    -0.29%  (p=0.008 n=5+5)
BackendBlockTraceQL/spanAttIntrinsicMatch-12           5.13M ± 0%      0.05M ± 0%   -98.99%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttNameNoMatch-12          10.5k ± 0%      10.5k ± 0%    -0.33%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValNoMatch-12           10.5k ± 0%      10.5k ± 0%    -0.34%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttValMatch-12             10.5k ± 0%      10.5k ± 0%    -0.33%  (p=0.008 n=5+5)
BackendBlockTraceQL/resourceAttIntrinsicNoMatch-12     10.6k ± 0%      10.5k ± 0%    -0.28%  (p=0.000 n=5+4)
BackendBlockTraceQL/resourceAttIntrinsicMatch-12       4.31M ± 0%      0.02M ± 0%   -99.54%  (p=0.008 n=5+5)
BackendBlockTraceQL/mixedNameNoMatch-12                12.1k ± 0%      12.6k ± 7%      ~     (p=0.730 n=4+5)
BackendBlockTraceQL/mixedValNoMatch-12                 16.2k ± 2%      16.3k ± 8%      ~     (p=0.548 n=5+5)
BackendBlockTraceQL/mixedValMixedMatchAnd-12           10.5k ± 0%      10.5k ± 0%    -0.33%  (p=0.000 n=5+4)
BackendBlockTraceQL/mixedValMixedMatchOr-12            5.13M ± 0%      0.05M ± 0%   -98.97%  (p=0.029 n=4+4)
BackendBlockTraceQL/mixedValBothMatch-12               10.6k ± 0%      10.5k ± 0%    -0.28%  (p=0.000 n=5+4)

@mdisibio mdisibio marked this pull request as ready for review April 19, 2023 17:59
@mdisibio mdisibio changed the title from "[WIP] Synchronous iterator" to "Synchronous iterator" Apr 19, 2023
Member

@joe-elliott joe-elliott left a comment

lgtm. some small comments, but excited to get this one in.

var _ Iterator = (*SyncIterator)(nil)

func NewSyncIterator(ctx context.Context, rgs []pq.RowGroup, column int, columnName string, readSize int, filter Predicate, selectAs string) *SyncIterator {

Member

ctx not used? it looks like we're losing the columnIterator.iterate span with this change which i find somewhat useful.

Contributor Author

Agree, the span and inspected/kept tags are useful. It's a little atypical but we could start a span in NewSyncIterator and then finish it in iter.Close(). Any other ideas? There's no longer an over-arching method like iterate, and Next/SeekTo are too fine-grained.

Member

oof, good point. our traces need attention generally and this PR is too obvious a win to block.

fine with the "NewSyncIterator" approach to see how it goes.

Contributor Author

👍 Tested it out and it works well enough for a first pass, or until we have a better idea. The side effect is that the span starts as soon as the iter is created, instead of on the first "pull" like the async one.
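The approach agreed on in this thread (start the span in the constructor, finish it in Close) can be sketched in Go as follows. Everything here is hypothetical: `span` is a tiny stand-in for a real tracing span, and `tracedIter` is not the PR's SyncIterator. It only shows where the span begins and ends and how the inspected/kept counters could be attached as tags.

```go
package main

import (
	"fmt"
	"time"
)

// span is a hypothetical stand-in for a tracing span; real code would use
// the project's tracing library instead.
type span struct {
	name     string
	start    time.Time
	end      time.Time
	tags     map[string]int
	finished bool
}

func startSpan(name string) *span {
	return &span{name: name, start: time.Now(), tags: map[string]int{}}
}

func (s *span) setTag(k string, v int) { s.tags[k] = v }

// finish is idempotent so a repeated Close does not move the end time.
func (s *span) finish() {
	if !s.finished {
		s.end = time.Now()
		s.finished = true
	}
}

// tracedIter yields the values accepted by keep and counts its work.
type tracedIter struct {
	values    []int
	pos       int
	keep      func(int) bool
	inspected int
	kept      int
	span      *span
}

// newTracedIter starts the span immediately, before any value is pulled.
func newTracedIter(values []int, keep func(int) bool) *tracedIter {
	return &tracedIter{values: values, keep: keep, span: startSpan("syncIterator")}
}

// Next returns the next kept value, counting every value it looks at.
func (it *tracedIter) Next() (int, bool) {
	for it.pos < len(it.values) {
		v := it.values[it.pos]
		it.pos++
		it.inspected++
		if it.keep(v) {
			it.kept++
			return v, true
		}
	}
	return 0, false
}

// Close records the counters as tags and finishes the span.
func (it *tracedIter) Close() {
	it.span.setTag("inspected", it.inspected)
	it.span.setTag("kept", it.kept)
	it.span.finish()
}

func main() {
	it := newTracedIter([]int{1, 2, 3, 4, 5}, func(v int) bool { return v%2 == 0 })
	for {
		if _, ok := it.Next(); !ok {
			break
		}
	}
	it.Close()
	fmt.Println(it.span.tags, it.span.finished)
}
```

With this shape the span duration covers the iterator's whole lifetime, including any idle time between creation and the first Next, which is the side effect noted above.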

pkg/parquetquery/iters.go (resolved)
@mdisibio
Copy link
Contributor Author

Pushed a small fix to mergeSpanSetIterator to Close() iters as soon as they are exhausted, instead of waiting until the very end. This is mostly for timing accuracy; performance is not really affected.
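A rough Go sketch of the "close as soon as exhausted" idea follows. The real mergeSpanSetIterator is not shown in this thread, so `subIter` and `mergeIter` are hypothetical, and the sketch assumes a simple merge over sorted integer iterators. The point is only that each sub-iterator is closed at the moment it runs dry, so any timing attached to its Close reflects when it actually stopped doing work.

```go
package main

import "fmt"

// subIter is a hypothetical sorted iterator that records when it is closed.
type subIter struct {
	values []int
	pos    int
	closed bool
}

func (s *subIter) Next() (int, bool) {
	if s.pos >= len(s.values) {
		return 0, false
	}
	v := s.values[s.pos]
	s.pos++
	return v, true
}

func (s *subIter) Close() { s.closed = true }

// mergeIter yields the values of all sub-iterators in ascending order.
type mergeIter struct {
	iters []*subIter // an entry is set to nil once exhausted and closed
	heads []int      // heads[i] is the pending value of iters[i]
}

func newMergeIter(iters ...*subIter) *mergeIter {
	m := &mergeIter{iters: iters, heads: make([]int, len(iters))}
	for i := range m.iters {
		m.advance(i)
	}
	return m
}

// advance pulls the next head for iterator i and closes the iterator
// right away if it has nothing left.
func (m *mergeIter) advance(i int) {
	v, ok := m.iters[i].Next()
	if !ok {
		m.iters[i].Close()
		m.iters[i] = nil
		return
	}
	m.heads[i] = v
}

// Next returns the smallest pending head among the open iterators.
func (m *mergeIter) Next() (int, bool) {
	best := -1
	for i, it := range m.iters {
		if it != nil && (best == -1 || m.heads[i] < m.heads[best]) {
			best = i
		}
	}
	if best == -1 {
		return 0, false
	}
	v := m.heads[best]
	m.advance(best)
	return v, true
}

// Close closes whatever is still open, for callers that stop early.
func (m *mergeIter) Close() {
	for i, it := range m.iters {
		if it != nil {
			it.Close()
			m.iters[i] = nil
		}
	}
}

func main() {
	a := &subIter{values: []int{1, 4}}
	b := &subIter{values: []int{2, 3, 5, 6}}
	m := newMergeIter(a, b)
	for i := 0; i < 4; i++ {
		v, _ := m.Next()
		fmt.Println("value:", v, "a closed:", a.closed, "b closed:", b.closed)
	}
	m.Close()
}
```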

@mdisibio mdisibio merged commit 9996433 into grafana:main Apr 25, 2023