Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maps are slow #3113

Merged
merged 6 commits into from
Nov 17, 2023
Merged

Maps are slow #3113

merged 6 commits into from
Nov 17, 2023

Conversation

joe-elliott
Copy link
Member

@joe-elliott joe-elliott commented Nov 3, 2023

This PR improves performance of queries that hit the engine layer by reducing allocations and map iterations. It swaps a map of all attributes in the span struct for slices at the span, resource and trace levels.

The AttributeFor function is quite important. It tests the appropriate slices if a scope is provided. Otherwise it checks the slices by attribute name only in order of precedence. This is to replicate this behavior which used to exist in the engine.

Other Changes

  • Added an AttributeFor() and AllAttributes() method. Unfortunately the building of trace metadata requires getting all of the attributes and iterating through them. The AllAttributes() method has terrible performance and should generally not be called.
  • Due to these changes the behavior in two spots was subtly changed:
    • ExecuteTagValues used to check the map directly for a match. Now it calls the AttributeFor() method. This is more permissive as it first checks for direct matches and then extends its search to check the individual levels by name.
    • Similar concerns as above in traceqlmetrics
  • An interesting difference worth noting is that if you set the same attribute twice the slice will hold two values where previously it only held one. The value for an attribute on a span should always be the same so returning the first should be ok, but it may weaken the performance improvements achieved by dropping spans using attributesMatched
  • Benchmarks reduced/updated to include more complex queries that show the impact of the changes.

Benchmarks below!

@joe-elliott
Copy link
Member Author

Benchmarks:

name                                             old time/op    new time/op     delta
BackendBlockTraceQL/spanAttValNoMatch-8            7.04ms ± 2%     7.05ms ± 3%     ~     (p=0.842 n=9+10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-8      6.05ms ± 2%     6.11ms ± 2%   +0.93%  (p=0.023 n=10+10)
BackendBlockTraceQL/resourceAttValNoMatch-8        12.6ms ± 2%     12.5ms ± 3%     ~     (p=0.278 n=9+10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-8    26.2ms ± 4%     25.5ms ± 4%   -2.33%  (p=0.029 n=10+10)
BackendBlockTraceQL/mixedValNoMatch-8               337ms ± 4%      312ms ± 3%   -7.31%  (p=0.000 n=9+10)
BackendBlockTraceQL/mixedValMixedMatchAnd-8        6.04ms ± 2%     5.95ms ± 1%   -1.55%  (p=0.003 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchOr-8          342ms ± 3%      322ms ± 4%   -5.70%  (p=0.000 n=10+9)
BackendBlockTraceQL/count-8                         603ms ± 3%      565ms ± 3%   -6.34%  (p=0.000 n=10+10)
BackendBlockTraceQL/struct-8                        1.44s ±11%      1.00s ± 9%  -30.49%  (p=0.000 n=10+10)
BackendBlockTraceQL/||-8                            523ms ±12%      336ms ± 6%  -35.69%  (p=0.000 n=10+10)
BackendBlockTraceQL/mixed-8                        44.0ms ± 4%     39.1ms ± 2%  -11.09%  (p=0.000 n=10+10)

name                                             old speed      new speed       delta
BackendBlockTraceQL/spanAttValNoMatch-8           390MB/s ± 2%    389MB/s ± 3%     ~     (p=0.842 n=9+10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-8     280MB/s ± 1%    277MB/s ± 2%   -0.92%  (p=0.023 n=10+10)
BackendBlockTraceQL/resourceAttValNoMatch-8       176MB/s ± 2%    177MB/s ± 3%     ~     (p=0.278 n=9+10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-8   635MB/s ± 3%    650MB/s ± 4%   +2.39%  (p=0.029 n=10+10)
BackendBlockTraceQL/mixedValNoMatch-8            14.3MB/s ± 4%   15.4MB/s ± 3%   +7.91%  (p=0.000 n=9+10)
BackendBlockTraceQL/mixedValMixedMatchAnd-8       192MB/s ± 2%    195MB/s ± 1%   +1.57%  (p=0.003 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchOr-8       56.3MB/s ± 3%   59.7MB/s ± 3%   +6.05%  (p=0.000 n=10+9)
BackendBlockTraceQL/count-8                      28.5MB/s ± 2%   30.5MB/s ± 3%   +6.77%  (p=0.000 n=10+10)
BackendBlockTraceQL/struct-8                     1.29MB/s ±11%   1.86MB/s ±10%  +43.88%  (p=0.000 n=10+10)
BackendBlockTraceQL/||-8                         34.0MB/s ±11%   52.7MB/s ± 6%  +54.99%  (p=0.000 n=10+10)
BackendBlockTraceQL/mixed-8                      99.2MB/s ± 4%  111.5MB/s ± 2%  +12.45%  (p=0.000 n=10+10)

name                                             old MB_io/op   new MB_io/op    delta
BackendBlockTraceQL/spanAttValNoMatch-8              2.74 ± 0%       2.74 ± 0%     ~     (all equal)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-8        1.69 ± 0%       1.69 ± 0%     ~     (all equal)
BackendBlockTraceQL/resourceAttValNoMatch-8          2.21 ± 0%       2.21 ± 0%     ~     (all equal)
BackendBlockTraceQL/resourceAttIntrinsicMatch-8      16.6 ± 0%       16.6 ± 0%     ~     (all equal)
BackendBlockTraceQL/mixedValNoMatch-8                4.80 ± 0%       4.80 ± 0%     ~     (all equal)
BackendBlockTraceQL/mixedValMixedMatchAnd-8          1.16 ± 0%       1.16 ± 0%     ~     (all equal)
BackendBlockTraceQL/mixedValMixedMatchOr-8           19.2 ± 0%       19.2 ± 0%     ~     (all equal)
BackendBlockTraceQL/count-8                          17.2 ± 0%       17.2 ± 0%     ~     (all equal)
BackendBlockTraceQL/struct-8                         1.85 ± 0%       1.85 ± 0%     ~     (all equal)
BackendBlockTraceQL/||-8                             17.7 ± 0%       17.7 ± 0%     ~     (all equal)
BackendBlockTraceQL/mixed-8                          4.36 ± 0%       4.36 ± 0%     ~     (all equal)

name                                             old alloc/op   new alloc/op    delta
BackendBlockTraceQL/spanAttValNoMatch-8            3.97MB ± 1%     3.98MB ± 2%     ~     (p=0.968 n=9+10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-8      2.97MB ± 2%     2.98MB ± 2%     ~     (p=0.968 n=10+9)
BackendBlockTraceQL/resourceAttValNoMatch-8        5.45MB ± 2%     5.50MB ± 3%     ~     (p=0.278 n=9+10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-8    3.21MB ± 8%     3.17MB ± 6%     ~     (p=0.631 n=10+10)
BackendBlockTraceQL/mixedValNoMatch-8              7.18MB ±12%     6.72MB ±14%   -6.41%  (p=0.042 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchAnd-8        2.95MB ± 1%     2.94MB ± 3%     ~     (p=0.579 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchOr-8         6.14MB ±20%     5.87MB ± 9%     ~     (p=0.661 n=10+9)
BackendBlockTraceQL/count-8                         274MB ± 0%      306MB ± 2%  +11.74%  (p=0.000 n=10+10)
BackendBlockTraceQL/struct-8                       23.1MB ±12%     23.2MB ±12%     ~     (p=0.446 n=8+10)
BackendBlockTraceQL/||-8                            453MB ± 0%      145MB ± 1%  -68.10%  (p=0.000 n=10+9)
BackendBlockTraceQL/mixed-8                        3.04MB ± 6%     3.06MB ± 7%     ~     (p=0.796 n=10+10)

name                                             old allocs/op  new allocs/op   delta
BackendBlockTraceQL/spanAttValNoMatch-8             44.1k ± 0%      44.1k ± 0%   -0.01%  (p=0.000 n=10+10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-8       44.0k ± 0%      44.0k ± 0%   -0.01%  (p=0.000 n=10+7)
BackendBlockTraceQL/resourceAttValNoMatch-8         44.2k ± 0%      44.2k ± 0%   -0.01%  (p=0.000 n=9+10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-8     45.6k ± 0%      45.6k ± 0%   -0.03%  (p=0.000 n=10+10)
BackendBlockTraceQL/mixedValNoMatch-8               47.2k ± 0%      47.2k ± 0%   -0.03%  (p=0.001 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchAnd-8         44.1k ± 0%      44.1k ± 0%   -0.01%  (p=0.000 n=10+10)
BackendBlockTraceQL/mixedValMixedMatchOr-8          46.1k ± 0%      46.1k ± 0%   +0.03%  (p=0.006 n=10+10)
BackendBlockTraceQL/count-8                         2.71M ± 0%      1.69M ± 0%  -37.57%  (p=0.000 n=10+10)
BackendBlockTraceQL/struct-8                         250k ± 0%       252k ± 1%   +1.00%  (p=0.000 n=8+10)
BackendBlockTraceQL/||-8                            1.12M ± 0%      0.78M ± 0%  -30.55%  (p=0.000 n=10+10)
BackendBlockTraceQL/mixed-8                         44.9k ± 0%      44.9k ± 0%   -0.04%  (p=0.000 n=10+10)

func (s *span) Attributes() map[traceql.Attribute]traceql.Static {
return s.attributes
func (s *span) AllAttributes() map[traceql.Attribute]traceql.Static {
atts := make(map[traceql.Attribute]traceql.Static, len(s.spanAttrs)+len(s.resourceAttrs)+len(s.traceAttrs))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken AllAttributes() is currently only called once per span. I'm still wondering if it makes sense to cache atts in the span and skip the copying for consecutive calls to avoid future performance issues

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to cache it b/c it's possible to have attributes change between calls to AllAttributes(), but it is a very expensive call and this PR works only b/c we scan so many more spans then we turn them into the results metadata.

I added a comment here to hopefully prevent abuse of AllAttributes():

https://github.com/grafana/tempo/pull/3113/files#diff-5f08c81d01d0224c73f713724e728ed0ed7794b177abddf09c32c5f3a6055fb8R83-R84

but that's not a great safeguard.

@stoewer
Copy link
Collaborator

stoewer commented Nov 14, 2023

Here are my benchmark results on a 5GB block from ops:

go test -run='^#' -bench=BackendBlockTraceQL -count=10 -benchtime=2s -benchmem ./tempodb/encoding/vparquet3
goos: linux
goarch: amd64
pkg: github.com/grafana/tempo/tempodb/encoding/vparquet3
cpu: AMD Ryzen 7 PRO 6850U with Radeon Graphics     
                                                 │ bench-maps-old-02.txt │        bench-maps-new-02.txt        │
                                                 │        sec/op         │   sec/op     vs base                │
BackendBlockTraceQL/spanAttValNoMatch-16                     11.70m ± 1%   11.86m ± 3%   +1.36% (p=0.000 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-16               11.16m ± 1%   11.24m ± 1%   +0.74% (p=0.004 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-16                 10.97m ± 0%   11.14m ± 1%   +1.60% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-16             32.70m ± 1%   31.86m ± 1%   -2.55% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-16                       240.8m ± 2%   234.9m ± 2%   -2.46% (p=0.000 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-16                 10.95m ± 1%   11.13m ± 1%   +1.64% (p=0.001 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-16                  256.3m ± 1%   247.5m ± 2%   -3.41% (p=0.000 n=10)
BackendBlockTraceQL/count-16                                 418.6m ± 1%   406.0m ± 1%   -3.02% (p=0.000 n=10)
BackendBlockTraceQL/struct-16                               1055.4m ± 4%   746.0m ± 4%  -29.31% (p=0.000 n=10)
BackendBlockTraceQL/||-16                                    114.5m ± 1%   108.6m ± 1%   -5.08% (p=0.000 n=10)
BackendBlockTraceQL/mixed-16                                 56.29m ± 1%   56.16m ± 1%        ~ (p=0.796 n=10)
geomean                                                      65.17m        62.48m        -4.13%

                                                 │ bench-maps-old-02.txt │        bench-maps-new-02.txt         │
                                                 │          B/s          │     B/s       vs base                │
BackendBlockTraceQL/spanAttValNoMatch-16                    212.6Mi ± 1%   209.8Mi ± 3%   -1.34% (p=0.000 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-16              215.4Mi ± 1%   213.8Mi ± 1%   -0.74% (p=0.004 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-16                129.1Mi ± 0%   127.1Mi ± 1%   -1.58% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-16            449.2Mi ± 1%   461.0Mi ± 1%   +2.61% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-16                      11.44Mi ± 2%   11.73Mi ± 2%   +2.54% (p=0.000 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-16                130.5Mi ± 1%   128.4Mi ± 1%   -1.61% (p=0.001 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-16                 41.75Mi ± 1%   43.22Mi ± 2%   +3.52% (p=0.000 n=10)
BackendBlockTraceQL/count-16                                37.13Mi ± 1%   38.29Mi ± 1%   +3.11% (p=0.000 n=10)
BackendBlockTraceQL/struct-16                               4.148Mi ± 4%   5.865Mi ± 4%  +41.38% (p=0.000 n=10)
BackendBlockTraceQL/||-16                                   138.9Mi ± 1%   146.4Mi ± 1%   +5.35% (p=0.000 n=10)
BackendBlockTraceQL/mixed-16                                259.4Mi ± 1%   260.0Mi ± 1%        ~ (p=0.781 n=10)
geomean                                                     80.49Mi        83.96Mi        +4.31%

                                                 │ bench-maps-old-02.txt │        bench-maps-new-02.txt        │
                                                 │       MB_io/op        │  MB_io/op   vs base                 │
BackendBlockTraceQL/spanAttValNoMatch-16                      2.609 ± 0%   2.609 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/spanAttIntrinsicNoMatch-16                2.519 ± 0%   2.519 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttValNoMatch-16                  1.485 ± 0%   1.485 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/resourceAttIntrinsicMatch-16              15.40 ± 0%   15.40 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValNoMatch-16                        2.889 ± 0%   2.889 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValMixedMatchAnd-16                  1.499 ± 0%   1.499 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixedValMixedMatchOr-16                   11.22 ± 0%   11.22 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/count-16                                  16.30 ± 0%   16.30 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/struct-16                                 4.589 ± 0%   4.589 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/||-16                                     16.67 ± 0%   16.67 ± 0%       ~ (p=1.000 n=10) ¹
BackendBlockTraceQL/mixed-16                                  15.31 ± 0%   15.31 ± 0%       ~ (p=1.000 n=10) ¹
geomean                                                       5.500        5.500       +0.00%
¹ all samples are equal

                                                 │ bench-maps-old-02.txt │        bench-maps-new-02.txt         │
                                                 │         B/op          │     B/op      vs base                │
BackendBlockTraceQL/spanAttValNoMatch-16                    8.028Mi ± 1%   7.991Mi ± 0%        ~ (p=0.218 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-16              7.058Mi ± 1%   7.048Mi ± 1%        ~ (p=0.971 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-16                6.955Mi ± 1%   6.976Mi ± 1%        ~ (p=0.393 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-16           10.736Mi ± 1%   8.526Mi ± 2%  -20.58% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-16                      8.436Mi ± 6%   8.655Mi ± 7%        ~ (p=0.684 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-16                7.014Mi ± 1%   6.970Mi ± 0%        ~ (p=0.052 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-16                 7.886Mi ± 8%   7.882Mi ± 8%        ~ (p=0.853 n=10)
BackendBlockTraceQL/count-16                                235.9Mi ± 1%   261.0Mi ± 1%  +10.63% (p=0.000 n=10)
BackendBlockTraceQL/struct-16                               27.72Mi ± 8%   28.15Mi ± 9%   +1.54% (p=0.050 n=10)
BackendBlockTraceQL/||-16                                   26.82Mi ± 3%   16.02Mi ± 1%  -40.26% (p=0.000 n=10)
BackendBlockTraceQL/mixed-16                                7.540Mi ± 4%   7.419Mi ± 3%        ~ (p=0.089 n=10)
geomean                                                     13.45Mi        12.70Mi        -5.57%

                                                 │ bench-maps-old-02.txt │        bench-maps-new-02.txt        │
                                                 │       allocs/op       │  allocs/op   vs base                │
BackendBlockTraceQL/spanAttValNoMatch-16                     109.5k ± 0%   109.5k ± 0%   -0.00% (p=0.000 n=10)
BackendBlockTraceQL/spanAttIntrinsicNoMatch-16               109.4k ± 0%   109.4k ± 0%   -0.00% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttValNoMatch-16                 109.4k ± 0%   109.4k ± 0%   -0.00% (p=0.000 n=10)
BackendBlockTraceQL/resourceAttIntrinsicMatch-16             118.8k ± 0%   116.2k ± 0%   -2.18% (p=0.000 n=10)
BackendBlockTraceQL/mixedValNoMatch-16                       111.8k ± 0%   111.8k ± 0%        ~ (p=0.339 n=10)
BackendBlockTraceQL/mixedValMixedMatchAnd-16                 109.5k ± 0%   109.5k ± 0%   -0.00% (p=0.000 n=10)
BackendBlockTraceQL/mixedValMixedMatchOr-16                  110.8k ± 0%   110.8k ± 0%        ~ (p=0.385 n=10)
BackendBlockTraceQL/count-16                                 2.433M ± 0%   1.513M ± 0%  -37.82% (p=0.000 n=10)
BackendBlockTraceQL/struct-16                                331.0k ± 0%   331.5k ± 1%   +0.16% (p=0.000 n=10)
BackendBlockTraceQL/||-16                                    191.2k ± 0%   181.3k ± 0%   -5.13% (p=0.000 n=10)
BackendBlockTraceQL/mixed-16                                 111.2k ± 0%   111.2k ± 0%   -0.02% (p=0.000 n=10)
geomean                                                      170.8k        162.5k        -4.87%

Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
Signed-off-by: Joe Elliott <[email protected]>
@joe-elliott joe-elliott merged commit 548d693 into grafana:main Nov 17, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants