Combined TPCH runs & uniformed summaries for benchmarks #4128

Merged
merged 1 commit into from
Nov 9, 2022

Conversation

isidentical
Contributor

Which issue does this PR close?

Closes #4127.

Rationale for this change

This PR adds support for executing the TPCH benchmarks without a --query argument. When no --query is given, all queries (1 through 22) are executed and the execution information for each is saved.
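The selection logic could be sketched roughly as follows (a minimal Rust illustration with hypothetical names; the actual code in benchmarks/src/bin/tpch.rs may differ in details):

```rust
// Sketch: when --query is given, run just that query;
// otherwise run the full TPC-H suite, queries 1 through 22.
// `query_range` is a hypothetical name, not the exact upstream code.
fn query_range(query: Option<usize>) -> std::ops::RangeInclusive<usize> {
    match query {
        Some(q) => q..=q,
        None => 1..=22,
    }
}

fn main() {
    // --query 5 runs only Q5; omitting --query runs all 22 queries.
    assert_eq!(query_range(Some(5)).collect::<Vec<_>>(), vec![5]);
    assert_eq!(query_range(None).count(), 22);
}
```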

Are there any user-facing changes?

The TPCH benchmark output format is different now.

@isidentical
Contributor Author

isidentical commented Nov 7, 2022

While playing with this, I've also written a little Python script to function like a benchmark comparison UI (poor man's conbench): https://gist.github.com/isidentical/4e3fff1350e9d49672e15d54d9e8299f

It is quite basic, but I think it can automate a few things for https://github.com/datafusion-contrib/benchmark-automation/tree/main. E.g. here is a comparison of runs with and without --disable-statistics:

 $ ./target/release/tpch benchmark datafusion --path /opt/data-parquet --format parquet --iterations 3 -o /tmp/benchmarks --disable-statistics
 $ ./target/release/tpch benchmark datafusion --path /opt/data-parquet --format parquet --iterations 3 -o /tmp/benchmarks
 $ python t.py compare /tmp/benchmarks/file1.json /tmp/benchmarks/file2.json
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     Baseline ┃   Comparison ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ Q1           │     702.18ms │     687.86ms │     no change │
│ Q2           │     413.74ms │     302.22ms │ +1.37x faster │
│ Q3           │     392.94ms │     395.34ms │     no change │
│ Q4           │     111.28ms │      97.01ms │ +1.15x faster │
│ Q5           │     465.81ms │     487.92ms │     no change │
│ Q6           │     402.94ms │     402.48ms │     no change │
│ Q7           │     868.18ms │     889.51ms │     no change │
│ Q8           │     499.98ms │     468.68ms │ +1.07x faster │
│ Q9           │     827.54ms │     837.67ms │     no change │
│ Q10          │     503.22ms │     492.29ms │     no change │
│ Q11          │     221.30ms │     167.37ms │ +1.32x faster │
│ Q12          │     204.10ms │     170.99ms │ +1.19x faster │
│ Q13          │     441.50ms │     423.67ms │     no change │
│ Q14          │     373.42ms │     383.57ms │     no change │
│ Q15          │     356.24ms │     352.67ms │     no change │
│ Q16          │     115.38ms │     117.98ms │     no change │
│ Q17          │    2099.22ms │    2209.00ms │  1.05x slower │
│ Q18          │    1255.95ms │    1285.39ms │     no change │
│ Q19          │     656.93ms │     660.46ms │     no change │
│ Q20          │     640.30ms │     624.94ms │     no change │
│ Q21          │     697.55ms │     685.22ms │     no change │
│ Q22          │      84.20ms │      81.76ms │     no change │
└──────────────┴──────────────┴──────────────┴───────────────┘
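The "Change" column above can be reproduced with a small amount of logic: compute the ratio of baseline to comparison time and only report a difference beyond some noise threshold. A minimal sketch (the 5% threshold and the function name are assumptions, not necessarily what the gist uses, though they match every row in the table above):

```rust
// Sketch of the change classification in the comparison table.
// Ratios within 5% of 1.0 are treated as measurement noise ("no change");
// the threshold is an assumption, not taken from the gist.
fn classify(baseline_ms: f64, comparison_ms: f64) -> String {
    let ratio = baseline_ms / comparison_ms;
    if ratio >= 1.05 {
        format!("+{:.2}x faster", ratio)
    } else if ratio <= 1.0 / 1.05 {
        format!("{:.2}x slower", 1.0 / ratio)
    } else {
        "no change".to_string()
    }
}

fn main() {
    // Values taken from the table above.
    assert_eq!(classify(413.74, 302.22), "+1.37x faster"); // Q2
    assert_eq!(classify(2099.22, 2209.00), "1.05x slower"); // Q17
    assert_eq!(classify(702.18, 687.86), "no change"); // Q1
}
```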

@alamb
Contributor

Thanks @isidentical !

I don't know how much you want to test the benchmark run code (it might be easier to just deal with any breakages than trying to prevent regressions through tests)

benchmarks/README.md (review thread, resolved)
benchmarks/src/bin/tpch.rs (review thread, resolved)
println!("Running benchmarks with the following options: {:?}", opt);
let mut benchmark_run = BenchmarkRun::new(opt.query);
let query_range = match opt.query {
👍

@isidentical
Contributor Author

I don't know how much you want to test the benchmark run code (it might be easier to just deal with any breakages than trying to prevent regressions through tests)

Yeah, I tried to take a look at it but it seems like it would take a bit too much effort for a relatively simple feature. I guess we'll probably notice if something got broken when we have an automated benchmark system 😄

@andygrove andygrove merged commit a32fb65 into apache:master Nov 9, 2022
@ursabot

ursabot commented Nov 9, 2022

Benchmark runs are scheduled for baseline = b58ec81 and contender = a32fb65. a32fb65 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Development

Successfully merging this pull request may close these issues.

Allow TPCH tooling to create a combined result for easier processing by outside tools
4 participants