
Micro benchmarks #9

Closed
fpetrogalli opened this issue Mar 13, 2017 · 9 comments · Fixed by #597

@fpetrogalli
Collaborator

Use the Google Benchmark library [1] to write micro-benchmarks for the vector math routines.

Each benchmark should invoke the function under test with random values drawn from its input range.

[1] https://github.com/google/benchmark
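
A minimal sketch of what such a benchmark could look like with the Google Benchmark C++ API is below. The scalar entry point `Sleef_sin_u10` and the input range are assumptions for illustration only; the real benchmarks would have to cover the actual SLEEF routines, precisions and input domains.

```cpp
// Sketch only: the routine name Sleef_sin_u10 and the input range are assumptions.
#include <benchmark/benchmark.h>
#include <sleef.h>

#include <random>
#include <vector>

static void BM_sin_u10(benchmark::State& state) {
  // Pre-generate random inputs so that only the function call is timed,
  // not the random number generation.
  std::mt19937_64 rng(0);
  std::uniform_real_distribution<double> dist(-3.141592653589793, 3.141592653589793);
  std::vector<double> in(4096);
  for (double& x : in) x = dist(rng);

  size_t i = 0;
  for (auto _ : state) {
    benchmark::DoNotOptimize(Sleef_sin_u10(in[i]));
    i = (i + 1) & (in.size() - 1);  // in.size() is a power of two
  }
}
BENCHMARK(BM_sin_u10);

BENCHMARK_MAIN();
```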

@shibatch
Owner

I am now considering overhauling the benchmarking tools.
My plan is to automate graphing with gnuplot first, and then introduce Google's framework.
To process the data, I am planning to use Java, since I am used to it.

@fpetrogalli
Collaborator Author

I think we should do this after the transition from Makefile to CMake, as all the components will be easier to integrate.

I think it would be better to implement this in the following order:

  1. write the micro-benchmarks with the Google Benchmark framework;
  2. produce the graphical visualization.

I am happy for you to use Java + gnuplot, but I think it would be better to use Python + matplotlib, as it might be easier to maintain and run. It requires less configuration than setting up a Java VM, and I think it will be easier to deploy on Travis.

@shibatch
Owner

Do we need to deploy it on Travis?
The benchmark results cannot be trusted there, since Travis runs on shared cloud machines.
The problem with Python is that I would need time to learn it.
I have lots of experience with Java, and some experience with gnuplot, so it would be much easier for me.

@fpetrogalli
Collaborator Author

fpetrogalli commented Oct 23, 2017

Then Java + gnuplot it is; you are right, we cannot run benchmarks on the cloud.

I still think you should first adopt Google's micro-benchmark framework and only then plot the data, because the output of Google Benchmark might differ from what the plotting scripts expect, in which case you would need to rework them. Also, Google Benchmark has some reporting facilities (see [1] for example), which might be used instead of a plotting tool.

I recommend leaving the plotting scripts as the last (optional) part of this work.

[1] http://1.bp.blogspot.com/-wk7hsdYodo8/UtS75FZag6I/AAAAAAAAAks/cQFiCXPbtwk/s1600/image00.png
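
(For what it's worth, the upstream documentation describes flags such as `--benchmark_format=json` and `--benchmark_out=<file>` for emitting results in JSON or CSV; if they are available in the version we end up using, the plotting step could consume that output directly instead of parsing console text.)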

@shibatch
Owner

shibatch commented Oct 23, 2017

Could you explain a little bit how you are planning to use the output data?
I have always thought of this work as part of writing our paper.
From that perspective, drawing graphs is essential.

@fpetrogalli
Collaborator Author

Getting numbers is very important. Using the Google framework should guarantee that we have reliable evaluations. If the aim of setting this up is only to get the numbers for the paper, I don't think we need to store any script for that in this repository.

My goal here was to make sure that the numbers we were getting were reliable, and I believe that those obtained with the Google micro-benchmark framework have that property. They would be trusted not only by us, but also by people reading them, as they are produced via a standard tool.

Also, using Google micro-benchmarks would make it easier for other people to verify our claims on their own machines.

For the paper, maybe we could store the scripts that generate its figures together with the paper itself?

@shibatch
Owner

I still don't understand why there would be such a big difference in the reliability of the measured values. We are just measuring the execution time of small pieces of C code, which basically have no conditional branches or memory accesses, so the execution time is highly reproducible. If this were Java, there would be many things to consider, like JIT compilation or garbage collection.
How much reliability do you need?

shibatch added a commit that referenced this issue Oct 27, 2017
…uplot.

With this new tool, the graph showing execution time can be automatically drawn.
It is easy to see the difference between different versions or compilers.
Some results are shown at http://shibatch.github.io/

This patch implements the tool described in #9.
I am planning to add another benchmarking tool based on the Google microbenchmarking framework.
shibatch added a commit that referenced this issue Feb 20, 2018
This patch replaces the old benchmarking tool with a new one based on gnuplot.

With this new tool, the graph showing execution time can be automatically drawn.
It is now easy to see the difference between different library versions or compilers.
This tool is used to draw the graphs on sleef.org web site.

This patch implements the tool described in #9.

A makefile is used in this patch to compile the necessary tools; it also serves as the measurement script, which is handy since ProcData.java has to be compiled as well.
@blapie blapie added the perf label Nov 9, 2023
@joanaxcruz
Contributor

joanaxcruz commented Oct 10, 2024

Reviving the discussion in this issue, as new changes are planned that should allow us to close it.

The discussion above proposes introducing the Google Benchmark framework. I decided to investigate this a bit, and the following points make it appealing for the project:

  • active repo - https://github.com/google/benchmark
  • widely used in other projects
  • reduced maintenance cost - this would fall on the people maintaining the framework
  • straightforward integration with CMake
  • easy to use
  • comes with a lot of functionality for free (like filtering the benchmarks to show only the performance of a particular function, and exporting the results in JSON or CSV format)
  • reliable results (results don't vary between runs)

In order to close this issue I propose the following plan:

  1. Create a new benchmark tool using Google Benchmark, making sure all the functionality of the current tool is migrated to the new one (a rough sketch of what one such benchmark could look like is given after this list)
  2. Process and display the results using Python and Plotly (interactive graphs)
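
As a concrete (hypothetical) illustration of step 1, one benchmark in the new tool could look roughly like the following. The SSE2 entry point `Sleef_sind2_u10` and the input range are assumptions, and the real tool would register benchmarks for the whole set of routines, precisions and vector extensions.

```cpp
// Rough sketch only: routine name, header and input range are assumptions.
#include <benchmark/benchmark.h>
#include <sleef.h>
#include <x86intrin.h>

#include <random>
#include <vector>

static void BM_Sleef_sind2_u10(benchmark::State& state) {
  // Random inputs are generated once, outside the timed loop.
  std::mt19937_64 rng(1);
  std::uniform_real_distribution<double> dist(-10.0, 10.0);
  std::vector<double> in(4096);
  for (double& x : in) x = dist(rng);

  size_t i = 0;
  for (auto _ : state) {
    __m128d v = _mm_loadu_pd(&in[i]);
    benchmark::DoNotOptimize(Sleef_sind2_u10(v));
    i = (i + 2) & (in.size() - 1);  // stay within the power-of-two sized buffer
  }
}
BENCHMARK(BM_Sleef_sind2_u10);

BENCHMARK_MAIN();
```

With that in place, filtering comes for free, e.g. `./benchmark_binary --benchmark_filter=sind2` (binary name hypothetical), and JSON/CSV export works as mentioned above.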

Future work not necessary to close this issue:

  • Create a GHA benchmarking pipeline

@blapie
Collaborator

blapie commented Oct 10, 2024

Looking very good! Looking forward to seeing new results.

  • Create a GHA benchmarking pipeline

I don't know how useful it would be to run benchmarks in GHA; the runners are shared resources, so results might be polluted. Last time I looked, benchmarking in GHA did not seem that common. But maybe in practice the figures can still be useful.
We should definitely at least try to build benchmarks in pre/post-commit though.

@blapie blapie linked a pull request Oct 18, 2024 that will close this issue