Adds PlotHistogram and PlotHistogram2D #148

Merged
merged 24 commits into from
Mar 17, 2021

epezent
Owner

@epezent epezent commented Nov 16, 2020

No description provided.

@epezent
Owner Author

epezent commented Nov 16, 2020

@marcizhu and @ozlb, this is a WIP inspired by your recent posts. I'm opening this up for your feedback now, but I'll make a more complete post when I get things close to finished. One thing to note is that 2D histograms are currently upside down since I'm just piping them through heatmaps, so I need to add a bool to reverse the direction. I'm trying to include common functionality found in MATLAB and matplotlib without going overboard (e.g. density normalization, custom ranges, cumulation, etc).

@marcizhu, I was playing around with bin sizes, and couldn't hit the 1024x256 bins @ 60FPS you spoke of. Hoping you can take a look and post your code for comparison.

// Plots a horizontal histogram. If cumulative is true, each bin contains its count plus the counts of all previous bins. If density is true, the PDF is visualized. If both are true, the CDF is visualized. 
// If range is left unspecified, the min/max of values will be used as the range. Values outside of range are not binned and are ignored in the total count for density=true histograms.
void PlotHistogram(const char* label_id, const T* values, int count, int bins, bool cumulative=false, bool density=false, ImPlotRange range=ImPlotRange(), double bar_scale=1.0);

// Plots two dimensional, bivariate histogram as a heatmap. If density is true, the PDF is visualized. If range is left unspecified, 
// the min/max of xs and ys will be used as the ranges. Values outside of range are not binned and are ignored in the total count for density=true histograms.
void PlotHistogram2D(const char* label_id, const T* xs, const T* ys, int count, int x_bins, int y_bins, bool density=false, ImPlotLimits range=ImPlotLimits());

image

@marcizhu

@epezent I haven't looked too deep into the code, but it looks like you're regenerating the signal on every frame. My code just generated the signal once and only added a normal distribution noise on top of it, which is a lot faster (but consumes more memory). My guess is that this is what's causing the different performance, since I had the same issue when I tried generating the signal every frame. I guess sin() is not the fastest function in the world :D

But leaving performance aside, I think this is a great addition to this library! I only have one question: is it possible to leave the number of bins blank and let the library figure it out by itself? I usually work with R, and its histogram function automatically calculates the number of bins using Sturges' rule, which is quite simple and depends only on the size of the input vector. I think this might be a great (and easy) addition to this PR 😄

Thoughts?

@ozlb
Contributor

ozlb commented Nov 16, 2020

I had a slightly different approach, but your results are probably better.
I used PlotScatter with square markers with alpha, and I do a single pass over the selected samples to calculate the PDF, using Welford's method for the variance.
I preferred this solution over bars because it gives a better "feel" for the density.
I think it's very good; in terms of performance, I haven't been able to test the new API yet.

@epezent
Owner Author

epezent commented Nov 16, 2020

but it looks like you're regenerating the signal on every frame

Only the theoretical curve. The Gaussian distributions are static. I suppose this could be it, but I doubt it. I just need to run my profiler and investigate.

I only have one question: is it possible to leave blank the number of bins and let the library figure it out by itself?

I added this. Set bins equal to one of these for auto binning. I was going to add Freedman–Diaconis' choice as it seems common, but it requires a quicksort for determining the IQR. I didn't have time for that this morning.

// Enums for different automatic histogram binning methods (k = bin count or w = bin width)
enum ImPlotBinMethod_ {
    ImPlotBinMethod_Sqrt    = -1, // k = sqrt(n)
    ImPlotBinMethod_Sturges = -2, // k = 1 + log2(n)
    ImPlotBinMethod_Rice    = -3, // k = 2 * cbrt(n)
    ImPlotBinMethod_Scott   = -4, // w = 3.49 * sigma / cbrt(n)
};
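
As a rough illustration of the formulas in the enum comments, the sketch below turns each rule into a bin count. The helper names are hypothetical (not ImPlot functions), and rounding up to a whole bin is an assumption; Scott's rule gives a bin width, so it also needs the standard deviation and the data range.

```cpp
#include <cmath>

// Hypothetical helpers illustrating the rules above; n is the sample count.
int BinsSqrt(int n) {     // k = sqrt(n)
    return static_cast<int>(std::ceil(std::sqrt(static_cast<double>(n))));
}
int BinsSturges(int n) {  // k = 1 + log2(n)
    return static_cast<int>(std::ceil(1.0 + std::log2(static_cast<double>(n))));
}
int BinsRice(int n) {     // k = 2 * cbrt(n)
    return static_cast<int>(std::ceil(2.0 * std::cbrt(static_cast<double>(n))));
}
double WidthScott(int n, double sigma) {  // w = 3.49 * sigma / cbrt(n)
    return 3.49 * sigma / std::cbrt(static_cast<double>(n));
}
// Converts Scott's width into a count over a data range of `range` units.
int BinsScott(int n, double sigma, double range) {
    const double w = WidthScott(n, sigma);
    return w > 0.0 ? static_cast<int>(std::ceil(range / w)) : 1;
}
```

For n = 1000 samples this gives 32 bins (Sqrt) and 11 bins (Sturges); Scott's rule is the only one that looks at the spread of the data rather than just its size.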

@marcizhu

I added this. Set bins equal to one of these for auto binning. I was going to add Freedman–Diaconis' choice as it seems common, but it requires a quicksort for determining the IQR. I didn't have time for that this morning.

@epezent Awesome! I think this is a great addition to the library. 10/10 👏 👏

Thanks for adding this! :D

@ozlb
Contributor

ozlb commented Nov 17, 2020

The Gaussian distributions are static. I suppose this could be it, but I doubt it. I just need to run my profiler and investigate.

I don't know if this helps, but you should take into account that PlotHistogram() is probably doing too many loops over the values array (5 in the worst case):

  • 2 if the given range is 0.0:
      • ImMinArray(values, count);
      • ImMaxArray(values, count);
  • 2 if (bins < 0) and ImPlotBinMethod_Scott:
      • ImStdDev(), which in turn calls ImMean()
  • 1 always, to accumulate over bins

This is the reason why I used Welford's method, which computes the variance in a single pass. Below is a sketch of the function to give you the idea (I extracted it from code that also does other things and uses external variables, sorry about that...)

struct TDataSample { double x, y; }; // assumed layout; the original type is defined elsewhere

// Call once per sample; count is the number of samples seen so far, including xs (1 for the first).
void OnlineStat(const TDataSample& xs, int count, TDataSample& xyMin, TDataSample& xyMax, double& ySum, double& varM, double& varS, double& yAvg, double& yVariance) {
	double oldM = varM;
	ySum += xs.y;
	varM += (xs.y - varM) / count;
	varS += (xs.y - varM)*(xs.y - oldM);
	if (count == 1 || xyMin.y >= xs.y)//= to get first minimum in the list
		xyMin = xs;
	if (count == 1 || xyMax.y < xs.y)
		xyMax = xs;
	yAvg = ySum / count;
	//Variance Welford's method (needs at least 2 samples)
	yVariance = (count < 2) ? 0.0 : (varS / (count - 1));
}

//PDF (probability density function) of a normal distribution
static double PDF(double y, double avg, double variance) {
	const double inv_sqrt_2pi = 0.3989422804014327;
	const double sigma = sqrt(variance); // the formula needs the standard deviation, not the variance
	double a = (y - avg) / sigma;
	return (inv_sqrt_2pi / sigma * exp(-0.5 * a * a));
}

Inspired by
https://github.com/dizcza/OnlineMean
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_Online_algorithm

@marcizhu

@marcizhu, I was playing around with bin sizes, and couldn't hit the 1024x256 bins @ 60FPS you spoke of. Hoping you can take a look and post your code for comparison.

@epezent This is the code that I use to get the sample that I sent a few days ago to #94:

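// Note: generator (presumably a random engine) and max_value (a float) are defined elsewhere, outside this function.
void renderHistogram(float* sineWave, float* values, std::normal_distribution<float>& distribution)
{
    ImGui::Begin("Test histogram", NULL, ImGuiWindowFlags_NoCollapse | ImGuiWindowFlags_NoTitleBar | ImGuiWindowFlags_NoResize);

    for(int i = 0; i < 1024; i++)
    {
        // distribution is taken by non-const reference because its operator() is not const
        short val = (short)(sineWave[i] + distribution(generator));
        if(val >= 0 && val < 256) // values has 256 rows, so valid rows are 0..255
        {
            values[val * 1024 + i] += 1.0f;
            max_value = fmax(values[val * 1024 + i], max_value);
        }
    }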

    static ImPlotAxisFlags axes_flags = ImPlotAxisFlags_LockMin | ImPlotAxisFlags_LockMax;
    ImPlot::PushColormap(ImPlotColormap_Jet);
    ImPlot::SetNextPlotLimits(0,1024, 0,256);
    if(ImPlot::BeginPlot("##HeatmapTest", NULL, NULL, ImVec2(-1, -1), 0, axes_flags, axes_flags))
    {
        ImPlot::PlotHeatmap("heatTest", values, 256, 1024, 0.0f, max_value, NULL, ImPlotPoint(0,0), ImPlotPoint(1024,256));
        ImPlot::EndPlot();
    }
    ImPlot::PopColormap();

    ImGui::End();
}

Where values is a float array of size 1024*256 initialized to 0 and sineWave is a float[1024] array initialized to hold the shown sine wave, like this:

float sineWave[1024]{};
for(int i = 0; i < 1024; i++)
    sineWave[i] += (110.0f * sin(2.0f * M_PI * (float)i / 512.0f) + 128.0f);

This is the performance that I'm getting on my MacBook 2018 (Intel Core i5, 2.3 GHz; compiled using AppleClang 12 with -O2) using the histogram branch of implot:

image

As you can see, I consistently get 60 fps. It occasionally drops down to something like 59.7 but it rapidly goes back up to 60. Also, I have the implot demo rendering behind this ImGui window and multiple applications open, so I don't see why 60 fps would not be achievable. If you're not getting this performance, it might be due to what @ozlb suggested: maybe the library is doing too many passes over the data. Or maybe you're compiling with fewer optimizations, like -O1 or something? I certainly cannot get this performance without optimizations, or even with -O1, on gcc/clang.

Anyway, I hope this was somewhat helpful. If you need me to run any benchmarks, tests or anything, just let me know :)

@epezent
Owner Author

epezent commented Nov 18, 2020

@marcizhu , I get similar performance with your code on my laptop, which has a GTX 2080, so this is obviously CPU bound. A quick profile revealed the main culprits to be LerpColormap() and ImGui::GetColorU32(). I think we could get a speed up by storing colormaps as ImU32 and using clever interpolation algorithms. This is something I've wanted to look into for a while, I'm just not sure how we preserve the ImVec4* versions of PushColormap and SetColormap though.

@ozlb, yes, there's a lot of looping in this implementation. At the very least, I can merge ImMinArray and ImMaxArray into one loop.
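
For illustration, merging the two scans into one loop could look roughly like the sketch below. MinMaxArray is a hypothetical name and signature, not the function that ended up in ImPlot.

```cpp
// Hypothetical helper: finds both extremes of `values` in a single pass,
// instead of one pass for the minimum and another for the maximum.
template <typename T>
void MinMaxArray(const T* values, int count, T* min_out, T* max_out) {
    if (count <= 0) return;  // outputs are left untouched for empty input
    T lo = values[0];
    T hi = values[0];
    for (int i = 1; i < count; ++i) {
        if (values[i] < lo) lo = values[i];
        if (values[i] > hi) hi = values[i];
    }
    *min_out = lo;
    *max_out = hi;
}
```

This removes one of the five worst-case passes listed earlier in the thread; the passes for the mean and standard deviation would still need a separate change.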

@marcizhu

@epezent Hmmm... I see no easy way to preserve the ImVec4* versions of PushColormap() and SetColormap(). The only thing I came up with is to convert the ImVec4* to ImU32* on the first call to those functions and store that new array internally using some hashmap/lookup table or something like that. That way the first call might be a bit expensive due to the new memory allocation and the array conversion, but subsequent calls are just O(1) like they are now.

To be honest, I wouldn't try to "save" the ImVec4* variants of PushColormap() and SetColormap(). The library is still in development and API breaking changes are expected. Also, this change would only affect those who use custom colormaps, and it is not even that big of a deal. Just convert the colormap colors from float[4] to int and you're ready to go. We could even provide some constexpr functions to create the ImU32 color from 4 floats at compile time, aiding in this small migration to the new API. Something like this:

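constexpr ImU32 ColorIm32(float a, float r, float g, float b)
{
    // The IM_COL32_*_SHIFT macros from imgui.h follow the channel order ImGui was built with
    // (hard-coded shifts of 24/16/8/0 only match the IMGUI_USE_BGRA_PACKED_COLOR layout).
    return
        (((ImU32)(a * 255.0f + 0.5f)) << IM_COL32_A_SHIFT) |
        (((ImU32)(r * 255.0f + 0.5f)) << IM_COL32_R_SHIFT) |
        (((ImU32)(g * 255.0f + 0.5f)) << IM_COL32_G_SHIFT) |
        (((ImU32)(b * 255.0f + 0.5f)) << IM_COL32_B_SHIFT);
}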

@epezent epezent changed the title Adds ImPlotHistogram and ImPlotHistogram2D Adds PlotHistogram and PlotHistogram2D Nov 24, 2020
@epezent epezent added this to the v1.0 milestone Jan 18, 2021
@epezent
Owner Author

epezent commented Mar 9, 2021

I've been working on this the past few days...finally wrapped up my PhD, so I have more free time now :)

  • Colormaps are now stored as ImU32 colors instead of ImVec4. As such, color interpolation is done using bit-shifting magic. This alone provides a 50% boost in performance for large heatmaps. The compile time option IMPLOT_MIX64 (see implot_internal.h) can be enabled to use 64-bit multiplications for another 10% gain on 64-bit machines.
  • A second compile time option, IMPLOT_USE_COLORMAP_TABLES, can be enabled to bypass runtime interpolation altogether and look up colors from a precomputed table. The memory overhead for this is about 100 kB for the built-in maps, but offers another 10% performance boost.
  • The public API for adding/setting colormaps has been changed accordingly. First, SetColormap(ImPlotColormap), SetColormap(ImVec4*) and PushColormap(ImVec4*) were removed. To add custom colormaps, users now call AddColormap on startup. This takes a string name and either an ImVec4 array or an ImU32 array. To use the map, users call PushColormap with either the returned colormap index or its string name. Built-in colormaps can be pushed similarly.
  • I finished PlotHistogram and PlotHistogram2D. For the latter, I made several small optimizations to the PlotHeatmap implementation, including culling. I haven't yet attempted to make optimizations to the binning algorithms. I will still investigate reducing the loop count as @ozlb suggested.
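
As background on what interpolating packed colors with bit shifts can look like, here is a generic sketch of a well-known trick: two 8-bit channels are blended per 32-bit multiplication by masking alternate bytes. MixU32 and its 0..256 weight convention are assumptions made for illustration; this is not claimed to be the code in this PR or what IMPLOT_MIX64 enables.

```cpp
#include <cstdint>

// Hypothetical sketch: blends two packed 8-bit-per-channel colors.
// t is a fixed-point weight in [0, 256]: 0 returns a, 256 returns b.
inline std::uint32_t MixU32(std::uint32_t a, std::uint32_t b, std::uint32_t t) {
    const std::uint32_t s = 256u - t;
    const std::uint32_t mask = 0x00FF00FFu;
    // Bytes 0 and 2: each product fits in its own 16-bit lane, so both
    // channels are blended at once, then shifted back down by 8.
    const std::uint32_t lo = (((a & mask) * s + (b & mask) * t) >> 8) & mask;
    // Bytes 1 and 3: moved down first so the products fit in 32 bits; the
    // high byte of each 16-bit lane is already in its final position.
    const std::uint32_t hi =
        (((a >> 8) & mask) * s + ((b >> 8) & mask) * t) & 0xFF00FF00u;
    return lo | hi;
}
```

Because the weight is an integer, a float interpolation factor has to be converted once per lookup; results are truncated rather than rounded, so a channel can come out one step lower than a float blend would give.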

@marcizhu, using all of the optimizations above I am able to render your example code above at 105 FPS vs the previous 60 FPS. I believe there may still be some gains to be had, but overall I'm happy with this performance. I would appreciate it if you could pull this PR and report back with any gains you see and/or bugs you find. Please try different combinations of the two compile time options I mentioned.

If both of you could provide feedback on the API and signatures for PlotHistogram and PlotHistogram2D, that would also be helpful.

@epezent
Owner Author

epezent commented Mar 10, 2021

Looping in @jvannugteren. The changes to heatmap and colormaps are relevant to your application. If you'd be willing to test this PR, it'd be much appreciated!

@leeoniya

I've been working on this the past few days...finally wrapped up my PhD

congrats! 💯

@marcizhu

First of all, apologies for my late reply. Due to timezone differences and university I was unable to do the performance tests until now.

I've been working on this the past few days...finally wrapped up my PhD

Congrats!! 😄

@marcizhu [...] I would appreciate it if you could pull this PR and report back with any gains you see and/or bugs you find. Please try different combinations of the two compile time options I mentioned.

Sure thing! I've done some performance tests using both flags. Here's a small table with my results (as always, ymmv):

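IMPLOT_MIX64 flag | IMPLOT_USE_COLORMAP_TABLES flag | Performance
[not captured]    | [not captured]                  | 85-87 FPS
[not captured]    | [not captured]                  | 87-88 FPS
[not captured]    | [not captured]                  | 99-102 FPS
[not captured]    | [not captured]                  | 100-105 FPS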

I am more than happy with these changes. The performance has greatly improved and now ImPlot isn't CPU-bound as it used to be (the macOS performance monitor shows a 70% CPU and around 30% GPU usage for the small example instead of the previous 100% CPU and 10% GPU). I'm sure I could optimize my code a lot more and have even better performance but still, over 100 FPS is an impressive result if we take into account the incredibly big size of the heatmap (1024x256, over a quarter of a million squares redrawn 100 times per second!).

If both of you could provide feedback on the API and signatures for PlotHistogram and PlotHistogram2D, that would also be helpful.

The new API looks great to me: it's easy to use, yet it's powerful and really flexible. I can't suggest any changes, it looks pretty much perfect to me! 😄

@epezent
Owner Author

epezent commented Mar 10, 2021

Thanks Marc! Thorough as always. Glad to see the performance gains here.

I was leaning toward keeping IMPLOT_USE_COLORMAP_TABLES disabled by default, but I'm thinking now that we might want to make it standard. The only downside I see is the memory footprint of the tables. The size of each table is exactly (N-1)*255+1 where N is the number of colors in the colormap. We have 11 colormaps with about 10 colors each, which comes out to 26,786 ImU32 we need to compute and store on startup. The memory footprint for this is 107 kB, which is about the same size as a TTF font file. Of course this grows with each user-added colormap, but I don't think a few hundred kB is too much to ask for. Thoughts?
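
The arithmetic above can be checked with a couple of one-liners. TableEntries and TableBytes are hypothetical names, and std::uint32_t stands in for ImU32 (an assumption that ImU32 is 4 bytes). The exact number of colors in each built-in map is not given in the thread, so only the formula and the quoted total are used.

```cpp
#include <cstddef>
#include <cstdint>

// Entries in one precomputed table for a colormap with n colors,
// following the (N-1)*255+1 formula quoted above.
constexpr std::size_t TableEntries(std::size_t n) {
    return n < 1 ? 0 : (n - 1) * 255 + 1;
}

// Bytes needed to store that many 32-bit packed colors.
constexpr std::size_t TableBytes(std::size_t entries) {
    return entries * sizeof(std::uint32_t);
}

static_assert(TableEntries(10) == 2296, "a 10-color map needs 2,296 entries");
static_assert(TableBytes(26786) == 107144, "26,786 entries take about 107 kB");
```

With exactly 10 colors in each of 11 maps the total would be 25,256 entries, so the quoted 26,786 implies that some of the built-in maps have more than 10 colors.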

@marcizhu

Personally, I think that ~100kB is not much nowadays. Since this project is not aimed at embedded systems, where resources like RAM might be limited, we could easily leave the flag enabled by default so that the library by default offers its best performance unless the user decides to reduce memory footprint in exchange for a bit worse performance.

@jvannugteren

I've tested the PR for my application and encountered no problems with both compile flags enabled. I also got a pretty good performance increase.

@jvannugteren

Today I made an interesting mistake, probably while copy-pasting, which took me a couple of hours to find. I was experiencing vague segfaults and "double linked list corruption", which occurred after leaving the application open for a while. In the end it came down to the following lines of code:

ImPlot::PushColormap(ImPlotColormap_Paired);
// plot some stuff
ImPlot::PopColormap(ImPlotColormap_Paired); // this is not correct

The ImPlotColormap_Paired argument passed to PopColormap is obviously the problem. This compiles fine because PopColormap takes an integer as input. However, the result is that the library attempts to pop the colormap 3 times. As a suggestion, maybe you can add a user assert that checks whether the size of gp.ColormapModifiers is about to fall below zero.
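
A guard of the kind suggested here might look like the sketch below. ColormapStack is a hypothetical stand-in for the library's internal modifier stack, not ImPlot's real types, and a plain assert is used in place of whatever user-facing assert macro the library would choose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pushed-colormap stack.
struct ColormapStack {
    std::vector<int> modifiers;

    void Push(int cmap) { modifiers.push_back(cmap); }

    // Pops `count` entries. Asserts in debug builds, and returns false without
    // touching the stack in release builds, when count exceeds what was pushed.
    bool Pop(int count = 1) {
        const bool ok =
            count >= 0 && static_cast<std::size_t>(count) <= modifiers.size();
        assert(ok && "Cannot pop more colormaps than were pushed");
        if (!ok) return false;
        modifiers.resize(modifiers.size() - static_cast<std::size_t>(count));
        return true;
    }
};
```

With this check, accidentally passing a colormap enum value as the pop count would fail loudly at the call site instead of corrupting memory later.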

@epezent
Owner Author

epezent commented Mar 11, 2021

Probably not a bad idea. I don't believe ImGui does this check for its pops, but there's no reason we can't.

@epezent epezent merged commit 1d9381a into master Mar 17, 2021
@epezent epezent deleted the histogram branch April 1, 2021 00:30
Ben1138 pushed a commit to Ben1138/implot that referenced this pull request Oct 2, 2024
5 participants