Changing default HoloViews colormap #3500

Closed
ahuang11 opened this issue Feb 16, 2019 · 19 comments

Comments

@ahuang11
Collaborator

Since the default cmap's highest value is white, it's hard to tell if the white values are the max or NaN without hover (and most colormaps used do not use gray?)

image

@poplarShift
Collaborator

+1, this makes sense to me. Either that or change the default cmap?

@philippjfr philippjfr added this to the v1.12.0 milestone Feb 20, 2019
@jbednar
Member

jbednar commented Feb 20, 2019

I thought we already had an issue about various problems with the default colormap, but I was unable to find it by searching for "colormap" or "default" or "hot". (ETA: I think it was #2487.) From what I remember, there were many reasons to change HoloViews' original default "hot" colormap, given that the default page color is typically white for Jupyter notebooks:

  1. "hot" is highly nonuniform perceptually as described at colorcet.pyviz.org.
  2. It is impossible to distinguish a NaN area from a peak value, because the peak value of the colormap is the same as the background color.
  3. The highest data values are those most similar to the page color, when for intuitive understanding of a sequentially ordered set of numeric values the highest values should be maximally different from the page color. (That way, isolated patches will be more visible the higher their value, but currently they have less contrast as their values increase.)
  4. (Less crucial) The lowest value in the colormap is black, which can be ambiguous with other data elements in the plot that default to black.

Possible improvements:

  • Since HoloViews 1.7 we have been using "fire" instead, which addresses issue 1, but it still has the other three problems. To illustrate problems 3 and 4, with "fire" on a white background you get sharp perceptual discontinuities between low values and NaN values; it only really works on a dark background:
    image
    image
    (but note that the top plot also illustrates problem 2, i.e. that NaN is not distinguishable).
  • The linear_kry_5_98_c75 colormap in colorcet is a version of fire that doesn't top out at white, avoiding problem 2, but it still has problems 3 and 4.
  • A reversed version of linear_kry_5_98_c75 would avoid problems 1, 2, and 3, but it still has 4 (now at the top end) and also is a bit odd, as it uses colors from black-body radiation (like "hot" and "fire") but in an unintuitive order.
  • We could use viridis as matplotlib does now, solving problems 1, 2, and 4, but it still suffers from 3 in that isolated patches of high-value pixels are very similar to a white page color, instead of showing up very strongly as they should. Parula, gray, magma, inferno, and plasma all suffer from the same problem; people choosing these as good defaults seem not to have considered the issue of isolated pixels and small patches surrounded by NaNs. We can of course reverse any of these colormaps to solve problem 3, but that would be confusing to anyone who is used to seeing them in their native ordering, causing a new problem 5.

Colormaps bgy_r, bmy_r, dimgray_r, kbc_r, and kgy_r (all reversed) from colorcet would meet all the criteria above, but I don't know if anyone would be happy with those as a default.

@ahuang11
Collaborator Author

I like RdYlBu_r, but it probably has its own issues with that being the default

@jbednar
Member

jbednar commented Feb 21, 2019

RdYlBu is a diverging colormap (see http://holoviews.org/user_guide/Colormaps.html), which is useful for plots with a well defined central value (e.g. deviations around a mean, or positive and negative values). Those conditions occur often, but much more often a sequential colormap is appropriate, so I think the default needs to be sequential.
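
As a rough illustration of that distinction: a diverging colormap is usually paired with color limits that are symmetric around the central value, so that the neutral middle color lands on it, while a sequential colormap simply spans the data range. The helper names below are hypothetical and not part of HoloViews; this is only a sketch of the idea.

```python
def sequential_limits(values):
    """Color limits for a sequential colormap: just the data range."""
    return (min(values), max(values))


def diverging_limits(values, center=0.0):
    """Color limits for a diverging colormap, symmetric around `center`.

    Symmetric limits keep the colormap's neutral middle color on `center`.
    """
    half_range = max(abs(v - center) for v in values)
    return (center - half_range, center + half_range)


anomalies = [-1.5, -0.2, 0.4, 3.0]  # e.g. deviations around a mean of 0
print(sequential_limits(anomalies))  # (-1.5, 3.0)
print(diverging_limits(anomalies))   # (-3.0, 3.0)
```

Without a meaningful `center` there is nothing for the symmetric limits to anchor to, which is one way to see why a default has to be sequential.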

@poplarShift
Collaborator

Thanks for this detailed post @jbednar ! Personally I like bgy_r, it is rather similar to viridis in overall appearance.

@jbednar
Member

jbednar commented Jul 17, 2019

I put together a simple notebook with examples of data plotted against the default white background, which is what matters for the default colormap: https://anaconda.org/jbednar/hv_colormap/notebook

In each one, notice how some data disappears if the colormap includes white (e.g. from the heatmaps in the bottom left of each figure, for fire and gray maps), and also notice how the peak data is more intuitive and perceptible if the largest value color is most different from white.

For each one, I've listed which of the above problems 1-5 it has, out of 1-nonuniform, 2-nan=peak, 3-invertedsalience, 4-blackdata, 5-invertedfromfamiliar

viridis: 3-invertedsalience cmap_viridis
viridis_r: 5-invertedfromfamiliar cmap_viridis_r
bgy: 3-invertedsalience cmap_bgy
bgy_r: 5-invertedfromfamiliar cmap_bgy_r
fire: 2-nan=peak, 3-invertedsalience, 4-blackdata cmap_fire
fire_r: 4-blackdata, 5-invertedfromfamiliar cmap_fire_r
cet_linear_kry_5_98_c75: 3-invertedsalience, 4-blackdata cmap_cet_linear_kry_5_98_c75
cet_linear_kry_5_98_c75_r: 4-blackdata, 5-invertedfromfamiliar cmap_cet_linear_kry_5_98_c75_r
gray: 2-nan=peak, 3-invertedsalience, 4-blackdata cmap_gray
gray_r: 4-blackdata cmap_gray_r
kbc: 3-invertedsalience, 4-blackdata cmap_kbc
kbc_r: 4-blackdata cmap_kbc_r
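
The tags for problems 2 to 4 can be approximated from nothing more than a colormap's two endpoint colors and the page color. The sketch below is not how the list above was produced; it is a minimal stdlib check, assuming the well-known viridis endpoints `#440154` and `#fde725` and a plain black-to-white gray ramp. Problems 1 (uniformity) and 5 (familiarity) cannot be judged from endpoints and are left out. The function names are hypothetical.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color given as '#rrggbb'."""
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    # Undo the sRGB transfer curve before weighting the channels.
    linear = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def endpoint_problems(low, high, background="#ffffff"):
    """Return the subset of problems 2-4 suggested by the endpoint colors."""
    problems = set()
    # Problem 2: the peak color is the page color, so NaN looks like the peak.
    if high.lower() == background.lower():
        problems.add("2-nan=peak")
    # Problem 3: the highest value is closer to the page than the lowest is.
    bg = relative_luminance(background)
    if abs(relative_luminance(high) - bg) < abs(relative_luminance(low) - bg):
        problems.add("3-invertedsalience")
    # Problem 4: one end of the map is true black, like default plot elements.
    if "#000000" in (low.lower(), high.lower()):
        problems.add("4-blackdata")
    return problems


print(sorted(endpoint_problems("#000000", "#ffffff")))  # gray
print(sorted(endpoint_problems("#440154", "#fde725")))  # viridis
```

Swapping the two arguments gives the `_r` variants; passing `background="#000000"` shows how the verdicts flip on a dark page.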

@jbednar
Member

jbednar commented Jul 17, 2019

To me, viridis and viridis-like colors are problematic because matplotlib defaults to the non-reversed version, where the largest values are those most similar to the background. Because people expect viridis to be in this order, even though it's incorrect for a white background as can be seen above (e.g. in the hextiles), I think we should choose a different one. Same goes for fire and related maps -- people expect them to be in the non-reversed order, which is even worse for fire because the non-reversed one actually represents the color sequence from black-body radiation, and thus reversing it is very odd (with higher values corresponding to lower-temperature colors, confusingly). Gray can be in any order, but it's always going to overlap in color with plot elements (axes, annotations, etc.) unless we limit to a very small range of medium grays, which will limit dynamic range.

So my own vote, grudgingly, is for kbc_r, which fits all the criteria but #4, because I don't think it has any natural preferred order that people expect. For #4, it actually doesn't quite include black; the darkest color is the very dark blue "#00004e", which is (barely) distinguishable from black:

image

If we really want true black to be fully distinguishable from the highest value, we could lose the top couple of color values, though that would be a bit confusing because it would no longer be the standard 256 colors in length.
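
To put a number on "(barely) distinguishable", one option is the WCAG-style contrast ratio between "#00004e" and true black. This is only a sketch: the contrast ratio is a text-legibility measure borrowed here as a rough proxy, and the function names are hypothetical.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color given as '#rrggbb'."""
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    linear = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG-style contrast ratio, from 1 (identical) to 21 (black vs. white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


darkest = "#00004e"  # darkest kbc color quoted above
print(round(contrast_ratio(darkest, "#000000"), 2))  # against black elements
print(round(contrast_ratio(darkest, "#ffffff"), 1))  # against the white page
```

The first ratio comes out at about 1.1, only slightly above the minimum of 1, while the second is close to the black-on-white maximum, which fits the description of a color that reads as nearly black on the page.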

@jbednar
Member

jbednar commented Jul 17, 2019

BTW, I just watched a live talk where someone used Viridis in the wrong order, which applies to almost every use of viridis I see. On a white page, the default ordering is only ever correct for plots with no background visible, which happens but is relatively rare.

@poplarShift
Collaborator

poplarShift commented Jul 17, 2019

Spacey, I love it!

Anyway, I can confirm; I used bgy_r for almost all figures of a talk a few months back, and some people were very determined I used it the "wrong" way. I'm not sure I'd go as far as calling the standard use of viridis "wrong" either (bright colours close to red do appear more drastic I guess) but your mileage advocating for one or the other may vary...

@jbednar
Member

jbednar commented Jul 17, 2019

Right; people's expectations are a big problem. Viridis (like the other 3 main mpl uniform colormaps) is totally in the right order for a black background; the problem is just that Jupyter and PDF backgrounds are nearly always white, giving very unsuitable results.

BTW, anyone can propose another possible colormap here, and I'll run that notebook on it (or you can run it yourself and paste the resulting .png).

@jbednar
Member

jbednar commented Jul 17, 2019

Also note that the default annotation color is blue, which doesn't pop out against kbc as much as against fire, but it's still distinguishable and also kind of fits the theme:

image

@philippjfr
Member

philippjfr commented Jul 18, 2019

I think fighting against people's expectations is potentially very problematic. It's definitely inarguable that non-reversed colormaps against a white background are problematic because they make the highest values hard to distinguish from the background, but I also definitely wouldn't say they are wrong. The possibility that people will completely confuse the upper and lower bound is imo something that we should not discount, and it has potentially much greater implications for misinterpreting data. People are used to dark-to-light color scales, for better or for worse, and I'm still severely hesitant to use a default that subverts those expectations.

@jbednar
Member

jbednar commented Jul 18, 2019

Somehow all of science seems to have gotten confused, but I am very reluctant to go along with their confusion. I'm embarrassed on behalf of scientists everywhere!

I do very strongly assert that non-reversed colormaps are in fact wrong (incorrect and inappropriate) to use against a white background, because (a) people have to fight their own visual systems to interpret them correctly, and (b) people have a large danger of missing the most salient values, and (c) such plots only work at all because of properties of typical data, which makes them fall down for atypical data and outliers.

Consider kbc vs. kbc_r, for this plot of a 2D normal distribution on a white background:

image

  1. kbc highlights the very low-count cells around the edges, which with a non-reversed colormap are extremely salient to the visual system but of absolutely no scientific interest in normal cases -- that pattern is just the noise floor of this measurement, not the underlying circular pattern you are trying to measure at all! Those noisy patterns completely detract from the actual signal, making even quite different central regions look about the same because they are all always surrounded by noise.
  2. For kbc_r, the center of the distribution, i.e. the highest count regions, are what pop out across the page, which is where most of the data is and the strongest signal is. The light-blue regions are just noise, poorly sampled and thus with a shape that is of very little consequence, and so it is suitable that they are perceptually de-emphasized (but still detectable).
  3. What would happen if there were a bug in this experiment and one particular cell along the margins happened to have highly anomalous values, as strong as the biggest peak of the data? E.g. if -2,-2 had a value of 25?
    • For kbc, it is very unlikely that anyone would ever notice the bug, because it would be a very weakly colored spot in the midst of a bunch of very darkly colored spots and white spots. Worse, the more anomalous the result, both in value and in how far out it is, the less salient it would be.
    • For kbc_r such a bug (or momentous discovery!) would jump out immediately, being much darker both than its neighbors and the surrounding white page. The more anomalous the value, the more salient it would be, as appropriate.
  4. Conversely, what would happen if there were the opposite bug, with missing data (NaN) near the center of this distribution?
    • For kbc, almost no one would notice that problem or make that discovery either, because the white pixel would be very similar to its very light blue neighbors. Only someone specifically looking for that would notice it.
    • For kbc_r, the missing data would jump out immediately, in contrast to the very dark neighbors.
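
The two scenarios in items 3 and 4 can be reduced to a toy calculation. The sketch below assumes a linear gray ramp as a stand-in for the real colormap and a white page with lightness 1.0, and it treats a NaN cell as showing the bare page; all names are hypothetical. `high_is_dark=False` corresponds to a map whose highest values are lightest, `high_is_dark=True` to one whose highest values are darkest.

```python
PAGE_LIGHTNESS = 1.0  # white page; also how a NaN (undrawn) cell appears


def lightness(value, high_is_dark):
    """Map a normalized value in [0, 1] to a lightness in [0, 1].

    A linear gray ramp stands in for the real colormap (an assumption).
    """
    return 1.0 - value if high_is_dark else value


def contrast_with_page(value, high_is_dark):
    """How strongly a cell with this value stands out from the white page."""
    return abs(lightness(value, high_is_dark) - PAGE_LIGHTNESS)


for high_is_dark in (False, True):
    outlier = contrast_with_page(1.0, high_is_dark)  # anomalous peak value
    hole = contrast_with_page(0.9, high_is_dark)     # NaN hole vs. 0.9 neighbors
    print(f"high_is_dark={high_is_dark}: outlier={outlier:.1f}, hole={hole:.1f}")
```

With the highest values lightest, the outlier has zero contrast with the page and the hole differs from its neighbors by only 0.1; with the highest values darkest, both stand out (1.0 and 0.9). A real colormap is not a gray ramp, so the numbers are only indicative of the direction of the effect.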

Given all this, why would anyone ever use kbc or other similar colormaps against white? Well, (a) because it's the default (currently completely inappropriately but perhaps appropriately back when people used black-background monitors and presentation slides), and (b) they can still see the shape of the high-count regions because of the typically normal distribution of data values -- gradual shift from low counts to high counts, with high counts clustered spatially.

I.e., it's possible to get an understanding of the data out of the "wrong" plot, but it's a very dangerous thing to do, because any of the less-common cases above could occur without anyone realizing it. The point of a visualization is to reveal the data, not to paint a picture of what you already expect based on previous typical examples. If a visualization reveals the typical case ok but is completely misleading in atypical cases, it's a demonstrably incorrect thing to do, if there's another alternative that doesn't have that problem.

As shown above, reversing viridis or fire would fix the problem, but causes another problem with people's now completely trained bogus expectations. I don't think kbc suffers from those expectations; as a less familiar colormap, I think people will look at the colorbar if they are confused.

@jbednar
Member

jbednar commented Jul 18, 2019

Turns out it's easy enough to demonstrate problems 3 and 4 directly, by moving all the points near 0,0 by -2, -2:

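import numpy as np
import holoviews as hv
from holoviews import opts
hv.extension("bokeh")

np.random.seed(44)
a = np.random.randn(5000, 2)
v = 0.09
# Shift every point within v of the origin by (-2, -2): this leaves a hole
# at 0,0 and creates an artificial peak at -2,-2.
a[np.logical_and(np.logical_and(a[:,0]>-v, a[:,0]<v),
                 np.logical_and(a[:,1]>-v, a[:,1]<v))] -= 2
hex_tiles = hv.HexTiles(a)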

hex_tiles.options(opts.HexTiles(width=300, height=250, colorbar=True, cmap="kbc")) + \
hex_tiles.options(opts.HexTiles(width=300, height=250, colorbar=True, cmap="kbc_r"))

image

Fun game -- try to spot the outlier at -2,-2, which at a count of 25 should be the highest value in this plot. Same goes for the hole at 0,0. Still think it's ok to use kbc or similarly inappropriate maps like viridis and fire, when the white background is visible? With kbc_r you can immediately tell that the distribution has been doctored like this. With kbc, who's to know?
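
As a sanity check on the size of that doctored peak, the expected number of points that get moved can be worked out directly: for independent standard-normal coordinates, the chance of landing in the box |x| < v, |y| < v is erf(v/√2) squared. This is a stdlib sketch with a hypothetical function name; the count in any one hex tile also depends on the random seed, on the points already near -2,-2, and on how the moved box lines up with the tiles.

```python
import math


def expected_points_in_box(n_points, half_width):
    """Expected count of 2D standard-normal samples with |x| < v and |y| < v."""
    # P(|x| < v) for a standard normal variable, via the error function.
    p_one_axis = math.erf(half_width / math.sqrt(2))
    return n_points * p_one_axis ** 2


print(round(expected_points_in_box(5000, 0.09), 1))  # about 25.7
```

So roughly 26 of the 5000 points are expected to be shifted, in line with the count of 25 quoted above. The same formula applied to the 50,000-point, v = 0.12 variant later in the thread gives an expectation in the mid-400s.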

@jbednar
Member

jbednar commented Jul 18, 2019

And of course the opposite is true for a black background, with kbc being the one to convey it accurately:

hex_tiles.options(opts.HexTiles(width=300, height=250, bgcolor="black", colorbar=True, cmap="kbc")) + \
hex_tiles.options(opts.HexTiles(width=300, height=250, bgcolor="black", colorbar=True, cmap="kbc_r"))

image

@jbednar
Member

jbednar commented Jul 18, 2019

Plus, the problems don't get any better with more data, as there will always be some edge of the data that becomes ambiguous:

np.random.seed(44)
a = np.random.randn(50000, 2)
v = 0.12
a[np.logical_and(np.logical_and(a[:,0]>-v, a[:,0]<v),
                 np.logical_and(a[:,1]>-v, a[:,1]<v))] -= 2.82
hex_tiles = hv.HexTiles(a)

hex_tiles.options(opts.HexTiles(width=300, height=250, colorbar=True, cmap="kbc")) + \
hex_tiles.options(opts.HexTiles(width=300, height=250, colorbar=True, cmap="kbc_r"))

image

hex_tiles.options(opts.HexTiles(width=300, height=250, bgcolor="black", colorbar=True, cmap="kbc")) + \
hex_tiles.options(opts.HexTiles(width=300, height=250, bgcolor="black", colorbar=True, cmap="kbc_r"))

image

@philippjfr philippjfr modified the milestones: v1.13.0, v1.13.x Jan 15, 2020
@jbednar jbednar changed the title from "Thoughts on defaulting NaN values on rasters to gray?" to "Changing default HoloViews colormap" Nov 27, 2020
@jbednar
Member

jbednar commented Nov 27, 2020

As discussed in #4567, there are now many reasons to change the default colormap. I'm again convinced by my diatribe above that kbc_r is our best option for a colormap. It's perceptually uniform, appropriate for a white background, is not typically used widely and thus has no expected ordering, is completely different from the default so that people cannot fail to notice the change (compared to just flipping the fire order and deleting white), is compatible with the other default colors in HoloViews, is already the default in hvPlot, and looks pretty good to me.

What does it take to make it the default? Presumably copy kbc to holoviews/plotting/util.py alongside fire, add kbc_r, and update these references?

./plotting/bokeh/__init__.py:178: options.Spikes = Options('style', color='black', cmap='fire', muted_alpha=0.2)
./plotting/bokeh/__init__.py:161: dflt_cmap = 'fire'
./plotting/plotly/__init__.py:103: dflt_cmap = 'fire'
./plotting/mpl/__init__.py:223: dflt_cmap = 'fire'
./plotting/mpl/__init__.py:240: options.Surface = Options('style', cmap='fire')
./plotting/mpl/__init__.py:241: options.Spikes = Options('style', color='black', cmap='fire')
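
For illustration only, the kind of mechanical edit those grep hits call for could be sketched as a small regex substitution. This is not how the change was actually made in HoloViews; `switch_default_cmap` is a hypothetical helper, and it only handles the `cmap='fire'` and `dflt_cmap = 'fire'` patterns listed above.

```python
import re

# Matches cmap='fire' and dflt_cmap = 'fire' (single or double quotes),
# but not other names that merely start with "fire", such as 'fire_r'.
_FIRE_CMAP = re.compile(r"""(cmap\s*=\s*)(['"])fire\2""")


def switch_default_cmap(source, new="kbc_r"):
    """Return `source` with fire colormap assignments pointed at `new`."""
    def _swap(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{new}{quote}"
    return _FIRE_CMAP.sub(_swap, source)


line = "options.Spikes = Options('style', color='black', cmap='fire')"
print(switch_default_cmap(line))
```

Whether every hit should change is a separate judgment call; a plot type that relied on the black-to-white range of fire, for example, might need a different choice.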

@jbednar jbednar mentioned this issue Nov 27, 2020
3 tasks
@jbednar
Member

jbednar commented Dec 8, 2020

Default was changed to kbc_r in HoloViews 1.14.0.

@jbednar jbednar closed this as completed Dec 8, 2020
@philippjfr philippjfr removed this from the v1.14.x milestone May 18, 2021
@philippjfr philippjfr added this to the v1.14.4 milestone May 18, 2021

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2024