LogTicker broken for colormaps that include zero #8061

jbednar · 2018-07-07T16:25:15Z

At least for Bokeh 0.13.0 and earlier, adding a LogTicker() to color_data_map.py makes the numerical colorbar labels disappear, so the colorbar can no longer serve its intended purpose of showing the mapping from colors to numerical values:

color_bar = ColorBar(color_mapper=mapper, location=(0, 0))

color_bar = ColorBar(color_mapper=mapper, location=(0, 0), ticker=LogTicker())

The problem can be avoided by using a lower bound > 0 on the colorbar:

p1 = make_plot(LinearColorMapper(palette=Viridis256, low=1, high=100), title='Viridis256 - Linear, low/high = blue/red')
p2 = make_plot(LogColorMapper(palette=Viridis256, low=1, high=100), title='Viridis256 - Log, low/high = blue/red')

But because having a lower bound of 0 is not a problem for the actual colormapping, it doesn't seem like it should be an issue for the colorbar's ticker, right? Presumably the colorbar is using log1p, while the ticker is trying to take the regular log of 0 and failing. Can we simply have the ticker start at 1 when the colorbar starts at 0?
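The suspected mismatch can be reproduced outside Bokeh. The following is a minimal sketch using only Python's math module, not Bokeh's actual code; safe_log10 is a hypothetical helper introduced here for illustration. It shows that log1p is defined at a lower bound of 0 while the plain logarithm is not:

```python
import math

def safe_log10(x):
    """Return log10(x), or None where the plain logarithm is undefined."""
    try:
        return math.log10(x)
    except ValueError:
        # math.log10 raises ValueError for x <= 0
        return None

low = 0  # lower bound of the color range, as in the failing example

log1p_low = math.log1p(low)  # defined: log(1 + 0) == 0.0
log_low = safe_log10(low)    # undefined: the plain log of 0 fails
```

If the mapper and the ticker really do use these two different functions, only the ticker would hit the undefined case.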

You might argue that we should instead be using a lower bound of 1 for the colormapping, but allowing 0 values for colormapping is an extremely important feature for us with datashader and other types of heatmaps that plot counts: we want log colormapping to handle large counts, yet many cells in a heatmap or raster plot have a count of zero. In any case, I think the ticker behavior should match the colormapping behavior, as the purpose of a ticker is to reveal the colormap.

bryevdv · 2018-07-09T04:58:24Z

Can we simply have the ticker start at 1 when the colorbar starts at 0?

I'm not sure what you are suggesting, concretely. The log ticker doesn't know that it is being used by a color bar. The only way to achieve what you suggest would be to have the colorbar actively intervene and override whatever values the colorbar is set to to begin with, after everything is initialized. But that would be an unusual and atypical thing to do, and, I think, lead to unexpected behavior. It seems like a better and more general solution is just to have a log ticker never try to return ticks for values <= 0, regardless of what the requested range is.

bryevdv · 2018-07-09T05:00:22Z

Or maybe you mean that the colorbar should set its range to avoid this situation? It would clarify things to have a PR with a concrete proposed change to evaluate.

jbednar · 2018-07-09T09:41:30Z

I'm only proposing that it work, not precisely how to do it. :-) Not returning a tick for values <=0 would be a good start, but then it seems like the bottom of the range would be missing a tick. But maybe it could first clip the requested range to something where log is valid, then spread the ticks over that? Or else just use log1p instead of log?
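The "clip, then spread" idea can be sketched in plain Python. This is an illustration under stated assumptions, not Bokeh's LogTicker: the function log_ticks and its floor parameter are hypothetical, and ticks are placed only at powers of 10.

```python
import math

def log_ticks(low, high, floor=1.0):
    """Decade ticks (powers of 10) over [low, high], clipped to >= floor.

    `floor` is a hypothetical parameter: the smallest value a tick may take
    when the requested range reaches zero or below.
    """
    if floor <= 0:
        raise ValueError("floor must be positive")
    if high < floor:
        return []  # no part of the range where a tick can be placed
    start = max(low, floor)  # clip the lower bound to where log is valid
    first = math.ceil(math.log10(start))
    last = math.floor(math.log10(high))
    return [10.0 ** e for e in range(first, last + 1)]
```

With the default floor of 1, a requested range of 0 to 100 yields the same ticks as a range of 1 to 100, so the bottom of the bar still gets a tick at 1 rather than none at all.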

bryevdv · 2018-07-09T17:53:24Z

Or else just use log1p instead of log?

I'm not sure how that would work for the ticker, AFAICT the color mapper uses log1p to define a relative scale (difference in orders of magnitude) at a certain place, but the ticker ultimately has to contend with an absolute range, whatever it actually happens to be. But maybe I am missing something.

It seems the best way to address this is to just have a way to make the colorbar not include zero in its range, when including zero would cause problems. This could be an explicit property (e.g. skip_zero=True) that has to be set on the color bar, or perhaps we could try to make an adjustment to the internally created ranges based on the ticker type:

https://github.com/bokeh/bokeh/blob/master/bokehjs/src/lib/models/annotations/color_bar.ts#L660

I'm happy to entertain a PR for either, you guys know what best will support your needs.

xhongyi · 2018-12-15T16:08:44Z

Up voting this issue. Not working when low is 0 is disappointing.

giogit · 2019-03-29T11:32:34Z

Up voting this issue

bryevdv · 2019-09-28T02:58:43Z

So after doing a survey of Altair, Plotly, and Matplotlib I am unable to find any evidence that any of them support putting 0 on a log axis, except for MPL's symlog, which is both generally regarded as bad and definitely not appropriate for color bars in any case.

Anyone advocating for this will need to supply some references to other tools or libraries that do allow zero so that their policies can be considered and studied. Otherwise I am inclined to close this issue with no action.

I do think we should have color bar log scales stop using log1p for correctness and to be consistent with other log scales, but there is another issue for that.

jbednar · 2019-09-30T18:48:43Z

I'm not certain that the other issue (#8724?) covers everything, as this issue is about a problem with the ticker rather than the colormap itself, but maybe you're right that this problem will go away when the other issue is addressed.

It might be helpful to clarify that for Datashader use cases, wanting a logarithmic mapping that includes 0 comes up primarily because there is no NaN available for integer types. For floating-point types, Datashader uses NaN for areas with no data, which I believe works fine already, but that option is not available for integer types. So while doing a log of 0 doesn't make any sense mathematically, it can make sense as a workaround for the limitations of integer types on computers.

With that in mind, would it be meaningful to somehow mask out the zero values for integer arrays to avoid the problem instead? Datashader can't really do this internally, as a user who has asked for an int64 array should get one (which thus cannot represent NaN), but in HoloViews I suppose we could convert the data to float64 and replace zeros with NaNs before giving the array to Bokeh. Or maybe Bokeh could do that before plotting, i.e. to mask out zero values for integer arrays used with log? I'm pretty sure I haven't thought of all the implications of the various options here; I just know that it's an issue.
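The masking step described above could look roughly like this. It is a sketch in plain Python on nested lists, with a hypothetical helper name; with NumPy arrays the same idea would be a cast to float64 followed by replacing zeros with NaN.

```python
import math

def mask_zero_counts(counts):
    """Return a float copy of a 2D list of integer counts, with 0 -> NaN."""
    return [[math.nan if c == 0 else float(c) for c in row] for row in counts]

counts = [[0, 3], [12, 0]]         # integer counts; zero means "no data here"
masked = mask_zero_counts(counts)  # float copy; the original is left untouched
```

The original integer array is not modified, so a user who asked for integer counts still has them; only the copy handed to the plotting layer carries NaN.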

bryevdv · 2019-09-30T23:43:59Z

Allowing a user-defined sentinel value of some sort (which could be 0, for that matter, if so desired) that gets converted to a color directly before any other transformation seems more reasonable to accommodate that case than inventing new art that does not seem to exist anywhere else. Then the question is: how do you convey what this unique color represents to users (e.g. because it won't necessarily show up in the colorbar; say you map 0 to bright pink, 0 will not be on the colorbar). I think we could leave that to the user to supply in annotations, subtitles or accompanying text, though.

jbednar · 2019-10-01T01:39:07Z

I keep coming back to the specific, extremely well defined case that's the starting point: an array of whole-number counts, specifically used as a 2D histogram in the case of Datashader, but in general any array of whole-number counts would have the same property -- (a) the counts are often distributed logarithmically, making a linear colormap be a poor representation, and (b) a count of zero often occurs. Under those conditions, 0 shouldn't have some arbitrary color; if it's mapped to a non-transparent color, that color should normally be the bottom of a color palette, with increasing values mapping to other colors in the same color palette. So having to display or explain some arbitrary color wouldn't come up; 0 has a very natural mapping to the first color already.

bryevdv · 2019-10-01T02:33:58Z

It sounds like a solution that works for you then? You just choose to map zero to the first bin color.

jbednar · 2019-10-01T13:28:05Z

Sure!

poplarShift · 2019-10-03T13:27:57Z

@bryevdv @jbednar

I think the only use case where a log-transform on data including zero could be consistently advocated for is when you have data that is in fact lognormally (or similar) distributed, but the (measurement) noise is such that some values happen to be "too close" to zero or even below.

However, I would personally argue that in such cases, it is up to the user to do the right data analysis, find the right lower limit for the color map, and map the rest to some outlier value. Bokeh shouldn't be expected to magically do that.

jbednar · 2019-10-03T14:17:57Z

I'm not focusing on cases where "measurement noise" is an issue; my concern is for integer count values, such as those in http://datashader.org/topics/census.html (which has a detailed analysis of why a linear mapping is inappropriate and why a log (well, log1p) mapping reveals the data much more faithfully). For counts, zero is a valid value, not a measurement error, and should map to the lowest slot on the color scale. For floating point values, the situation is quite different and very poorly defined, but luckily in those cases NaN is available to avoid this problem. Users can already mask out the problematic values with NaN, and Bokeh doesn't need to do anything there, but that option isn't available for integer arrays.

jbednar · 2019-10-03T14:19:27Z

Oh, and simply having the user add 1 to the whole integer array to avoid the problem isn't appropriate either, because then the incorrect count values will be shown on the color bar and in hover. The fact that other libraries haven't addressed this problem doesn't make it any less of a problem!

poplarShift · 2019-10-03T18:25:22Z

Meanwhile: "Do not log‐transform count data" :)

I am not an expert but why could your count data not be modelled and then transformed e.g. with a Poisson distribution, or whatever distribution accurately describes the data? Isn't that what histogram equalization does (very roughly) in the end?

bryevdv · 2019-10-03T18:28:47Z

@jbednar @poplarShift while I think this is an interesting conversation, I think it might be better suited to continue on the Datashader issue tracker.

I think we have a clear course of action for this issue now: provide a mechanism to supply mappings from distinguished values to arbitrary colors, making sure that those mappings short circuit any subsequent computations. That's the affordance, and then everyone can use it however suits their needs.
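A rough sketch of that affordance, in plain Python rather than Bokeh's mapper code: map_color and its special argument are hypothetical names, and the binning is a simple log scale over [low, high]. Distinguished values are looked up first, so they never reach the logarithm.

```python
import math

def map_color(value, palette, low, high, special=None):
    """Map `value` to a color from `palette` on a log scale over [low, high].

    `special` is a hypothetical dict of distinguished values -> colors that is
    consulted first, so those values never reach the log computation.
    """
    if special and value in special:
        return special[value]  # short circuit: no log is taken
    if low <= 0 or high <= low:
        raise ValueError("log mapping needs 0 < low < high")
    clamped = min(max(value, low), high)
    frac = (math.log(clamped) - math.log(low)) / (math.log(high) - math.log(low))
    index = min(int(frac * len(palette)), len(palette) - 1)
    return palette[index]

palette = ["#440154", "#31688e", "#35b779", "#fde725"]
zero_as_first = {0: palette[0]}   # the Datashader-style choice for counts
zero_as_pink = {0: "#ff69b4"}     # an arbitrary sentinel color instead
```

Passing zero_as_first reproduces the "map zero to the first bin color" choice discussed earlier in the thread, while zero_as_pink gives zero a color that does not appear on the colorbar.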

This was referenced Jul 7, 2018

ColorBar ticker set to linear if low value <=0 holoviz/holoviews#2865

Merged

colorbar=True is missing text values when logz=True holoviz/holoviews#2606

Closed

bryevdv added the type: discussion label Jul 9, 2018

bryevdv modified the milestone: short-term Sep 11, 2018

jbednar mentioned this issue Dec 15, 2018

Bokeh and HoloViews datashader support to-do list holoviz/datashader#669

Open

poplarShift mentioned this issue Mar 7, 2019

LogColorMapper maps to wrong values #8724

Closed

timothydmorton mentioned this issue Feb 5, 2020

Symlog axis scaling? #6517

Open

sashabaranov mentioned this issue Feb 5, 2020

How to Handle 0 values on log scales #6536

Open

jlstevens mentioned this issue Apr 10, 2020

Handle log colormapping change in bokeh 2.0 holoviz/holoviews#4376

Closed
