This repository has been archived by the owner on Jun 3, 2024. It is now read-only.

Add color_range argument to scatter #72

Closed
gte620v opened this issue May 8, 2019 · 17 comments

Comments

@gte620v

gte620v commented May 8, 2019

Being able to set the color range for continuous color columns would be very useful. e.g.:

px.scatter(
    tips, x="total_bill", y="tip",
    color="size", facet_col="sex",
    color_continuous_scale=px.colors.sequential.Viridis,
    color_min=0, color_max=10,  # <--- new args support requested
)

Currently the behavior seems to use different color scales across row and column facets, which means that the single color-scale legend is inaccurate.

@nicolaskruchten
Contributor

> Currently the behavior seems to use different color scales across row and column facets

I don't believe that's the case, and if it were it would be a bug... Can you provide an example?

The code that internally does this computation is:

https://github.com/plotly/plotly_express/blob/master/plotly_express/_core.py#L657-L665

I should note that this will change once plotly.js 1.48 comes out, which supports native cross-trace colorscale sharing.

Either way, if you have an example where px is not correctly sharing colorscales across facets I'd love to see it!
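To make "sharing a colorscale across facets" concrete, here is a minimal sketch in plain Python. It is not the code linked above: `per_facet_ranges` and `shared_range` are hypothetical helpers, and the facet values are illustrative. The point is only that one (min, max) pair computed over all facets gives every subplot the same value-to-color mapping, whereas ranges computed per facet do not.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]


def per_facet_ranges(facets: Dict[str, List[float]]) -> Dict[str, Range]:
    """Color range each facet would get if it autoscaled on its own data."""
    return {name: (min(vals), max(vals)) for name, vals in facets.items()}


def shared_range(facets: Dict[str, List[float]]) -> Range:
    """Single color range computed over the values of every facet."""
    all_values = [v for vals in facets.values() for v in vals]
    return (min(all_values), max(all_values))


# Illustrative color values for two facets (hypothetical data).
facets = {"col=0": [0.0, 0.0625, 0.79], "col=1": [0.0, 0.008, 0.067]}

independent = per_facet_ranges(facets)  # a different range per facet
shared = shared_range(facets)           # one range for the whole figure
```

With independent ranges, 0.067 sits at the top of the scale in one facet while the same value would be near the bottom in the other; with the shared range it maps to the same color everywhere, so a single colorbar stays accurate.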

@nicolaskruchten
Contributor

The example you've posted seems to do the right thing for me atm without the new arguments:

image

@nicolaskruchten
Contributor

Here's a clearer version actually:
image

@gte620v
Author

gte620v commented May 8, 2019

import plotly_express as px
import pandas as pd
data = {
    "x": {
        0: 36,
        1: 8,
        2: 35,
        3: 2,
        4: 25,
        7: 29,
        8: 9,
        9: 13,
        10: 15,
        12: 19,
        13: 21,
        14: 6,
        15: 4,
        16: 33,
        17: 1,
        18: 5,
        19: 17,
        20: 11,
        21: 23,
        22: 27,
        23: 31,
    },
    "row": {
        0: 6.0,
        1: 6.0,
        2: 6.0,
        3: 6.0,
        4: 6.0,
        7: 6.0,
        8: 6.0,
        9: 6.0,
        10: 6.0,
        12: 6.0,
        13: 6.0,
        14: 6.0,
        15: 6.0,
        16: 6.0,
        17: 6.0,
        18: 6.0,
        19: 6.0,
        20: 6.0,
        21: 6.0,
        22: 6.0,
        23: 6.0,
    },
    "col": {
        0: -5.0,
        1: 1.0,
        2: -2.0,
        3: 5.0,
        4: 3.0,
        7: -1.0,
        8: 4.0,
        9: 4.0,
        10: 0.0,
        12: 0.0,
        13: 3.0,
        14: 1.0,
        15: 5.0,
        16: -3.0,
        17: 5.0,
        18: 1.0,
        19: 0.0,
        20: 4.0,
        21: 3.0,
        22: -1.0,
        23: -3.0,
    },
    "color_y": {
        0: 0.0,
        1: 0.0,
        2: 0.0,
        3: 0.0,
        4: 0.0,
        7: 0.0,
        8: 0.04500381388253241,
        9: 0.0,
        10: 0.0625,
        12: 0.0,
        13: 0.0,
        14: 0.06666666666666667,
        15: 0.0,
        16: 0.0,
        17: 0.007744954707029009,
        18: 0.008126576654053435,
        19: 0.788848364894659,
        20: 0.9773527528809217,
        21: 0.907694404153694,
        22: 0.013756613756613755,
        23: 0.00870253164556962,
    },
}
df = pd.DataFrame(data)

px.scatter(
    df,
    x="x",
    y="color_y",
    color="color_y",
    facet_col="col",
    facet_row="row",
    color_continuous_scale=px.colors.sequential.Sunset,
)

image

The y value in each subplot should be the same as the color. But clearly, there are purple values which have very low y values and thus shouldn't be purple. Also, in the third column, there is a red dot that has a color value of zero.

@nicolaskruchten
Contributor

Thanks! That certainly seems like a bug, I'll get cracking on fixing it :)

@gte620v
Author

gte620v commented May 8, 2019

Great! Any chance you would add args for setting the color_range too? Seems to require only a couple lines of change. I can do a PR if that would help.

@nicolaskruchten
Contributor

What would be the use case once this bug is fixed?

@gte620v
Author

gte620v commented May 8, 2019

I think there are lots of use cases. Matplotlib scatter supports it with vmin, vmax: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html.

Seaborn also supports it.

The main use case is to have the same color scale across multiple plots. For instance, if I make a px.scatter plot of data today and then a new plot of updated data tomorrow, I want the color scale for those two plots to be the same. If my max color column value today is 0.5 and tomorrow it is 1.0, it will be very hard to make visual inferences across the two days' worth of plots, since the color scales will change drastically.
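The arithmetic behind this use case can be sketched in a few lines. The helper `colorscale_position` is hypothetical (it is not a Plotly or Matplotlib function) and assumes a linear colorscale: it returns where a value lands between the low end (0.0) and the high end (1.0) of the scale.

```python
def colorscale_position(value: float, cmin: float, cmax: float) -> float:
    """Fraction of the way along a linear colorscale, clamped to [0, 1]."""
    if cmax == cmin:
        return 0.0  # degenerate range: everything maps to the low end
    fraction = (value - cmin) / (cmax - cmin)
    return min(1.0, max(0.0, fraction))


value = 0.5

# Auto-ranged: each day's scale ends at that day's maximum.
today_auto = colorscale_position(value, cmin=0.0, cmax=0.5)     # top of the scale
tomorrow_auto = colorscale_position(value, cmin=0.0, cmax=1.0)  # middle of the scale

# Fixed range: the same value lands at the same position on both days.
today_fixed = colorscale_position(value, cmin=0.0, cmax=1.0)
tomorrow_fixed = colorscale_position(value, cmin=0.0, cmax=1.0)
```

Under auto-ranging, 0.5 is drawn with the "maximum" color today and a mid-scale color tomorrow; with a fixed range the two plots are directly comparable.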

@nicolaskruchten
Contributor

Thanks for the crisp justification, I'm convinced :)

A couple of considerations:

  1. the internal implementation of colorscales/bars/ranges will change a lot in about a week when plotly.js 1.48 comes out, so it may make sense to hold off until then
  2. what would be a good API? for the positional axes we have range_x=[min, max] so it would sort of make sense to have range_color=[min, max]... thoughts?

@gte620v
Author

gte620v commented May 8, 2019

range_color=[min, max] sounds great. Would be nice if we could set either value in the list to None too and it would revert to default behavior.

e.g.: range_color=[0, None].

This is how plt.xlim([min, max]) works. That is, you can replace either min or max with None.
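A minimal sketch of that behavior, assuming None means "fall back to the data's own minimum or maximum". `resolve_range_color` is a hypothetical helper for illustration only, not part of plotly_express or plotly.js.

```python
from typing import Optional, Sequence, Tuple


def resolve_range_color(
    values: Sequence[float],
    range_color: Optional[Sequence[Optional[float]]] = None,
) -> Tuple[float, float]:
    """Fill in missing ends of range_color from the data.

    A bound given as None (or a missing range_color altogether) falls back
    to the data minimum / maximum; explicit bounds are used unchanged.
    """
    low, high = range_color if range_color is not None else (None, None)
    if low is None:
        low = min(values)
    if high is None:
        high = max(values)
    return (low, high)


color_values = [0.2, 0.5, 0.9]

both_auto = resolve_range_color(color_values)              # (0.2, 0.9)
pinned_low = resolve_range_color(color_values, [0, None])  # (0, 0.9)
fully_fixed = resolve_range_color(color_values, [0, 1])    # (0, 1)
```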

BTW, you've done a fantastic job with plotly express... It is the plotting package I have always wanted. :-)

gte620v changed the title from "Add color_min/color_max arguments to scatter" to "Add color_range arguments to scatter" May 8, 2019
gte620v changed the title from "Add color_range arguments to scatter" to "Add color_range argument to scatter" May 8, 2019
@nicolaskruchten
Contributor

the None thing is unfortunately not well-supported in plotly.js so that likely won't work out of the box but we might be able to emulate it at the px level :)

And thanks for the kind words! It's also the plotting package I have always wanted (modulo the occasional 🙈 bug like this one!) and it's nice to know I'm not the only one :)

@nicolaskruchten
Contributor

I just pushed v0.1.9 to PyPI with the fix... pushing to conda-forge in a bit :)

@nicolaskruchten
Contributor

and up on conda now... will leave this issue as the range_color tracking issue.

@gte620v
Author

gte620v commented May 8, 2019

Wow, impressively fast. Thanks!

@nicolaskruchten
Contributor

(embarrassment is a strong motivator to fix things fast :P)

nicolaskruchten added a commit that referenced this issue May 30, 2019
@gte620v
Author

gte620v commented Jun 3, 2019

🎉

@nicolaskruchten
Contributor

(will be in the next release... hopefully today!)
