Support datashade points hover #1430

ahuang11 · 2024-10-02T23:44:38Z

Adds support for hovering over datashaded points

import pandas as pd
import hvplot.pandas
import datashader as ds

df = pd.DataFrame(
    {
        "lon": [-86.75, -86.75, -86.25, -86.25],
        "lat": [33.75, 34.0, 34.49, 34.5],
        "population": [100, 200, 300, 400],
        "c": ["A", "B", "A", "B"],
    }
)

p = df.hvplot.points(
    "lon",
    "lat",
    hover_cols="all",
    datashade=True,
    dynspread=True,
    aggregator=ds.count_cat("c"),
    hover_tooltips=["lat", "lon", ("Population", "@population"), ("Alpha", "@A"), ("Beta", "@B")],  # not necessary
    padding=0.2
)
p

Area.mp4

ahuang11 · 2024-10-03T00:21:33Z

Tap instead of PointerXY for big datasets

Screen.Recording.2024-10-02.at.5.21.53.PM.mov

codecov · 2024-10-03T00:45:50Z

Codecov Report

Attention: Patch coverage is 46.15385% with 35 lines in your changes missing coverage. Please review.

Project coverage is 88.57%. Comparing base (b36e3a1) to head (c792ae2).
Report is 1 commit behind head on main.

Files with missing lines	Patch %	Lines
hvplot/converter.py	36.36%	35 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1430      +/-   ##
==========================================
- Coverage   88.94%   88.57%   -0.37%     
==========================================
  Files          52       52              
  Lines        7751     7808      +57     
==========================================
+ Hits         6894     6916      +22     
- Misses        857      892      +35


hoxbro

Some comments:

Does this work with other data backends like dask.dataframe?
This is a (small) breaking change as the output of the DynamicMap generated by hvplot (with datashade=True, hover=True) is no longer an RGB but an Overlay. This could lead to problems in the future.
We could add inspect as an option to hover to enable this feature for now. Then, we can always make it the default in the next minor release.

hoxbro · 2024-10-03T06:15:19Z

hvplot/converter.py

+                )
+
+            stream = PointerXY
+            if len(self.data) > 10000:


I am not a fan of these magic numbers, with no way to change it.

Yes, allowing that value to be changed is a good improvement. I wouldn't add it to the signature though; maybe just make it a very explicitly named kwarg that is caught there if defined and defaults to 10000 if not.
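
A rough sketch of how that suggestion could look, reading the threshold from an explicitly named keyword argument and falling back to 10000. The names `hover_tap_threshold`, `DEFAULT_HOVER_TAP_THRESHOLD`, and `pick_hover_stream` are hypothetical and not part of hvplot; the stream is returned as a string so the snippet runs without HoloViews installed.

```python
# Hypothetical default; mirrors the literal 10000 in the hunk above.
DEFAULT_HOVER_TAP_THRESHOLD = 10_000


def pick_hover_stream(n_rows, **kwargs):
    """Return the name of the stream to use for hover inspection.

    `hover_tap_threshold` is an assumed kwarg name, not an existing hvplot option.
    """
    # Catch the kwarg if the caller defined it, otherwise use the default.
    threshold = kwargs.pop("hover_tap_threshold", DEFAULT_HOVER_TAP_THRESHOLD)
    # Above the threshold, fall back to Tap instead of PointerXY.
    return "Tap" if n_rows > threshold else "PointerXY"


print(pick_hover_stream(4))
print(pick_hover_stream(50_000))
print(pick_hover_stream(500, hover_tap_threshold=100))
```

Keeping the value out of the signature, as suggested, means the only visible change is one named constant and one `kwargs.pop` call.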

hoxbro · 2024-10-03T06:17:04Z

hvplot/converter.py

+            for i in range(1, 10):
+                if key in df.columns:
+                    key = f'Count_{i}'
+                else:
+                    break


This looks weird to me

What's weird about it? It's deduplicating columns; do you have a suggestion for an alternative?

Why 10?

I can't wrap my head around what the code does.

Just about any scalar value other than 0 or 1 should be documented very clearly, usually by making a Parameter or at least attribute declaration with a meaningful name and associated docstring.
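
For illustration only, here is one way the deduplication could be written with the bound named and documented. This assumes the intent is to find a `Count`-style column name that is not already taken; `unique_count_column` and `MAX_COUNT_SUFFIX` are hypothetical names, and the base name `"Count"` is inferred from the `Count_{i}` pattern in the hunk.

```python
# Hypothetical named bound replacing the bare 10 in range(1, 10).
MAX_COUNT_SUFFIX = 9


def unique_count_column(existing_columns, base="Count", max_suffix=MAX_COUNT_SUFFIX):
    """Return `base`, or `base_<i>` for the first suffix i that is not taken."""
    if base not in existing_columns:
        return base
    for i in range(1, max_suffix + 1):
        candidate = f"{base}_{i}"
        if candidate not in existing_columns:
            return candidate
    # Fail loudly instead of silently reusing a name that already exists.
    raise ValueError(f"No free column name found for {base!r} up to suffix {max_suffix}")


print(unique_count_column(["lon", "lat", "Count"]))
```

Checking the candidate name rather than the current key on each iteration makes it easier to see when the loop stops.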

hoxbro · 2024-10-03T06:22:56Z

hvplot/converter.py

+                    break
+            agg_value = pd.Series([len(df)], index=[key])
+
+        # take the mean of numeric columns


Is the rest of the method generated with AI? If so, please clearly state so.

Otherwise, please explain why you take the mean. In general, comments should not explain what is done but why it is done.

No, it's not AI. I decided to add comments because the code was getting chunky and English is easier to navigate, especially since I use the exclude keyword for object columns instead of the include keyword (I had to do a double take when rereading my code), and I wanted to highlight that.

Comments should not explain what is done but why it is done.

Sometimes it's easier to read English than code and guidelines don't necessarily have to be followed all the time

Sorry about that. It just had the structure of an AI output.

... guidelines don't necessarily have to be followed all the time.

I agree, but the use of the mean needs to be explained, either in the code or in the original comment.

This can return a value that does not exist in the original DataFrame when more than one point from the input DataFrame is aggregated. For example, the two close points in your example will return 350 for the population. Is this okay? Maybe, but you need to state that you have made this decision.
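
A small sketch of the 350 case, using the DataFrame from the PR description. It assumes the last two rows are the points that get aggregated together and takes the mean of `population` directly; it does not reproduce the PR's actual aggregation code.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "lon": [-86.75, -86.75, -86.25, -86.25],
        "lat": [33.75, 34.0, 34.49, 34.5],
        "population": [100, 200, 300, 400],
        "c": ["A", "B", "A", "B"],
    }
)

# Assumption: these are the two close points (lat 34.49 and 34.5).
nearby = df.iloc[2:]
mean_population = nearby["population"].mean()

# The aggregated value is not one of the original population values.
appears_in_data = mean_population in df["population"].tolist()
print(mean_population, appears_in_data)
```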

@ahuang11 , @hoxbro thinks you are so intelligent that it must be artificial! :-)

hoxbro · 2024-10-03T06:49:36Z

hvplot/converter.py

@@ -842,13 +842,22 @@ def __init__(
        if kind == 'errorbars':
            hover = False
        elif hover is None:
-            hover = not self.datashade
+            hover = True


Suggested change

-            hover = True
+            hover = (self.datashade and self.kind == 'points') or not self.datashade

I'm not sure if hover should always be True.
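
To make the suggested expression easier to reason about, here is a standalone sketch that evaluates it for each combination. `default_hover` is a hypothetical helper, not hvplot code, and it ignores the earlier `errorbars` branch. With a boolean `datashade`, the expression reduces to "hover unless datashading something other than points".

```python
def default_hover(datashade, kind):
    """The suggested default, written as a plain function."""
    return (datashade and kind == "points") or not datashade


def default_hover_simplified(datashade, kind):
    """Equivalent form, assuming `datashade` is a bool."""
    return (not datashade) or kind == "points"


for datashade in (False, True):
    for kind in ("points", "line"):
        print(datashade, kind, default_hover(datashade, kind))
```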

ahuang11 · 2024-10-03T07:20:07Z

hvplot/converter.py

+        agg_series_map[agg_col] = agg_value
+
+        # concat all series into a single dataframe which has one row
+        df_hover = pd.concat(agg_series_map.values()).to_frame().transpose()


This line especially would be really confusing without the comment, given everything the chained methods are doing.

Agree. But why is this needed? Why is a pd.Series not good enough? We could add an extra line explaining this:

Suggested change

-        df_hover = pd.concat(agg_series_map.values()).to_frame().transpose()
+        # This is needed as the function used in `inspect_points.transform` must return a DataFrame.
+        df_hover = pd.concat(agg_series_map.values()).to_frame().transpose()

At some point, a person could look at this line and remove .to_frame().transpose(), and if no tests fail, people could think the conversion was not needed.
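
A minimal sketch of what the chained calls produce, with made-up contents for `agg_series_map` (the real values come from the PR's aggregation step). It shows that `pd.concat` alone yields a Series, and that `.to_frame().transpose()` turns it into a one-row DataFrame.

```python
import pandas as pd

# Illustrative stand-in for the map built in the converter.
agg_series_map = {
    "Count": pd.Series([2], index=["Count"]),
    "population": pd.Series([350.0], index=["population"]),
}

# Concatenating the per-column Series gives a single Series.
combined = pd.concat(agg_series_map.values())

# to_frame() makes it a one-column DataFrame; transpose() makes it one row.
df_hover = combined.to_frame().transpose()

print(type(combined).__name__, type(df_hover).__name__, df_hover.shape)
```

A test that asserts on the return type and shape would guard against someone dropping the conversion later.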

philippjfr · 2024-10-03T15:11:55Z

This is a (small) breaking change as the output of the DynamicMap generated by hvplot (with datashade=True, hover=True) is no longer an RGB but an Overlay. This could lead to problems in the future.

Yes, I haven't reviewed in detail but if this is the approach taken we should wait for https://github.com/orgs/holoviz/projects/9 to be addressed properly rather than "hacking" it with a fake layer at the hvplot layer.

maximlt · 2024-10-04T12:57:59Z

hvplot/converter.py

+                )
+
+            stream = PointerXY
+            if len(self.data) > 10000:


Yes, allowing that value to be changed is a good improvement. I wouldn't add it to the signature though; maybe just make it a very explicitly named kwarg that is caught there if defined and defaults to 10000 if not.

maximlt · 2024-10-04T12:58:20Z

hvplot/converter.py

+
+        # show at least the x and y columns
+        cols = self.hover_cols.copy()
+        if self.x not in cols:


Is self.x always defined or can it be None at this stage?
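
If `self.x` (or `self.y`) can indeed be None at this point, a guard along these lines would avoid adding None to the hover columns. This is a sketch only: `hover_columns` is a hypothetical standalone function, and appending to the end of the list is an assumption about what the elided code does.

```python
def hover_columns(hover_cols, x=None, y=None):
    """Return a copy of `hover_cols` that also includes x and y when they are set."""
    cols = list(hover_cols)  # copy so the caller's list is not mutated
    for dim in (x, y):
        # Skip dimensions that are undefined or already listed.
        if dim is not None and dim not in cols:
            cols.append(dim)
    return cols


print(hover_columns(["population"], x="lon", y="lat"))
print(hover_columns(["population"], x=None, y="lat"))
```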

maximlt · 2024-10-04T12:59:27Z

hvplot/tests/testoperations.py

@@ -324,6 +319,23 @@ def test_downsample_resample_when(self, kind, eltype):
        assert isinstance(element, eltype)
        assert len(element) == 0

+    @parameterized.expand([(None,), (True,), ('vline',), ('hline',)])


Do you think there'd be a way to avoid calling get_plot? It's pretty expensive isn't it?

maximlt · 2024-10-04T13:00:38Z

hvplot/tests/testoperations.py

@@ -324,6 +319,23 @@ def test_downsample_resample_when(self, kind, eltype):
        assert isinstance(element, eltype)
        assert len(element) == 0

+    @parameterized.expand([(None,), (True,), ('vline',), ('hline',)])


Many more tests will be needed to cover all the branches of the code introduced in the converter. These first two are a good start, but they're very basic and only check that the code doesn't error and that the HoloViews type is the expected one.

ahuang11 · 2024-10-08T16:25:28Z

this is the approach taken we should wait for https://github.com/orgs/holoviz/projects/9 to be addressed properly rather than "hacking" it with a fake layer at the hvplot layer.

To clarify, which tasks in https://github.com/orgs/holoviz/projects/9 should be completed first? Should we ditch this PR?

ahuang11 added 3 commits October 2, 2024 16:42

Support datashade hover

dadec84

add test

ea1d123

use tap if large ds

f849d5d

ahuang11 marked this pull request as ready for review October 3, 2024 00:20

ahuang11 added the type: enhancement New feature or request label Oct 3, 2024

add default count

e947131

ahuang11 requested review from maximlt and hoxbro October 3, 2024 00:30

rm test

c792ae2
