-
-
Notifications
You must be signed in to change notification settings - Fork 404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
histogram op: raise error on non numerical data #5481
Conversation
Codecov Report
@@ Coverage Diff @@
## main #5481 +/- ##
==========================================
- Coverage 88.22% 88.21% -0.01%
==========================================
Files 302 302
Lines 62179 62188 +9
==========================================
+ Hits 54857 54862 +5
- Misses 7322 7326 +4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
@@ -700,6 +700,14 @@ class histogram(Operation): | |||
Used for setting a common style for histograms in a HoloMap or AdjointLayout.""") | |||
|
|||
def _process(self, element, key=None): | |||
|
|||
bad_bins_msg = 'bins only accept numerical data.' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this message could be improved a bit. How about:
'Histogram binning can only be performed on numerical data'
(self.p.bins.dtype.kind in 'SU' or self.bins.dtype.kind == 'O' and not isdatetime(self.bins))): | ||
raise TypeError(bad_bins_msg) | ||
elif isinstance(self.p.bins, (list, tuple)) and any(not is_number(bin) for bin in self.p.bins): | ||
raise TypeError(bad_bins_msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given all these boolean expressions, it would be good to pull them out onto separate lines for clarity.
@@ -716,8 +724,16 @@ def _process(self, element, key=None): | |||
|
|||
if hasattr(element, 'interface'): | |||
data = element.interface.values(element, selected_dim, compute=False) | |||
dtype = element.interface.dtype(element, selected_dim) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is dtype
not easily available with compute=False
?
Made a few comments and overall looks good. Could you add a test to see if the type error is raised when expected? |
I can see that the improved error message requested for categorial data in holoviz/hvplot#906 have been solved in #5540. E.g, this code now gives a more descriptive error message: import plotly.express as px
import hvplot.pandas
df = px.data.tips()
df.hvplot.hist("day", height=400) I can't at a quick glance see if this PR also fixes other issues with categorical data in histograms. |
No answer in a year, so I will close this PR. |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Fixes holoviz/hvplot#906 in hvPlot
The main goal of this PR was to validate the data to be processed by the
histogram
operation, that data must be numerical/datetime-like to make sense. Guarding is tricky in HoloViews with all the data types supported, I hope this is alright (the tests passed locally). While I was at it I've also done the same with for thebins
parameter.