Use contourpy for contour calculations #5910

Merged
merged 15 commits into from
Oct 16, 2023

ianthomas23
Member

This is a WIP to use ContourPy for contour calculations. It was originally intended just to fix #5899, caused by a change within Matplotlib of how contour data is internally stored, but it is easier and better just to use ContourPy directly. This is faster and avoids unnecessary generation and manipulation of extra contour data such as Matplotlib path codes.

Notes:

  1. ContourPy must be available of course, but it is a compulsory dependency of both Bokeh and Matplotlib.
  2. Processing of the returned contour line and filled polygon data is performed separately as the data formats differ.
  3. Matplotlib is still used for conversion of dates to and from numbers. It may be possible to replace these with NumPy or Pandas equivalents.
  4. Matplotlib is also used for auto generation of required contour levels when the levels kwarg is an integer. This currently uses the MaxNLocator class. The required functionality could be directly vendored within HoloViews to completely remove the need for Matplotlib, e.g. if using the Bokeh backend.
  5. There are issues using dates/times for x, y, and/or z arrays. There were no tests of these so have added some. Unfortunately some of these new tests such as test_image_contours_x_datetime pass regardless of the comparison contour in the assertEqual. I suspect that assertEqual silently ignores datetimes.
  6. Datetime z data implies datetime levels too. To fully support this we may not be able to rely on MaxNLocator but might need to also use or implement some of Matplotlib's DateLocator functionality.

except ImportError:
    raise ImportError("contours operation requires matplotlib.")
extent = element.range(0) + element.range(1)[::-1]
Member Author

This is no longer used as I don't think it is required. There is explicit code to set the x and y arrays and pass them through, so the extent kwarg would have been ignored.


offsets = lines[1][0]
if len(offsets) > 2:
    # Casting offsets to int64 to avoid possible numpy UFuncOutputCastingError
Member Author

This is a concern. It isn't needed for test_operation.py but is for one of the KDE tests in test_statsoperations.py.

Member

@hoxbro hoxbro Oct 2, 2023

What was the type of the KDE test?

Member Author

With a clean environment this is now needed to avoid the "UFunc add" error in many tests (test_operation.py as well as test_statsoperations.py). The complaining line is in numpy function_base.py when adding extra values to these uint32 offsets. The values added could in theory be negative (although they are always positive in practice) and hence it doesn't like the unsigned-ness of uint32. int64 is the next largest signed integer, so it seems a sensible choice.
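The cast discussed in this thread can be sketched as follows. This is a minimal example under assumptions: `split_lines_with_nan` is a hypothetical helper name, the sample points are made up, and the PR's actual code around `offsets` is only partly visible in the diff fragment above.

```python
import numpy as np


def split_lines_with_nan(points, offsets):
    """Insert a NaN row between consecutive lines stored in one combined array."""
    # offsets[0] is 0 and offsets[-1] is len(points); only the interior
    # entries mark boundaries between lines. Casting the unsigned offsets
    # to a signed type first means NumPy never has to add signed values
    # into an unsigned array.
    interior = np.asarray(offsets)[1:-1].astype(np.int64)
    if interior.size == 0:
        return points
    return np.insert(points, interior, np.nan, axis=0)


# Two lines in one array: rows 0-2 and rows 3-4 (made-up coordinates).
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
offsets = np.array([0, 3, 5], dtype=np.uint32)
separated = split_lines_with_nan(points, offsets)
```

Here `separated` has six rows, with the NaN separator at row 3 between the two original lines.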

@codecov-commenter

codecov-commenter commented Sep 28, 2023

Codecov Report

Merging #5910 (ba342a5) into main (912d520) will decrease coverage by 0.13%.
Report is 5 commits behind head on main.
The diff coverage is 95.83%.

@@            Coverage Diff             @@
##             main    #5910      +/-   ##
==========================================
- Coverage   88.58%   88.45%   -0.13%     
==========================================
  Files         313      315       +2     
  Lines       65066    65442     +376     
==========================================
+ Hits        57636    57887     +251     
- Misses       7430     7555     +125     
Flag Coverage Δ
ui-tests ?

Flags with carried forward coverage won't be shown.

Files Coverage Δ
holoviews/tests/operation/test_operation.py 98.96% <100.00%> (+0.39%) ⬆️
holoviews/tests/util/test_locator.py 100.00% <100.00%> (ø)
holoviews/operation/element.py 76.89% <92.85%> (+1.30%) ⬆️
holoviews/util/locator.py 91.56% <91.56%> (ø)

... and 68 files with indirect coverage changes


@hoxbro
Member

hoxbro commented Oct 2, 2023

  1. ContourPy must be available of course, but it is a compulsory dependency of both Bokeh and Matplotlib.

Let us add it to extras_require['tests_core'] in setup.py for good measure.

  2. Processing of the returned contour line and filled polygon data is performed separately as the data formats differ.

Ok.

  3. Matplotlib is still used for conversion of dates to and from numbers. It may be possible to replace these with NumPy or Pandas equivalents.

I think it should be safe to convert it to int64. At some point, I think matplotlib used another datetime format than epoch, which is likely why custom handling was needed.

  4. Matplotlib is also used for auto generation of required contour levels when the levels kwarg is an integer. This currently uses the MaxNLocator class. The required functionality could be directly vendored within HoloViews to completely remove the need for Matplotlib, e.g. if using the Bokeh backend.

Looking at the code for MaxNLocator it should be somewhat easy to get the needed functionality directly without having to import matplotlib.

  5. There are issues using dates/times for x, y, and/or z arrays. There were no tests of these so have added some. Unfortunately some of these new tests such as test_image_contours_x_datetime pass regardless of the comparison contour in the assertEqual. I suspect that assertEqual silently ignores datetimes.

I think assertEqual converts it before comparing it. I would suggest using pandas or numpy testing modules directly.

  6. Datetime z data implies datetime levels too. To fully support this we may not be able to rely on MaxNLocator but might need to also use or implement some of Matplotlib's DateLocator functionality.

Wouldn't a conversion of datetime data to int64 solve this? Maybe not perfect, but good enough. We can always add a more complex implementation if there is a demand for it.
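Two of the suggestions above, comparing datetimes with NumPy's testing module and converting datetimes to int64, can be sketched like this. The sample dates are made up, and this is an illustration rather than the tests that were actually added.

```python
import numpy as np

expected = np.array(["2023-01-01", "2023-01-02"], dtype="datetime64[ns]")
same = expected.copy()
different = expected + np.timedelta64(1, "h")

# Passes silently because the arrays are identical.
np.testing.assert_array_equal(same, expected)

# A one-hour shift is reported as a mismatch rather than being ignored.
try:
    np.testing.assert_array_equal(different, expected)
except AssertionError:
    mismatch_detected = True
else:
    mismatch_detected = False

# Converting to int64 gives nanoseconds since the epoch for this dtype,
# and the cast back restores the original datetimes.
as_int = expected.astype(np.int64)
round_trip = as_int.astype("datetime64[ns]")
```

Comparing the arrays directly like this avoids relying on whatever conversion a higher-level `assertEqual` performs before it compares.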

@ianthomas23
Member Author

I've added a minimal MaxNLocator implementation. It will need unit tests.
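For readers unfamiliar with what such a locator does, here is a simplified sketch of choosing "nice" evenly spaced levels. It is neither the implementation added in this PR nor Matplotlib's MaxNLocator algorithm; `nice_levels`, its parameters, and the step multiples are assumptions for illustration only.

```python
import math


def nice_levels(vmin, vmax, nbins=10, steps=(1, 2, 2.5, 5, 10)):
    """Return evenly spaced round-number levels covering [vmin, vmax].

    Aims for roughly ``nbins`` intervals; the count can be slightly higher
    because the range is widened to multiples of the chosen step.
    """
    if vmin > vmax:
        vmin, vmax = vmax, vmin
    if vmin == vmax:
        return [float(vmin)]
    raw_step = (vmax - vmin) / nbins
    # Largest power of ten not exceeding the raw step.
    scale = 10.0 ** math.floor(math.log10(raw_step))
    # Smallest "nice" multiple of that power of ten that is >= raw_step.
    for multiple in steps:
        step = multiple * scale
        if step >= raw_step:
            break
    first = math.floor(vmin / step)
    last = math.ceil(vmax / step)
    return [i * step for i in range(first, last + 1)]


levels = nice_levels(0.0, 10.0, nbins=5)
```

In this example the raw step is 2, which is already a nice value, so the levels run from 0 to 10 in steps of 2.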

@ianthomas23
Member Author

For the conversion of datetimes to numbers, shouldn't it be floats rather than ints to include fractional seconds? For example something like

as_floats = some_datetimes.astype('datetime64[s]').astype(np.float64)
back_to_datetimes = as_floats.astype('datetime64[s]')
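One point worth noting is that casting to `datetime64[s]` first drops any sub-second part. A possible variant that keeps fractional seconds goes through microseconds instead. This is a sketch under assumptions: the helper names are hypothetical, NaT values are not handled, and float64 cannot represent every microsecond for dates far from the epoch.

```python
import numpy as np


def datetimes_to_seconds(values):
    """Convert a datetime64 array to float64 seconds since the Unix epoch."""
    micros = np.asarray(values).astype("datetime64[us]").astype(np.int64)
    return micros / 1e6


def seconds_to_datetimes(seconds):
    """Convert float seconds since the Unix epoch back to datetime64[us]."""
    micros = np.round(np.asarray(seconds, dtype=np.float64) * 1e6).astype(np.int64)
    return micros.astype("datetime64[us]")


stamps = np.array(
    ["2023-10-10T12:00:00.250", "2023-10-10T12:00:01.500"],
    dtype="datetime64[ms]",
)
as_floats = datetimes_to_seconds(stamps)
restored = seconds_to_datetimes(as_floats)
```

The two sample timestamps are 1.25 seconds apart, and that fraction survives the round trip.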

@philippjfr
Member

Really appreciate this @ianthomas23, I've been meaning to do this ever since you first mentioned contourpy.

@ianthomas23
Member Author

ianthomas23 commented Oct 10, 2023

The low-level contour operation now works as expected. I have replaced use of the Matplotlib functions num2date and date2num with our own simplified equivalents so now we only need ContourPy to calculate contours rather than Matplotlib. I have split out the x and y coordinates into separate arrays as that is how they are handled internally in HoloViews and it allows us to use different dtypes for each (e.g. datetime for one of the two). This does not work for holes in polygons (and never has) as the x and y coordinates are stacked together, so now I am detecting such use and raising a RuntimeError so that there is a decent error message presented to the user.

More work is needed though. All the unit tests are passing, but if you try out a filled contour plot in Jupyter using one datetime dimension then it fails as there are other places in the code where x and y coords are stacked for convenience and the different dtypes cause problems. See here where we have a column_stack call:

def multi_polygons_data(element):
    """
    Expands polygon data which contains holes to a bokeh multi_polygons
    representation. Multi-polygons split by nans are expanded and the
    correct list of holes is assigned to each sub-polygon.
    """
    xs, ys = (element.dimension_values(kd, expanded=False) for kd in element.kdims)
    holes = element.holes()
    xsh, ysh = [], []
    for x, y, multi_hole in zip(xs, ys, holes):
        xhs = [[h[:, 0] for h in hole] for hole in multi_hole]
        yhs = [[h[:, 1] for h in hole] for hole in multi_hole]
        array = np.column_stack([x, y])
        splits = np.where(np.isnan(array[:, :2].astype('float')).sum(axis=1))[0]
        arrays = np.split(array, splits+1) if len(splits) else [array]
        multi_xs, multi_ys = [], []
        for i, (path, hx, hy) in enumerate(zip(arrays, xhs, yhs)):
            if i != (len(arrays)-1):
                path = path[:-1]
            multi_xs.append([path[:, 0]]+hx)
            multi_ys.append([path[:, 1]]+hy)
        xsh.append(multi_xs)
        ysh.append(multi_ys)
    return xsh, ysh
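One way the splitting step could avoid stacking x and y is to compute the separator positions from each array on its own and then split the two arrays in parallel. This is only a sketch of the idea, not a change made in this PR: `split_xy_separately` is a hypothetical name, the sample data is made up, and holes (which arrive as stacked `h[:, 0]` / `h[:, 1]` columns) are not addressed.

```python
import numpy as np


def split_xy_separately(x, y):
    """Split NaN/NaT-separated paths without combining x and y into one array."""
    x = np.asarray(x)
    y = np.asarray(y)

    def missing(values):
        # Datetime and timedelta arrays mark gaps with NaT; others use NaN.
        if values.dtype.kind in "mM":
            return np.isnat(values)
        return np.isnan(values.astype(float))

    splits = np.flatnonzero(missing(x) | missing(y))
    if splits.size == 0:
        return [(x, y)]
    x_parts = np.split(x, splits + 1)
    y_parts = np.split(y, splits + 1)
    last = len(x_parts) - 1
    # Every part except the last ends with its separator, so drop it.
    return [
        (xp if i == last else xp[:-1], yp if i == last else yp[:-1])
        for i, (xp, yp) in enumerate(zip(x_parts, y_parts))
    ]


x = np.array(
    ["2023-01-01", "2023-01-02", "NaT", "2023-01-03", "2023-01-04"],
    dtype="datetime64[D]",
)
y = np.array([0.0, 1.0, np.nan, 2.0, 3.0])
parts = split_xy_separately(x, y)
```

Each returned pair keeps the original dtype of its array, so a datetime x and a float y never need a common type.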

@hoxbro hoxbro added this to the 1.18.0 milestone Oct 10, 2023
@ianthomas23
Member Author

This is functionally complete and ready for review. I have kept the use of Matplotlib for conversions between datetimes and floats but the attempted import matplotlib only occurs if such a conversion is required. Datetimes can be used for contour line plots, but using datetimes for x or y coordinates of filled contour plots raises a RuntimeError as it is not supported.
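The deferred import described above could look something like the following sketch. The function name and error message are hypothetical; only `matplotlib.dates.date2num` is a real Matplotlib call, and the PR's actual structure may differ.

```python
import numpy as np


def to_contour_floats(values):
    """Return float64 data, importing Matplotlib only for datetime input."""
    values = np.asarray(values)
    if values.dtype.kind != "M":  # "M" is NumPy's datetime64 kind
        return values.astype(np.float64)
    # Only reached for datetime data, so purely numeric contours
    # never need Matplotlib to be installed.
    try:
        from matplotlib.dates import date2num
    except ImportError:
        raise ImportError("datetime contours require matplotlib") from None
    return date2num(values)


plain = to_contour_floats([1, 2, 3])
```

Numeric input such as the list above is converted without touching Matplotlib at all.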

I have added some extra tests.

Longer term we can consider replacing the use of Matplotlib's datetime to float conversions with our own. I started looking at this but it turned out to be more work than this conversion to ContourPy is and hence should not be attempted as part of this PR.

Example:

import numpy as np
import holoviews as hv
hv.extension('bokeh')

x = np.linspace(0.0, 10.0, 40)
y = np.linspace(0.0, 2*np.pi, 20)
x2, y2 = np.meshgrid(x, y)
z = np.cos(x2 - 0.2*y2)*np.sin(y2)
img = hv.Image((x, y, z))
hv.operation.contours(img, filled=True).opts(cmap="rainbow4", colorbar=True, width=800)
[Screenshot, 2023-10-13: filled contour plot produced by the example above]

Member

@hoxbro hoxbro left a comment

I have left some minor comments. But overall it looks great.

@hoxbro
Member

hoxbro commented Oct 16, 2023

Thank you for modernizing the contouring code to use state-of-the-art algorithms 👍

@hoxbro hoxbro merged commit 3da3d75 into main Oct 16, 2023
10 checks passed
@hoxbro hoxbro deleted the contours_via_contourpy branch October 16, 2023 12:09

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2024
Successfully merging this pull request may close these issues.

Contours not displaying properly with matplotlib 3.8
4 participants