
Add .compute() to null filter (line 887) of extrapolate_water_elevation_to_dry_areas() #144

Merged: 2 commits merged into main on Aug 13, 2024

Conversation

FariborzDaneshvar-NOAA (Collaborator):

The extrapolate_water_elevation_to_dry_areas() function was failing with this KeyError:

KeyError                                  Traceback (most recent call last)
Cell In[44], line 13
     10     training_set_adjusted = training_set.copy(deep=True)
     11 else:
     12     # make an adjusted training set for dry areas..
---> 13     training_set_adjusted = extrapolate_water_elevation_to_dry_areas(
     14         da=training_set,
     15         k_neighbors=k_neighbors,
     16         idw_order=idw_order,
     17         compute_headloss=True,
     18         mann_coef=mann_coef,
     19         u_ref=0.4,
     20         d_ref=1,
     21     )

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/ensembleperturbation/parsing/adcirc.py:890, in extrapolate_water_elevation_to_dry_areas(da, k_neighbors, idw_order, compute_headloss, mann_coef, u_ref, d_ref, min_depth)
    888 tree = KDTree(projected_coordinates[~null])
    889 dd, nn = tree.query(projected_coordinates[null], k=k_neighbors)
--> 890 max_allowable_values = da['depth'][null].values + min_depth - numpy.finfo(float).eps
    891 if k_neighbors == 1:
    892     headloss = dd * friction_factor  # hydraulic friction loss

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/dataarray.py:823, in DataArray.__getitem__(self, key)
    820     return self._getitem_coord(key)
    821 else:
    822     # xarray-style array indexing
--> 823     return self.isel(indexers=self._item_key_to_dict(key))

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/dataarray.py:1413, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1410 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "isel")
   1412 if any(is_fancy_indexer(idx) for idx in indexers.values()):
-> 1413     ds = self._to_temp_dataset()._isel_fancy(
   1414         indexers, drop=drop, missing_dims=missing_dims
   1415     )
   1416     return self._from_temp_dataset(ds)
   1418 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1419 # lists, or zero or one-dimensional np.ndarray's

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/dataset.py:2704, in Dataset._isel_fancy(self, indexers, drop, missing_dims)
   2700 var_indexers = {
   2701     k: v for k, v in valid_indexers.items() if k in var.dims
   2702 }
   2703 if var_indexers:
-> 2704     new_var = var.isel(indexers=var_indexers)
   2705     # drop scalar coordinates
   2706     # https://github.com/pydata/xarray/issues/6554
   2707     if name in self.coords and drop and new_var.ndim == 0:

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/variable.py:1378, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1375 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1377 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1378 return self[key]

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/variable.py:899, in Variable.__getitem__(self, key)
    886 def __getitem__(self: T_Variable, key) -> T_Variable:
    887     """Return a new Variable object whose contents are consistent with
    888     getting the provided key from the underlying data.
    889 
   (...)
    897     array `x.values` directly.
    898     """
--> 899     dims, indexer, new_order = self._broadcast_indexes(key)
    900     data = as_indexable(self._data)[indexer]
    901     if new_order:

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/variable.py:732, in Variable._broadcast_indexes(self, key)
    729 if all(isinstance(k, BASIC_INDEXING_TYPES) for k in key):
    730     return self._broadcast_indexes_basic(key)
--> 732 self._validate_indexers(key)
    733 # Detect it can be mapped as an outer indexer
    734 # If all key is unlabeled, or
    735 # key can be mapped as an OuterIndexer.
    736 if all(not isinstance(k, Variable) for k in key):

File /nhc/Fariborz.Daneshvar/miniconda3/envs/nhc_colab/lib/python3.10/site-packages/xarray/core/variable.py:784, in Variable._validate_indexers(self, key)
    779     raise IndexError(
    780         "{}-dimensional boolean indexing is "
    781         "not supported. ".format(k.ndim)
    782     )
    783 if is_duck_dask_array(k.data):
--> 784     raise KeyError(
    785         "Indexing with a boolean dask array is not allowed. "
    786         "This will result in a dask array of unknown shape. "
    787         "Such arrays are unsupported by Xarray."
    788         "Please compute the indexer first using .compute()"
    789     )
    790 if getattr(k, "dims", (dim,)) != (dim,):
    791     raise IndexError(
    792         "Boolean indexer should be unlabeled or on the "
    793         "same dimension to the indexed array. Indexer is "
   (...)
    796         )
    797     )

KeyError: 'Indexing with a boolean dask array is not allowed. This will result in a dask array of unknown shape. Such arrays are unsupported by Xarray.Please compute the indexer first using .compute()'

So I added .compute() to line 887, and with that change the issue was resolved.
I tested it locally on the same dataset I used for the technical report, and it worked fine.
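In general terms, the fix materializes the boolean mask before using it as an indexer. A minimal sketch of the pattern with stand-in data (this is not the verbatim adcirc.py source; the real code builds `null` from the ADCIRC depth/elevation variables):

```python
import numpy as np
import xarray as xr

# Stand-in for a dataset opened with chunks='auto': the data are dask-backed.
da = xr.DataArray(np.array([0.5, np.nan, 2.0, np.nan]), dims="node").chunk(2)

# Without .compute() the mask is itself a dask array, and da[~null] raises
# KeyError; .compute() turns it into a numpy-backed mask xarray accepts.
null = da.isnull().compute()

wet = da[~null]  # boolean indexing now works on the chunked array
```

Only the small boolean mask is computed eagerly; the indexed data itself stays dask-backed and lazy.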

WPringle (Contributor) left a comment:

Looks reasonable to me; the change matches the error description.

@FariborzDaneshvar-NOAA (Collaborator, Author):

@SorooshMani-NOAA This error only happens when da=training_set has been loaded in chunks:

xarray.open_dataset(subset_filename, chunks='auto')

I realized that previously I was not using chunks='auto', which is why I was not seeing this error:

xarray.open_dataset(subset_filename)  --> extrapolate_water_elevation_to_dry_areas() works fine
xarray.open_dataset(subset_filename, chunks='auto')  --> extrapolate_water_elevation_to_dry_areas() fails and needs .compute() in line 887

@SorooshMani-NOAA (Collaborator):

This error only happens when da=training_set has been loaded in chunks

@FariborzDaneshvar-NOAA Yes, this is because when you open with chunks the backend is a dask array, whereas if you open without them it is a plain numpy array. The error occurs because dask arrays cannot be used as boolean masks, such as the null variable in the code.
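A minimal reproduction of that behavior (using .chunk() as a stand-in for open_dataset(..., chunks='auto'); requires dask to be installed):

```python
import numpy as np
import xarray as xr

# A chunked DataArray is dask-backed, just like a dataset opened with chunks.
da = xr.DataArray(np.array([1.0, np.nan, 3.0]), dims="node").chunk(2)

mask = da.isnull()  # the mask inherits the dask backend from da
try:
    da[mask]  # xarray refuses boolean dask indexers (unknown result shape)
    raised = False
except KeyError:
    raised = True

subset = da[mask.compute()]  # computing the mask first is accepted
```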

@codecov-commenter:

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 20.94%. Comparing base (2cd962e) to head (f66f9c4).

Files Patch % Lines
ensembleperturbation/parsing/adcirc.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #144   +/-   ##
=======================================
  Coverage   20.94%   20.94%           
=======================================
  Files          28       28           
  Lines        3987     3987           
=======================================
  Hits          835      835           
  Misses       3152     3152           


@SorooshMani-NOAA SorooshMani-NOAA merged commit a2e545b into main Aug 13, 2024
10 checks passed
@SorooshMani-NOAA SorooshMani-NOAA deleted the fix_extrapolate_water branch August 13, 2024 15:37