-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Periodic Boundary Index #7031
Comments
shouldn't you implement (and call) |
I was thinking I want it to work periodically even in integer index space too - surely I need isel for that? I might be misunderstanding something fundamental here though.
|
tl;dr: Xarray There's a big difference now between
Most coordinate and data variables are still sliced via If we want to support periodic indexing with @TomNicholas your experiment makes it clear that the documentation on this part (#6975) should be improved. Thanks! |
Thank you for the clear explanation @benbovy! @keewis you were right, as
ever, but now I understand why!)
There's a big difference now between isel and sel:
So `Dataset.sel` calls `Index.sel`, but can also sometimes call
`Dataset.isel`. But `Dataset.isel` does not call `Index.isel`, nor
`Index.sel`. Yes that is definitely confusing! My understanding from
reading the docs was that every `Dataset.meth` calls the corresponding
`Index.meth`.
I'm wondering if we shouldn't deprecate that?
The situation would be a bit clearer if we did deprecate that, but I
imagine that would come with a significant backwards compatibility cost.
I think if possible each method on both Dataset and Index should have a
note in the docstring about which methods it calls or is called by, as well
as major gotchas pointed out in the custom indexes docs version via
warnings/notes.
If we want to support periodic indexing with isel, we would have to
implement that in Xarray itself.
This would be cool, but doesn't really seem worth it, at least for an
initial experiment.
Presumably I can still make a periodic index that works when doing
label-based indexing? If my Dataset had a non-dimension coordinate `"lat"`
backed by a periodic index that was a function of a dimension `"i"`, then
it would be a bit clearer why indexing with `lat` would be periodic but
indexing with `i` would not be.
…On Wed, 14 Sep 2022, 04:08 Benoit Bovy, ***@***.***> wrote:
tl;dr: Xarray Index currently supports implementing periodic indexing for
label-based indexing but not for location-based (integer) indexing.
There's a big difference now between isel and sel:
- Dataset.isel() accepts dimension names only
- Dataset.sel() accepts coordinate names (actually, it falls back to
isel when giving dimension names with no coordinate, and I'm wondering
if we shouldn't deprecate that?)
Index.isel() is convenient when the underlying index structure can be
itself sliced (like pandas.Index objects), so that users don't need to do
ds.isel(...).set_xindex(...) every time to explicitly rebuild an index
after slicing the Dataset. For a kd-tree structure that may not be
possible, i.e., KDTreeIndex.isel() would likely return None causing the
index to be dropped in the result, so there would be no way around doing
ds.isel(...).set_xindex(...).
Most coordinate and data variables are still sliced via Variable.isel(),
which doesn't involve any index. That's why you get an IndexError in your
example. (side note: the "index" / "indexing" terminology used everywhere,
for both label and integer selection, is quite confusing but I'm not sure
how this could be improved).
If we want to support periodic indexing with isel, we would have to
implement that in Xarray itself. Alternatively, it *might* be possible to
add some API in Index so that in the case of a periodic index it would
return indxr % length from indxr, which Xarray will then pass to
Variable.isel(). I'm not sure the latter is a good idea, though. Indexes
may work with arbitrary coordinates and dimensions, which would make things
too complex (handling conflicts, etc.). Also, I don't know if there's other
potential use cases besides periodic indexing?
@TomNicholas
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_TomNicholas&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=6Eurln2plZyGChJ5a5erUIWLdRQYncs9JCJEfdBqaTLUzpZQwj4Om1ykGqgvhJCV&s=qhcbnfbmpOcf9UZVZE7cvJgT6_0ywxc0ZQSKupKZHO8&e=>
your experiment makes it clear that the documentation on this part (#6975
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pydata_xarray_pull_6975&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=6Eurln2plZyGChJ5a5erUIWLdRQYncs9JCJEfdBqaTLUzpZQwj4Om1ykGqgvhJCV&s=fqKipQbRyoJT6rZ_FvmY1pMv9dZTMo3l8q_ilmk7wbQ&e=>)
should be improved. Thanks!
—
Reply to this email directly, view it on GitHub
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pydata_xarray_issues_7031-23issuecomment-2D1246398771&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=6Eurln2plZyGChJ5a5erUIWLdRQYncs9JCJEfdBqaTLUzpZQwj4Om1ykGqgvhJCV&s=vjffKcfZjIy1AQDf7hPZOGK3iAtaYRnWwUBfJudWTKk&e=>,
or unsubscribe
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AISNPI4MNH3YETGDVW6UKW3V6GBYNANCNFSM6AAAAAAQL2WYZI&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=6Eurln2plZyGChJ5a5erUIWLdRQYncs9JCJEfdBqaTLUzpZQwj4Om1ykGqgvhJCV&s=IDHydhJJ2sbP2i6aeTeNqjGF0fkXx9UGj3GV-4zVQTM&e=>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yes that's indeed what I've written in #6975 and I realize now that this is confusing, especially for
So we can describe the implementation of
This omits a few implementation details (special cases for multi-index), but that's basically how it works. I think it would help if such "how label-based selection works in Xarray" high-level description was added somewhere in the "Xarray internals" documentation, along with other "how it works" sections for, e.g., alignment. |
That's very helpful, and should definitely go in the docs somewhere!
…On Wed, 14 Sep 2022, 11:31 Benoit Bovy, ***@***.***> wrote:
My understanding from reading the docs was that every Dataset.meth calls
the corresponding Index.meth.
Yes that's indeed what I've written in #6975
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pydata_xarray_pull_6975&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=p6AEVnna74Yl7ERqNIUh2N0qdwITIr0EYT0apJ7Hbgk0KdQNUOudgu8tMxoLiNLW&s=aId1i1PHZl_Co87NrwWTHiu6metAHOBNNVBaBQt1LAo&e=>
and I realize now that this is confusing, especially for isel.
So Dataset.sel calls Index.sel, but can also sometimes call Dataset.isel.
But Dataset.isel does not call Index.isel, nor Index.sel.
So we can describe the implementation of Dataset.sel() has a two-step
procedure:
1.
remap the input dictionary {coord_name: label_values} to a dictionary {dimension_name:
int_positions}.
- This is done via calls to Index.sel() for each unique index found in
Dataset.xindexes
2.
pass the the dictionary {dimension_name: int_positions} to
Dataset.isel().
- Dataset.isel() will itself call Variable.isel() for each variable in
Dataset.variables and Index.isel() for each unique index in
Dataset.xindexes
This omits a few implementation details (special cases for multi-index),
but that's basically how it works.
I think it would help if such "how label-based selection works in Xarray"
section was added in the "Xarray internals" documentation, along with other
"how it works" section for, e.g., alignment.
—
Reply to this email directly, view it on GitHub
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_pydata_xarray_issues_7031-23issuecomment-2D1246944137&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=p6AEVnna74Yl7ERqNIUh2N0qdwITIr0EYT0apJ7Hbgk0KdQNUOudgu8tMxoLiNLW&s=Ks6PoxVcaVr7k89uOVteNHQ4LJ6gUn1ONjAA_UVC3PU&e=>,
or unsubscribe
<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_AISNPIYBB6EDBNOL3ORUFKLV6HVT3ANCNFSM6AAAAAAQL2WYZI&d=DwMCaQ&c=009klHSCxuh5AI1vNQzSO0KGjl4nbi2Q0M1QLJX9BeE&r=qdISi9HqjazmE0DcySuXts3OlnplnLfKjH4hpzAV0xo&m=p6AEVnna74Yl7ERqNIUh2N0qdwITIr0EYT0apJ7Hbgk0KdQNUOudgu8tMxoLiNLW&s=ojUYzNS_we7kqIG-mT0SDi_iY4nS4fri483cGbjgHwk&e=>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I had another go and now I have this (the from xarray.core.indexes import (
PandasIndex, is_scalar, as_scalar, get_indexer_nd, IndexSelResult,
_query_slice, is_dict_like, normalize_label,
)
class PeriodicBoundaryIndex(PandasIndex):
"""
An index representing any 1D periodic numberline.
Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
"""
period: float
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.period = 360 # TODO work out where this input should be passed in instead of hard-coding
def _wrap_periodically(self, label_value):
return self.index.min() + (label_value - self.index.max()) % self.period
def sel(
self, labels: dict[Any, Any], method=None, tolerance=None
) -> IndexSelResult:
"""Remaps labels outside of the indexes' range back to integer indices inside the range."""
print("sel called")
from xarray import DataArray
from xarray import Variable
if method is not None and not isinstance(method, str):
raise TypeError("``method`` must be a string")
assert len(labels) == 1
coord_name, label = next(iter(labels.items()))
if isinstance(label, slice):
print(label)
indexer = _query_slice(self.index, label, coord_name, method, tolerance)
print(indexer)
elif is_dict_like(label):
raise ValueError(
"cannot use a dict-like object for selection on "
"a dimension that does not have a MultiIndex"
)
else:
label_array = normalize_label(label, dtype=self.coord_dtype)
if label_array.ndim == 0:
label_array = self._wrap_periodically(label_array)
label_value = as_scalar(label_array)
if isinstance(self.index, pd.CategoricalIndex):
if method is not None:
raise ValueError(
"'method' is not supported when indexing using a CategoricalIndex."
)
if tolerance is not None:
raise ValueError(
"'tolerance' is not supported when indexing using a CategoricalIndex."
)
indexer = self.index.get_loc(label_value)
else:
if method is not None:
print(label_array)
indexer = get_indexer_nd(
self.index, label_array, method, tolerance
)
if np.any(indexer < 0):
raise KeyError(
f"not all values found in index {coord_name!r}"
)
else:
try:
print(label_value)
indexer = self.index.get_loc(label_value)
except KeyError as e:
raise KeyError(
f"not all values found in index {coord_name!r}. "
"Try setting the `method` keyword argument (example: method='nearest')."
) from e
elif label_array.dtype.kind == "b":
indexer = label_array
else:
indexer = get_indexer_nd(self.index, label_array, method, tolerance)
if np.any(indexer < 0):
raise KeyError(f"not all values found in index {coord_name!r}")
# attach dimension names and/or coordinates to positional indexer
if isinstance(label, Variable):
indexer = Variable(label.dims, indexer)
elif isinstance(label, DataArray):
indexer = DataArray(indexer, coords=label._coords, dims=label.dims)
return IndexSelResult({self.dim: indexer})
def isel(
self, indexers: Mapping[Any, Union[int, slice, np.ndarray, Variable]]
) -> Union["PeriodicBoundaryIndex", None]:
print("isel called")
return super().isel(indexers=indexers) This works for integer indexing with sel! lon_coord = xr.DataArray(data=np.linspace(-180, 180, 19), dims="lon")
da = xr.DataArray(data=np.random.randn(19), dims="lon", coords={"lon": lon_coord})
world = da.drop_indexes("lon").set_xindex("lon", index_cls=PeriodicBoundaryIndex)
world.sel(lon=200, method="nearest")
Yay! 🍾 Q: Best way to do this for slicing? I want this to work world.sel(lon=slice(170, 190)) Internally that means So I guess I have to turn my slice into a list of specific integer positions and pass that to I guess I also want to reorder the result before returning it, otherwise the two sides of the dateline won't be stitched together in the right order... Q: Where should I pass in If I want the period of the Q: How to avoid just copying all of I find myself copying the entire implementation of Also pointing me to looking at the implementation of I wonder if these problems could be ameliorated by providing public entry methods in the # library code
class NodeMixin:
"""Inherit from this to create your own TreeNode class with parent and children"""
def _pre_detach_children(self, children):
"""Method call before detaching `children`."""
pass
def _post_detach_children(self, children):
"""Method call after detaching `children`."""
pass
def _pre_attach_children(self, children):
"""Method call before attaching `children`."""
pass
def _post_attach_children(self, children):
"""Method call after attaching `children`."""
pass
# user code
class MyHappyTreeNode(NodeMixin):
def _pre_attach_children(self, children):
"""Celebrates the gift of children"""
print("A child is born!") What if we put similar methods on the class PandasIndex:
def _post_process_label_value(self, label_value: float) -> float:
"""Method call after determining scalar label value."""
return label_value
def sel(
self, labels: dict[Any, Any], method=None, tolerance=None
) -> IndexSelResult:
# rest of the function as before
...
if isinstance(label, slice):
indexer = _query_slice(self.index, label, coord_name, method, tolerance)
elif is_dict_like(label):
raise ValueError(
"cannot use a dict-like object for selection on "
"a dimension that does not have a MultiIndex"
)
else:
label_array = normalize_label(label, dtype=self.coord_dtype)
if label_array.ndim == 0:
label_value = as_scalar(label_array)
label_value = self._post_process_label_value(label_value) # new bit
...
# rest of the function as before Then in my case I would not have had to copy so much of the implementation, I could have simply done class PeriodicBoundaryIndex(PandasIndex):
period: float
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.period = 360 # TODO work out where this input should be passed in instead of hard-coding
def _wrap_periodically(self, label_value):
return self.index.min() + (label_value - self.index.max()) % self.period
def _post_process_label_value(self, label_value):
return self._wrap_periodically(label_value) Maybe this is a bad idea / doesn't make sense but I thought I would suggest it anyway. |
Great @TomNicholas! To avoid copying the body of class PeriodicBoundaryIndex(PandasIndex):
"""
An index representing any 1D periodic numberline.
Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
"""
period: float
def __init__(self, *args, period=360, **kwargs):
super().__init__(*args, **kwargs)
self.period = period
@classmethod
def from_variables(self, variables, options):
obj = super().from_variables(variables, options={})
obj.period = options.get("period", obj.period)
return obj
def _wrap_periodically(self, label_value):
return self.index.min() + (label_value - self.index.max()) % self.period
def sel(
self, labels: dict[Any, Any], method=None, tolerance=None
) -> IndexSelResult:
"""Remaps labels outside of the indexes' range back to integer indices inside the range."""
assert len(labels) == 1
coord_name, label = next(iter(labels.items()))
if isinstance(label, slice):
wrapped_label = slice(
self._wrap_periodically(label.start),
self._wrap_periodically(label.stop),
)
else:
wrapped_label = self._wrap_periodically(label)
return super().sel({coord_name: wrapped_label}) Note: I also added It should work in most cases I think: lon_coord = xr.DataArray(data=np.linspace(-180, 180, 19), dims="lon")
da = xr.DataArray(data=np.random.randn(19), dims="lon", coords={"lon": lon_coord})
# note the period set here
world = da.drop_indexes("lon").set_xindex("lon", index_cls=PeriodicBoundaryIndex, period=360) world.sel(lon=200, method="nearest")
# <xarray.DataArray ()>
# array(-0.86583185)
# Coordinates:
# lon float64 -160.0
world.sel(lon=[200, 200], method="nearest")
# <xarray.DataArray (lon: 2)>
# array([-0.86583185, -0.86583185])
# Coordinates:
# * lon (lon) float64 -160.0 -160.0
world.sel(lon=slice(180, 200), method="nearest")
# <xarray.DataArray (lon: 2)>
# array([-1.59829997, -0.86583185])
# Coordinates:
# * lon (lon) float64 -180.0 -160.0 There's likely more things to do for slices as you point out. I don't think either that it's possible to pass two slices to If we really need more flexibility in Or like you suggest we could define some |
Nice @benbovy ! That seems useable already - I'll open a PR to work on it more. It's also much neater 😄
Great, I was hoping there was some functionality like that.
That would be a shame IMO, so I'll have more of think about how to handle slicing across the dateline.
All good points. The other thing I will think about is whether anything special needs to happen for 2D+ periodicity. I suspect that for integers you could just use independent 1D indexes along each dim but for slicing across the "dateline" it might get messy in higher dimensions... |
In general, it seems like most (nearly all?) 1D indexing use cases can be handled by encapsulating a |
One could split it into two calls to isel and concatenate the result. Not sure if that's possible with the given interface. |
Okay I think this design could work for slicing across boundaries: from xarray.core.indexes import PandasIndex, IndexSelResult, _query_slice
from xarray.core.indexing import _expand_slice
class PeriodicBoundaryIndex(PandasIndex):
"""
An index representing any 1D periodic numberline.
Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
"""
period: float
_min: float
_max: float
__slots__ = ("index", "dim", "coord_dtype", "period", "_max", "_min")
def __init__(self, *args, period=360, **kwargs):
super().__init__(*args, **kwargs)
self.period = period
self._min = self.index.min()
self._max = self.index.max()
@classmethod
def from_variables(self, variables, options):
obj = super().from_variables(variables, options={})
obj.period = options.get("period", obj.period)
return obj
def _wrap_periodically(self, label_value: float) -> float:
"""Remaps an individual point label back to another inside the range."""
return self._min + (label_value - self._max) % self.period
def _split_slice_across_boundary(self, label: slice) -> np.ndarray:
"""
Splits a slice into two slices, one either side of the boundary,
finds the corresponding indices, concatenates them,
and returns them ready to be passed to .isel().
"""
first_slice = slice(label.start, self._max, label.step)
second_slice = slice(self._min, self._wrap_periodically(label.stop), label.step)
first_as_index_slice = _query_slice(self.index, first_slice)
second_as_index_slice = _query_slice(self.index, second_slice)
first_as_indices = _expand_slice(first_as_index_slice, self.index.size)
second_as_indices = _expand_slice(second_as_index_slice, self.index.size)
wrapped_indices = np.concatenate([first_as_indices, second_as_indices])
return wrapped_indices
def sel(
self, labels: dict[Any, Any], method=None, tolerance=None
) -> IndexSelResult:
"""Remaps labels outside of the indexes' range back to integer indices inside the range."""
assert len(labels) == 1
coord_name, label = next(iter(labels.items()))
if isinstance(label, slice):
# TODO enumerate all the possible cases
if self._min < label.start < self._max and self._min < label.stop < self._max:
# simple case of slice not crossing boundary
wrapped_label = slice(
self._wrap_periodically(label.start),
self._wrap_periodically(label.stop),
)
return super().sel({coord_name: wrapped_label})
elif self._min < label.start < self._max and label.start < self._max < label.stop:
# nasty case of slice crossing boundary
wrapped_indices = self._split_slice_across_boundary(label)
return IndexSelResult({self.dim: wrapped_indices})
else:
# TODO there are many other cases to handle...
raise NotImplementedError()
else:
# just a scalar
wrapped_label = self._wrap_periodically(label)
return super().sel({coord_name: wrapped_label}, method=method, tolerance=tolerance)
def __repr__(self) -> str:
return f"PeriodicBoundaryIndex(period={self.period})" world.sel(lon=slice(60, 120), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.71424378, -0.87270922, -0.9701637 , -0.99979417])
# Coordinates:
# * lon (lon) float64 60.0 80.0 100.0 120.0 This works even for slices that cross the dateline world.sel(lon=slice(160, 210), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.85218366, -0.68526211, 0.68526211, 0.85218366])
# Coordinates:
# * lon (lon) float64 160.0 180.0 -180.0 -160.0 This isn't general yet, there are lots of edge cases this would fail on, but I think it shows that as long as each case is captured we always could use this approach to remap back to index values that do lie within the range? What do people think? EDIT:
I believe what I've done here is the closest thing to that that is possible with the given interface. |
I think this version does something sensible for all slice cases from xarray.core.indexes import (
PandasIndex, IndexSelResult, _query_slice
)
from xarray.core.indexing import _expand_slice
class PeriodicBoundaryIndex(PandasIndex):
"""
An index representing any 1D periodic numberline.
Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
"""
period: float
_min: float
_max: float
__slots__ = ("index", "dim", "coord_dtype", "period", "_max", "_min")
def __init__(self, *args, period=360, **kwargs):
super().__init__(*args, **kwargs)
self.period = period
self._min = self.index.min()
self._max = self.index.max()
@classmethod
def from_variables(self, variables, options):
obj = super().from_variables(variables, options={})
obj.period = options.get("period", obj.period)
return obj
def _wrap_periodically(self, label_value: float) -> float:
return self._min + (label_value - self._max) % self.period
def _split_slice_across_boundary(self, label: slice) -> np.ndarray:
"""
Splits a slice into two slices, one either side of the boundary,
finds the corresponding indices, concatenates them, and returns them,
ready to be passed to .isel().
"""
first_slice = slice(label.start, self._max, label.step)
second_slice = slice(self._min, label.stop, label.step)
first_as_index_slice = _query_slice(self.index, first_slice)
second_as_index_slice = _query_slice(self.index, second_slice)
first_as_indices = _expand_slice(first_as_index_slice, self.index.size)
second_as_indices = _expand_slice(second_as_index_slice, self.index.size)
wrapped_indices = np.concatenate([first_as_indices, second_as_indices])
return wrapped_indices
def sel(
self, labels: dict[Any, Any], method=None, tolerance=None
) -> IndexSelResult:
"""Remaps labels outside of the indexes' range back to integer indices inside the range."""
assert len(labels) == 1
coord_name, label = next(iter(labels.items()))
if isinstance(label, slice):
start, stop, step = label.start, label.stop, label.step
if stop < start:
return super().sel({coord_name: []})
assert self._min < self._max
wrapped_start = self._wrap_periodically(label.start)
wrapped_stop = self._wrap_periodically(label.stop)
wrapped_label = slice(wrapped_start, wrapped_stop, step)
if wrapped_start < wrapped_stop:
# simple case of slice not crossing boundary
return super().sel({coord_name: wrapped_label})
else: # wrapped_stop < wrapped_start:
# nasty case of slice crossing boundary
wrapped_indices = self._split_slice_across_boundary(wrapped_label)
return IndexSelResult({self.dim: wrapped_indices})
else:
# just a scalar / array of scalars
wrapped_label = self._wrap_periodically(label)
return super().sel({coord_name: wrapped_label}, method=method, tolerance=tolerance)
def __repr__(self) -> str:
return f"PeriodicBoundaryIndex(period={self.period})" lon_coord = xr.DataArray(data=np.linspace(-180, 180, 19), dims="lon")
da = xr.DataArray(data=np.sin(180*lon_coord), dims="lon", coords={"lon": lon_coord})
world = da.drop_indexes("lon").set_xindex("lon", index_cls=PeriodicBoundaryIndex, period=360) world.sel(lon=slice(60, 120), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.71424378, -0.87270922, -0.9701637 , -0.99979417])
# Coordinates:
# * lon (lon) float64 60.0 80.0 100.0 120.0 world.sel(lon=slice(160, 210), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.85218366, -0.68526211, 0.68526211, 0.85218366])
# Coordinates:
# * lon (lon) float64 160.0 180.0 -180.0 -160.0 world.sel(lon=slice(-210, -160), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.85218366, -0.68526211, 0.68526211, 0.85218366])
# Coordinates:
# * lon (lon) float64 160.0 180.0 -180.0 -160.0 Unsure as to whether this next one counts as an "intuitive" result or not world.sel(lon=slice(-210, 210), method="nearest")
# <xarray.DataArray (lon: 4)>
# array([-0.85218366, -0.68526211, 0.68526211, 0.85218366])
# Coordinates:
# * lon (lon) float64 160.0 180.0 -180.0 -160.0 world.sel(lon=slice(120, 60), method="nearest")
# <xarray.DataArray (lon: 0)>
# array([], dtype=float64)
# Coordinates:
# * lon (lon) float64 |
I've created a I've extracted the boilerplate from @dcherian's
Yeah I guess it will work well with independent For multi-dimension coordinates with periodic boundaries this would probably be best handled by more specific indexes, e.g., xoak's s2point index that supports periodicity for lat/lon data (I think). |
What is your issue?
I would like to create a
PeriodicBoundaryIndex
using the Explicit Indexes refactor. I want to do it first in 1D, then 2D, then maybe ND.I'm thinking this would be useful for:
Eventually I think perhaps this index should live in xarray itself? As it's domain-agnostic, doesn't introduce extra dependencies, and could be a conceptually simple example of a custom index.
I had a first go, using the
benbovy:add-set-xindex-and-drop-indexes
branch, and reading the in-progress docs page. I got a bit stuck early on though.@benbovy here's what I have so far:
Now selecting a value with isel inside the range works fine, giving the same result same as without my custom index. (The length of the example dataset along
lon
is53
.)But indexing with a
lon
value outside the range of the index data gives anIndexError
, seemingly without consulting my new index object. It didn't even print"isel called"
😕 What should I have implemented that I didn't implement?The text was updated successfully, but these errors were encountered: