Fail to sel() when index comes from categorical pandas Series #3669

Closed
mancellin opened this issue Jan 8, 2020 · 3 comments · Fixed by #3670

Comments

@mancellin
Contributor

Dear xarray team,

Thank you very much for your work on this useful package.
Here is a bug I just found in my code.

MCVE Code Sample

Creating a Dataset from pandas when the coordinate is a categorical series:

import pandas as pd

ind = pd.Series(['foo', 'bar'], dtype='category')
df = pd.DataFrame({'ind': ind, 'values': [1, 2]})
df = df.set_index('ind')

ds = df.to_xarray()

print(ds.sel(ind='foo'))

Expected Output

When ind is not categorical, the code returns the expected output:

<xarray.Dataset>
Dimensions:  ()
Coordinates:
    ind      <U3 'foo'
Data variables:
    values   int64 1

Problem Description

When ind is categorical, it fails and gives the following traceback:

Traceback (most recent call last):
  File "/home/matthieu/foo.py", line 9, in <module>
    print(ds.sel(ind='foo'))
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/dataset.py", line 2013, in sel
    pos_indexers, new_indexes = remap_label_indexers(
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/coordinates.py", line 391, in remap_label_indexers
    pos_indexers, new_indexes = indexing.remap_label_indexers(
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/indexing.py", line 260, in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
  File "/opt/anaconda3/envs/issue/lib/python3.8/site-packages/xarray/core/indexing.py", line 179, in convert_label_indexer
    indexer = index.get_loc(
TypeError: get_loc() got an unexpected keyword argument 'tolerance'
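
Until a fix is available, one possible workaround is to cast the categorical index to a plain object index before calling to_xarray(), since the non-categorical case behaves as expected. The sketch below is only an illustration: the helper name with_plain_index is hypothetical (it is not part of pandas or xarray), it handles a single-level index only, and it is not the change made in #3670.

```python
import pandas as pd


def with_plain_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return df with a categorical index cast to object dtype.

    Hypothetical helper for illustration; frames whose index is not
    a CategoricalIndex are returned unchanged.
    """
    if isinstance(df.index, pd.CategoricalIndex):
        df = df.copy()
        # astype(object) drops the categorical dtype but keeps the index name
        df.index = df.index.astype(object)
    return df


# Same data as in the MCVE above
ind = pd.Series(['foo', 'bar'], dtype='category')
df = pd.DataFrame({'ind': ind, 'values': [1, 2]}).set_index('ind')

plain = with_plain_index(df)
```

With the index converted, with_plain_index(df).to_xarray().sel(ind='foo') should follow the same path as the non-categorical case. The trade-off is that the categorical dtype of the coordinate is lost.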

Output of xr.show_versions()

commit: None
python: 3.8.0 (default, Nov 6 2019, 21:49:08) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.19.91-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.4
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: None
sphinx: None

@fujiisoup
Member

Thanks, @mancellin

I sent a quick fix.
Please feel free to comment there.

@mancellin
Contributor Author

Your patch fixes the issue, thank you!

FYI, found another issue with categorical values: #3674

@fujiisoup
Member

Let's close this after #3670 is merged.
