Have "unstack" return a boolean mask? #3518

Hoeze · 2019-11-13T13:54:49Z

MCVE Code Sample

import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(6).reshape(2, 3),
                   coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
arr
stacked = arr.stack(z=('x', 'y'))
stacked[:4].unstack().dtype

Expected Output

>>> arr = xr.DataArray(np.arange(6).reshape(2, 3),
...                  coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
>>> arr
<xarray.DataArray (x: 2, y: 3)>
array([[0, 1, 2],
       [3, 4, 5]])
Coordinates:
  * x        (x) <U1 'a' 'b'
  * y        (y) int64 0 1 2
>>> stacked = arr.stack(z=('x', 'y'))
>>> stacked[:4].unstack().dtype
dtype('float64')

Problem Description

Unstacking promotes the data type to float so that missing entries can be represented as NaN.
Are there thoughts on alternative options, e.g. `fill_value=0` or `return_boolean_mask`, that would retain the original data type?

Currently, I obtain a boolean mask of the missing entries by checking for NaN.
Then I call `fillna(0)` and convert the data type back to integer.
However, this is quite inefficient.
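
For reference, the workaround described above might look roughly like the following. This is a minimal sketch, not the exact code from my project; the variable names `missing` and `restored` are only illustrative, and `isnull()` stands in for the NaN check.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(6).reshape(2, 3),
                   coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
stacked = arr.stack(z=('x', 'y'))

# Unstacking a subset leaves gaps, so the result is promoted to float64
# and the gaps are filled with NaN.
unstacked = stacked[:4].unstack()

# Step 1: boolean mask of the entries that were missing.
missing = unstacked.isnull()

# Step 2: replace NaN with 0 and cast back to the original integer dtype.
restored = unstacked.fillna(0).astype(arr.dtype)
```

This takes three passes over the data (mask, fill, cast) and creates an intermediate float array, which is the inefficiency mentioned above.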

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.10.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.14.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.8.0
sphinx: None

shoyer · 2019-11-13T14:46:58Z

We should definitely have a fill_value option here, and ideally a sparse option, too.

Conceptually unstack is very similar to from_dataframe.
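
A minimal sketch of how such an option could be used, assuming an xarray release in which `unstack` accepts a `fill_value` argument (it is not available in 0.14.0, the version reported above). The second part, which builds a mask by unstacking an all-`True` array, is just one possible way to get the boolean mask asked about in the title; the name `present` is illustrative.

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(6).reshape(2, 3),
                   coords=[('x', ['a', 'b']), ('y', [0, 1, 2])])
stacked = arr.stack(z=('x', 'y'))

# Fill gaps with 0 directly, so no promotion to float is needed.
filled = stacked[:4].unstack(fill_value=0)

# Boolean mask of the positions that actually held data:
# unstack an all-True array and fill the gaps with False.
present = xr.ones_like(stacked[:4], dtype=bool).unstack(fill_value=False)
```

With an integer fill value the result should keep its integer dtype, and `~present` marks the entries that were missing.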

fujiisoup mentioned this issue Nov 16, 2019: Added fill_value for unstack #3541 (merged)

fujiisoup closed this as completed in #3541 Nov 16, 2019

fujiisoup mentioned this issue Nov 16, 2019: sparse option to reindex and unstack #3542 (merged)
