Transpose some but not all dimensions #1081

Closed
Yefee opened this issue Nov 4, 2016 · 17 comments

Comments

@Yefee

Yefee commented Nov 4, 2016

Hi, all

Sorry to bother you. This may be a basic question, but I cannot figure it out at the moment.

I want to swap dims in xarray, like swapaxes in numpy. I found that both DataArray and Dataset have a swap_dims method, but I don't understand its argument, which the docstring describes as:

dims_dict : dict-like
    Dictionary whose keys are current dimension names and whose values are new names. Each value must already be a coordinate on this array.

Here is my example:

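import numpy as np
import xarray as xr

data = np.random.rand(4, 3)
lon = [1, 2, 3]
lat = [4, 3, 2, 1]
# Without dims, xarray assigns default dimension names (dim_0, dim_1)
foo = xr.DataArray(data, coords=[lat, lon])
foo
# With explicit dimension names
foo = xr.DataArray(data, coords=[lat, lon], dims=['lat', 'lon'])
foo
foo.swap_dims({'lat': 'lon'})  # raises the ValueError shown below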

The error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-47-c8aa4311b27e> in <module>()
----> 1 foo.swap_dims({'lat':'lon'})

/glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataarray.pyc in swap_dims(self, dims_dict)
    794         Dataset.swap_dims
    795         """
--> 796         ds = self._to_temp_dataset().swap_dims(dims_dict)
    797         return self._from_temp_dataset(ds)
    798

/glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataset.pyc in swap_dims(self, dims_dict, inplace)
   1293                 raise ValueError('replacement dimension %r is not a 1D '
   1294                                  'variable along the old dimension %r'
-> 1295                                  % (v, k))
   1296
   1297         result_dims = set(dims_dict.get(dim, dim) for dim in self.dims)

ValueError: replacement dimension 'lon' is not a 1D variable along the old dimension 'lat'

Sorry to bother.

@shoyer
Member

shoyer commented Nov 4, 2016

swap_dims does something very different from swapaxes in NumPy (we should add an example to make this clear).

For what you want, I think transpose is a closer fit, e.g., foo.transpose('lon', 'lat')
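To make the difference concrete, here is a minimal sketch using the 2D array from the question. The `row` coordinate is a hypothetical addition for illustration only: `swap_dims` needs its replacement to be a 1D coordinate along the old dimension, which is why swapping `lat` for `lon` fails.

```python
import numpy as np
import xarray as xr

foo = xr.DataArray(
    np.random.rand(4, 3),
    coords=[[4, 3, 2, 1], [1, 2, 3]],
    dims=["lat", "lon"],
)

# transpose reorders the axes, which is what numpy.swapaxes does.
swapped_axes = foo.transpose("lon", "lat")
print(swapped_axes.dims)   # ('lon', 'lat')
print(swapped_axes.shape)  # (3, 4)

# swap_dims relabels a dimension with another coordinate that lies along it.
# "row" is a hypothetical extra coordinate defined along "lat".
with_row = foo.assign_coords(row=("lat", [0, 1, 2, 3]))
relabelled = with_row.swap_dims({"lat": "row"})
print(relabelled.dims)     # ('row', 'lon')
print(relabelled.shape)    # (4, 3); the axis order is unchanged
```

In short, `transpose` moves axes around, while `swap_dims` keeps the axes in place and changes which coordinate names a dimension.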

@Yefee
Author

Yefee commented Nov 4, 2016

Thanks, @shoyer

I agree your method works for a 2D array, but it fails for 3D or 4D arrays.

@shoyer
Member

shoyer commented Nov 4, 2016

You need to provide the full list of dimensions to transpose.
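As a small illustration of listing every dimension, the sketch below swaps two axes of a 3D array. The array `bar`, its dimension names, and its sizes are assumptions made up for the example.

```python
import numpy as np
import xarray as xr

# Hypothetical 3D array; names and sizes are chosen for illustration.
bar = xr.DataArray(np.zeros((2, 4, 3)), dims=["time", "lat", "lon"])

# Swap "lat" and "lon" by naming all three dimensions in the desired order.
result = bar.transpose("time", "lon", "lat")
print(result.dims)   # ('time', 'lon', 'lat')
print(result.shape)  # (2, 3, 4)
```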

We could add a method like NumPy's swapaxes; it's just not clear what to name it.

@Yefee
Author

Yefee commented Nov 4, 2016

Thanks! I will check it.

@rabernat
Contributor

rabernat commented Nov 4, 2016

I have hit this issue before too.

> We could add a method like NumPy's swapaxes; it's just not clear what to name it.

reorder_dims?

@shoyer
Member

shoyer commented Nov 4, 2016

> reorder_dims?

Would that be consistent with reorder_levels for MultiIndex (#1028)? I'm not sure if that handles partial order specifications or not.

@rafa-guedes
Contributor

rafa-guedes commented Aug 21, 2017

I have also hit this issue, so this method could be useful. I'm putting my workaround below in case it is helpful:

def reorder_dims(darray, dim1, dim2):
    """
    Interchange two dimensions of a DataArray, similar to numpy's swapaxes.
    """
    dims = list(darray.dims)
    # Both names must be existing dimensions of darray
    assert {dim1, dim2}.issubset(dims), 'dim1 and dim2 must be existing dimensions in darray'
    ind1, ind2 = dims.index(dim1), dims.index(dim2)
    # Swap the two names, then transpose to the new full ordering
    dims[ind2], dims[ind1] = dims[ind1], dims[ind2]
    return darray.transpose(*dims)

@shoyer
Member

shoyer commented Nov 8, 2017

What about allowing .transpose() to handle a subset of array/dataset dimensions? In NumPy, this may not be desirable because it's easy to mix up integer dimensions, but in xarray ds.transpose('lat', 'lon') seems pretty unambiguous.

The implementation would simply reorder all the listed dimensions, keeping other dimensions in their original order.
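One possible reading of this proposal can be sketched in plain Python, working on dimension names only. The helper name `reorder_listed` is hypothetical, and the rule it applies (listed dimensions are permuted among the slots they already occupy) is an assumption about what "keeping other dimensions in their original order" means, not a description of xarray's behavior.

```python
def reorder_listed(dims, listed):
    """Return dims with the listed names permuted among their own slots."""
    missing = set(listed) - set(dims)
    if missing:
        raise ValueError(f"unknown dimensions: {sorted(missing)}")
    replacement = iter(listed)
    # Slots held by a listed dimension are refilled in the requested order;
    # every other slot keeps its original dimension.
    return tuple(next(replacement) if d in listed else d for d in dims)


print(reorder_listed(("time", "lat", "lon"), ("lon", "lat")))
# ('time', 'lon', 'lat')
print(reorder_listed(("lat", "lon"), ("lon", "lat")))
# ('lon', 'lat')
```

Under this reading, the 2D case reduces to an ordinary full transpose, while in higher dimensions the unlisted axes stay where they are.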

@shoyer changed the title from "How to swap dims?" to "Transpose some but not all dimensions" on Nov 8, 2017

@max-sixty
Collaborator

> ds.transpose('lat', 'lon') seems pretty unambiguous.

Though I think that would have radically different behavior for a 2-dim or 3-dim case. For the 2-dim case, it would enforce that order regardless of original order. For the 3-dim case, are you proposing they're swapped from their current order?

(Maybe transpose naturally refers to the behavior I think you describe, we'd need something else to 'set this order')

@shoyer
Member

shoyer commented Nov 8, 2017

transpose('x', 'y') already means "ensure this object has dimensions in the order (x, y)":

In [2]: a = xarray.DataArray([[0]], dims=['x', 'y'])

In [3]: a.T
Out[3]:
<xarray.DataArray (y: 1, x: 1)>
array([[0]])
Dimensions without coordinates: y, x

In [4]: a.T.transpose('x', 'y')
Out[4]:
<xarray.DataArray (x: 1, y: 1)>
array([[0]])
Dimensions without coordinates: x, y

In [5]: a.transpose('x', 'y')
Out[5]:
<xarray.DataArray (x: 1, y: 1)>
array([[0]])
Dimensions without coordinates: x, y

@stale

stale bot commented Oct 9, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

@stale stale bot added the stale label Oct 9, 2019
@shoyer shoyer removed the stale label Oct 9, 2019
@crusaderky
Contributor

crusaderky commented Oct 10, 2019

From personal experience I find that 99% of the time, I want to push some known dimensions either to the front or to the back of the array while I don't care about the order of the others.
I'd love to have this syntax:

transpose(..., "x", "y")

or

transpose("x", "y", ...)

where the ellipsis expands to all dimensions not explicitly listed, in their original order. There can be at most one ellipsis.
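The expansion rule described here can be sketched on dimension names alone. The helper `expand_ellipsis` is a hypothetical name used for illustration; it is not xarray's implementation, although recent xarray releases do accept `...` in `transpose`.

```python
def expand_ellipsis(requested, all_dims):
    """Replace a single ``...`` in requested with the unlisted dims, in order."""
    requested = tuple(requested)
    if requested.count(...) > 1:
        raise ValueError("at most one ellipsis is allowed")
    named = [d for d in requested if d is not ...]
    # Dimensions not named explicitly keep their original relative order.
    rest = [d for d in all_dims if d not in named]
    out = []
    for d in requested:
        if d is ...:
            out.extend(rest)
        else:
            out.append(d)
    return tuple(out)


dims = ("x", "t", "y", "z")
print(expand_ellipsis((..., "x", "y"), dims))  # ('t', 'z', 'x', 'y')
print(expand_ellipsis(("x", "y", ...), dims))  # ('x', 'y', 't', 'z')
```

The resulting tuple is a full ordering, so it could be passed straight to a transpose that requires every dimension to be listed.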

@shoyer
Member

shoyer commented Oct 10, 2019 via email

@shoyer
Member

shoyer commented Oct 21, 2019

There's one edge case that might be worth thinking carefully about here:
Consider a dataset with two variables with dimensions ('w', 'x', 'y', 'z') and ('x', 'w', 'y', 'z'). Now we write .transpose(..., 'z', 'y'). What should the dimensions of variables on the resulting dataset be?

  1. Both ('w', 'x', 'z', 'y'), with ... filled in based on the order of dimensions in the overall dataset.
  2. ('w', 'x', 'z', 'y') and ('x', 'w', 'z', 'y'), with ... filled in for each variable separately.
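The two options can be compared with a short sketch that works on dimension names only. The variable names `a` and `b`, the helper `fill_ellipsis`, and the choice of first-seen order as the "overall dataset" order are all assumptions made for illustration.

```python
def fill_ellipsis(requested, dims):
    """Expand the single ``...`` in requested using the order found in dims."""
    named = [d for d in requested if d is not ...]
    rest = [d for d in dims if d not in named]
    i = requested.index(...)
    return tuple(requested[:i]) + tuple(rest) + tuple(requested[i + 1:])


# Two hypothetical variables with the dimension orders from the example.
variables = {"a": ("w", "x", "y", "z"), "b": ("x", "w", "y", "z")}
requested = (..., "z", "y")

# Option 1: fill ... once from a dataset-wide order (assumed here to be the
# first-seen order across variables) and apply it to every variable.
dataset_dims = tuple(dict.fromkeys(d for dims in variables.values() for d in dims))
target = fill_ellipsis(requested, dataset_dims)
option1 = {name: target for name in variables}

# Option 2: fill ... separately for each variable.
option2 = {name: fill_ellipsis(requested, dims) for name, dims in variables.items()}

print(option1)  # both variables end up with the same order
print(option2)  # each variable keeps its own leading order
```

In this sketch both variables end in ('z', 'y') either way; the options differ only in whether the leading dimensions of `b` are forced to match those of `a`.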

@max-sixty
Collaborator

I would vote for (2), given it's fairly easy to replicate (1) by passing the full list, and I think (2) is arguably slightly more expected

(NB this isn't how #3421 works now, but easy to change)

@shoyer
Member

shoyer commented Oct 21, 2019

I agree, I think (2) is what most users would expect.

@crusaderky
Contributor

crusaderky commented Oct 21, 2019

+1 for (2). Although user code that uses ... should not, by definition, care about the order of the variables that are not listed explicitly.

@crusaderky crusaderky mentioned this issue Oct 22, 2019