-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aesara as an array backend in Xarray #7515
Comments
@jhamman Yes, I think this is an interesting discussion to have. We actually want to add named dimensions and coordinates to PyTensor (a fork of aesara): https://github.com/pymc-devs/pytensor, adding this capability was one of the reasons for forking, as PyMC already has some support built-in (coords and dims) and uses xarray in various places. Tighter integration between the two is a great goal. |
Hi, I'm the creator and maintainer of Aesara and we are interested in discussing named dimensions and arrays. We don't know why @twiecki and his PyMC Labs group have been attempting to implicitly speak for us on this—or any other—matter. As you can see, the original issue in Aesara is still open, and we have not said that the feature cannot be implemented in some way. |
Author of the little diagram and core member of Aesara here 🙂 If you want to know a little more about where the project is headed you can consult the project’s Mission statement. Happy to chat, you can find my contact info on my GitHub profile, and/or join our new Discord server. |
It seems like there are at least 3 separate topics being discussed here.
|
(I was already typing this when Ryan posted so I'll finish anyway 😅) To clarify, what @jhamman is suggesting in this specific issue is xarray wrapping aesara, as opposed to aesara wrapping xarray. Both of these goals would be great, but in xarray we would particularly love to be able to wrap aesara because:
For xarray to wrap aesara aesara needs to provide a numpy-like API, ideally conforming to the python array api standard. If aesara already does this then we should try out the wrapping right now! If you're interested in this topic I invite you to drop in to a meeting of the Pangeo working group on distributed arrays! We have so far had talks from distributed computing libraries including Arkouda, Ramba, and cubed, all of which we are hoping to support as compute backends. If anyone is also separately interested in using xarray inside PyTensor / aesara then that's awesome, but we should try to track efforts in that direction on a different issue to keep this distinction clear. We plan to better support that direction of wrapping soon by fully exposing our (semi-private internal currently) lightweight |
Thanks all for the discussion. Welcome @brandonwillard, @twiecki, and @rlouf to the Xarray project. And thanks to @rabernat and @TomNicholas for helping orient the conversation. As a next step, I think it would be fun to try putting an Aesara array into Xarray and see how it goes. In our experience, this process inevitably brings up a few issues where interaction between Xarray and Aesara developers is fruitful. |
Aesara has a NumPy-like API, and we always planned to conform to the python array api standard (see aesara-devs/aesara#729 for instance). This would be the impetus for us to do it now. |
Here's a small example to get things started: x = aesara.shared(np.random.standard_normal((3, 4)))
xda = xr.DataArray(x, dims=('x', 'y')) this currently returns the following error: ---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[138], line 2
1 x = aesara.shared(np.random.standard_normal((3, 4)))
----> 2 xda = xr.DataArray(x, dims=('x', 'y'))
File ~/miniforge3/envs/demo-env/lib/python3.10/site-packages/xarray/core/dataarray.py:428, in DataArray.__init__(self, data, coords, dims, name, attrs, indexes, fastpath)
426 data = _check_data_shape(data, coords, dims)
427 data = as_compatible_data(data)
--> 428 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
429 variable = Variable(dims, data, attrs, fastpath=True)
430 indexes, coords = _create_indexes_from_coords(coords)
File ~/miniforge3/envs/demo-env/lib/python3.10/site-packages/xarray/core/dataarray.py:142, in _infer_coords_and_dims(shape, coords, dims)
140 dims = tuple(dims)
141 elif len(dims) != len(shape):
--> 142 raise ValueError(
143 "different number of dimensions on data "
144 f"and dims: {len(shape)} vs {len(dims)}"
145 )
146 else:
147 for d in dims:
ValueError: different number of dimensions on data and dims: 0 vs 2 This tells me there is a bit of work to do at the core of Aesara's numpy compatibility. Xarray will make frequent references to attributes like |
We'll fix the compatibility issues, but we first need to understand what the expectations on something like For a shared variable, it's always possible to get a value for |
At a minimum, xarray expects
This sounds a bit similar to discussions we have been having about wrapping ragged arrays in xarray, for which there are multiple ways you might choose to define the shape. The simplest way to guarantee that aesara can be wrapped by xarray is if aesara conformed to the array API standard, and they have a test suite you can use to check that. We are also working on our own testing framework that duck-typed array libraries like aesara could import to quickly test integration with xarray. |
Just wanted to drop in and remind people interested in this that we hold a bi-weekly pangeo working group for distributed array computing, which is the perfect place to come and ask about any questions over zoom! I'll be there at 1pm EST today. |
Is your feature request related to a problem?
I recently learned about a meta-tensor library called Aesara which got me wondering if it would be a good array backend for Xarray.
xref: aesara-devs/aesara#352, @OriolAbril, @twiecki
Has anyone looked into this yet?
The text was updated successfully, but these errors were encountered: