Support for memoryview-safe variable length arrays? #673

Open
shz9 opened this issue Dec 14, 2020 · 0 comments

shz9 commented Dec 14, 2020

Hi there,

I'm using Zarr to store ragged arrays in a fashion that's similar to what's outlined in the documentation:

import numcodecs, zarr, numpy as np
z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])

In my case, I need to retrieve those arrays and then process them in a Cython function:

cpdef process_ragged_arrays(int[:] r_arr):
    ...

for i in range(z.shape[0]):
    process_ragged_arrays(z[i])

However, this fails with the error ValueError: buffer source array is read-only. The error has already been discussed and worked around elsewhere (e.g. Dask#1978, scikit-allel#208), typically by passing the array through a function like this (h/t @alimanfoo):

def memoryview_safe(x):
    """Make array safe to run in a Cython memoryview-based kernel. These
    kernels typically break down with the error ``ValueError: buffer source
    array is read-only`` when running in dask distributed.

    See Also
    --------
    https://github.com/dask/distributed/issues/1978
    https://github.com/cggh/scikit-allel/issues/206
    """
    if not x.flags.writeable:
        if not x.flags.owndata:
            x = x.copy(order='A')
        x.setflags(write=True)
    return x
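
To make the behaviour of this helper concrete, here is a minimal sketch that needs only NumPy. It assumes that a retrieved element behaves like an array viewing immutable bytes, which `np.frombuffer` over a `bytes` object reproduces; whether Zarr's decoded arrays look exactly like this is an assumption, not something verified here. The variable names are illustrative.

```python
import numpy as np


def memoryview_safe(x):
    """Same helper as quoted above, repeated so this sketch runs on its own."""
    if not x.flags.writeable:
        if not x.flags.owndata:
            x = x.copy(order="A")
        x.setflags(write=True)
    return x


# Stand-in for a decoded ragged element (assumption): an array that views
# an immutable bytes object, so it is read-only and does not own its data.
raw = np.array([1, 3, 5], dtype=np.int64).tobytes()
ro = np.frombuffer(raw, dtype=np.int64)

# The buffer exported by this array is flagged read-only.
ro_is_readonly = memoryview(ro).readonly

# The helper has to copy here, because the array does not own its data.
safe = memoryview_safe(ro)
safe_is_readonly = memoryview(safe).readonly
copied = safe is not ro
```

In this situation the helper's cost is one copy per element, which is the overhead the question below is about.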

My question is: is it possible to make the ragged arrays memoryview-safe natively? I can certainly run memoryview_safe on each array I retrieve, but that incurs an overhead that I would like to avoid in my program.
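
One avenue that may sidestep the copy, assuming the kernel only reads its input: Cython 0.28 and later accepts read-only buffers when the typed memoryview is declared `const` (for example `const int[:] r_arr`). The sketch below is not Cython; it uses Python's built-in `memoryview` to illustrate the same idea, namely that a consumer which never asks for a writable buffer can use a read-only array as-is. The function name `read_only_sum` is hypothetical.

```python
import numpy as np


def read_only_sum(buf):
    """Sum a 1-D integer buffer without requesting write access."""
    view = memoryview(buf)  # works for read-only and writable exporters alike
    return sum(view)        # iterating the view yields Python ints


# Read-only stand-in for a retrieved ragged element (assumption, as above).
raw = np.array([1, 3, 5], dtype=np.int64).tobytes()
ro = np.frombuffer(raw, dtype=np.int64)

total = read_only_sum(ro)           # no copy is made
still_readonly = not ro.flags.writeable
```

If the kernel does write into its argument, this does not apply and a writable copy is still needed.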
