Variable length array #85

Open
seb5g opened this issue Apr 20, 2020 · 2 comments

seb5g commented Apr 20, 2020

I'm creating a backend that transparently uses pytables, h5py and h5pyd. I ran my test suite against h5pyd and hit a problem with variable-length arrays. h5py handles them with a "special" dtype: h5py.string_dtype or h5py.vlen_dtype. After some digging, I found the special_dtype function in h5pyd, whose docstring looks promising:

    vlen = basetype
        Base type for HDF5 variable-length datatype. This can be Python
        str type or instance of np.dtype.
        Example: special_dtype( vlen=str )

However, after trying it out, it seems to work only when vlen=str and not with a numpy dtype. Using a special dtype built from np.uint32, I could create a dataset, but accessing an element gave this traceback:

      File "<ipython-input-114-66578725db5f>", line 1, in <module>
        dset[0]
      File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\dataset.py", line 802, in __getitem__
        arr1d = bytesToArray(rsp, mtype, page_mshape)
      File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\base.py", line 503, in bytesToArray
        offset = readElement(data, offset, arr, index, dt)
      File "C:\Miniconda3\envs\pymodaq_dev\lib\site-packages\h5pyd\_hl\base.py", line 467, in readElement
        arr[index] = vlen(0)
    TypeError: 'numpy.dtype' object is not callable
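
The last frame of the traceback can be reproduced with numpy alone. The snippet below is only a sketch of why `vlen(0)` fails when `vlen` is a `numpy.dtype` instance, plus two placeholders that do work; it is not the h5pyd code, and the variable names are illustrative.

```python
import numpy as np

# Stand-in for the base type seen in the traceback: a numpy.dtype instance.
vlen = np.dtype("uint32")

# A dtype instance is not callable, which is the failure reported above.
try:
    vlen(0)
    call_failed = False
except TypeError:
    call_failed = True

# The scalar class behind the dtype is callable and builds a zero value.
zero = vlen.type(0)

# An empty array of the base type is another possible placeholder
# for a variable-length element.
empty = np.zeros(0, dtype=vlen)
```

With `vlen=str` the call `vlen(0)` succeeds because `str` is a class, which would be consistent with only the string case working at the time.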

Then, a bit further down in the code, I found:

    def check_dtype(**kwds):
        """ Check a dtype for h5py special type "hint" information. Only one
        keyword may be given.

        vlen = dtype
            If the dtype represents an HDF5 vlen, returns the Python base class.
            Currently only builting string vlens (str) are supported.  Returns
            None if the dtype does not represent an HDF5 vlen.
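
For readers unfamiliar with the "hint" idea in that docstring: as far as I understand, h5py attaches the base type to a numpy object dtype through the dtype's metadata. The sketch below assumes that mechanism and uses hypothetical helper names (`make_vlen_dtype`, `read_vlen_hint`); it does not reproduce the h5py or h5pyd implementation.

```python
import numpy as np


def make_vlen_dtype(basetype):
    """Hypothetical helper: tag an object dtype with a 'vlen' hint."""
    if basetype is not str:
        # Normalise numpy scalar classes such as np.uint32 to a dtype.
        basetype = np.dtype(basetype)
    return np.dtype("O", metadata={"vlen": basetype})


def read_vlen_hint(dt):
    """Hypothetical helper: return the base type, or None without a hint."""
    if dt.metadata is None:
        return None
    return dt.metadata.get("vlen")


str_dt = make_vlen_dtype(str)        # variable-length string
u32_dt = make_vlen_dtype(np.uint32)  # variable-length array of uint32
```

Under this assumption, a reader of the hint has to handle both a Python class (`str`) and a `numpy.dtype` instance as the base type.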

So the question is: is it or will it be possible to use any numpy dtype for variable length arrays in h5pyd?

Thx

seb5g (Author) commented Apr 20, 2020

After some more reading: your special_dtype function is the same as in the older h5py API (that is, before h5py 2.9). So these are just different names for the same functionality, except that in h5pyd, numpy special types are not working... yet?

jreadey (Member) commented Jan 17, 2023

Hey - sorry somehow I missed this issue till now...

You can use h5pyd.special_dtype with numpy types like this example: https://github.com/HDFGroup/h5pyd/blob/master/test/hl/test_vlentype.py#L50.

There's also support for the new API, vlen_dtype, as described here: h5py/h5py#1132.
E.g.: https://github.com/HDFGroup/h5pyd/blob/master/test/hl/test_dataset.py#L1640.
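
As a rough illustration of the new-style API, here is a minimal round trip written against h5py, since h5pyd needs a server to talk to. It assumes h5py is installed; the file name and dataset name are made up, and the file lives in memory only. Going by the comment above, the same `vlen_dtype` call should be available from h5pyd, but that is not verified here.

```python
import numpy as np
import h5py

# Variable-length dtype whose elements are 1-D uint32 arrays.
dt = h5py.vlen_dtype(np.dtype("uint32"))

# In-memory file: nothing is written to disk (illustrative file name).
with h5py.File("vlen_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("ragged", shape=(3,), dtype=dt)

    # Each element can have a different length.
    dset[0] = np.array([1, 2, 3], dtype="uint32")
    dset[1] = np.array([7], dtype="uint32")
    dset[2] = np.arange(5, dtype="uint32")

    # Read the elements back one at a time.
    rows = [dset[i] for i in range(3)]

    # Recover the base type from the dataset dtype.
    base = h5py.check_vlen_dtype(dset.dtype)
```

Each entry of `rows` is a numpy array of the base type, and `base` is the `uint32` dtype passed in at creation.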

The only special type missing is for regionrefs, which will hopefully be added soon.

I'll leave this issue open as a reminder to remove the old-style check_dtype and special_dtype functions, since they are not in h5py anymore.
