Support for nullable bool, int in dataframes #504
Comments
Mildly complicated by:
I like the third option since it's backend neutral, and doesn't require doing anything fancy with hdf5 or zarr. I suspect this issue will come up for nullable integer types as well. Maybe strings?
Option 3 would basically be a masked array?
Structurally, I think so, but concepts like assignment differ. I don't think we'd actually use that module.
I am wondering what the progress is on this issue. It is very annoying when analysis results don't get saved after several hours of work on HPC because a new column popped up with an unsaveable object type (in a script that worked just fine the other day, so there seemed to be no need to test for saveability). I would really appreciate it if this were addressed. Maybe you could add a temporary workaround that converts such objects to strings with a warning?
Hello, `TypeError: Can't implicitly convert non-string objects to strings Above error raised while writing key 'mt-0' of <class 'h5py._hl.group.Group'> from /. Above error raised while writing key 'var' of <class 'h5py._hl.files.File'> from /.` When I concatenate without this join parameter I can write/save with no problems.
I generally convert all problematic variables to strings.
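For anyone wanting to try that workaround, here is a minimal sketch of what "convert problematic variables to strings, with a warning" could look like. The helper name `stringify_object_columns` and the example frame are hypothetical, not part of any library; it just uses plain pandas, and you would apply it to `obs`/`var` yourself before writing.

```python
import warnings

import pandas as pd


def stringify_object_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df where every object-dtype column holds only str values.

    Hypothetical helper for illustration; emits one warning per converted column.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_object_dtype(out[col]):
            warnings.warn(f"Converting object column {col!r} to strings before writing")
            # map(str) converts every element, including None, to its str() form
            out[col] = out[col].map(str)
    return out


# Made-up example: column "b" mixes str, None and float, so it is object dtype.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", None, 3.5]})
clean = stringify_object_columns(df)
```

Note that this is lossy: missing values end up as the literal string `"None"`, and numbers lose their type, so it is a stopgap rather than a substitute for proper nullable support.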
Thank you, I'll try this.
Specifically for nullable values, there should be a release candidate out before the holidays.
I think this will be a possibility for the foreseeable future. It's just the nature of arrays interacting with Python's object system.
I'm going to make strings its own issue, and close this since the
Thanks for the suggestion. In fact,
What needs to happen
Support for nullable dtypes during IO. Allow for writing pandas string, integer, and boolean arrays (which can have null values) by saving a "null" mask along with them.
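To make the "values plus null mask" idea concrete, here is a rough sketch of the round trip for pandas nullable integer and boolean arrays. The function names `split_nullable` and `join_nullable` are made up for this example, and this is not necessarily how the on-disk format ends up being laid out; it only shows that each nullable array decomposes into two plain NumPy arrays that any backend can store. Strings are left out.

```python
import numpy as np
import pandas as pd


def split_nullable(arr):
    """Split a pandas nullable int/bool array into (values, mask) NumPy arrays.

    Hypothetical helper: mask is True where the original entry was null.
    """
    mask = np.asarray(arr.isna())
    # Nulls need some placeholder so values fit in a plain NumPy dtype.
    fill = False if isinstance(arr.dtype, pd.BooleanDtype) else 0
    values = arr.to_numpy(dtype=arr.dtype.numpy_dtype, na_value=fill)
    return values, mask


def join_nullable(values, mask):
    """Rebuild the pandas nullable array from stored values and mask."""
    if values.dtype == bool:
        return pd.arrays.BooleanArray(values, mask)
    return pd.arrays.IntegerArray(values, mask)


ints = pd.array([1, None, 3], dtype="Int64")
int_values, int_mask = split_nullable(ints)
ints_restored = join_nullable(int_values, int_mask)

bools = pd.array([True, None, False], dtype="boolean")
bool_values, bool_mask = split_nullable(bools)
bools_restored = join_nullable(bool_values, bool_mask)
```

The placeholder written under the mask (0 or `False` here) is arbitrary; a reader should only trust positions where the mask is `False`.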
Example
Full traceback
I have a report from the wild of writing working here, but reading (by cellxgene) failing.