fix load_bulk_data #13

Merged
merged 3 commits into from
Jan 1, 2024

Conversation

alondmnt
Contributor

Fixes a few annoying little bugs in load_bulk_data; see the commits for details.

test:

pl = PhenoLoader('rna_seq')
pl.load_bulk_data('batch_metadata_parquet')

@alondmnt
Contributor Author

We may also face another potential issue (I haven't dived into it, so it's just a hunch).

I think that get_function_for_field_type, when load_bulk_data is called on bulk dictionary fields, does not use the load_func of the parent_dataframe.

For instance, if I ask for a microbe, it uses the default read_parquet rather than what is specified for the group bulk data.
In the future, if I ask for a field in a DICOM file, it will not use the load_func for DICOM files, but rather the default read_parquet.

I think...
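
To make the hunch above concrete, here is a minimal, self-contained sketch of the suspected behaviour. It is not the library's code: the dictionary layout, the stub readers, and the two resolver functions are all hypothetical, and only the names load_func, parent_dataframe and field_type are taken from this thread. It shows how a resolver that looks only at the field's own entry falls back to the default reader, while one that also consults the parent entry would pick up the parent's load_func.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-ins for real readers; they only tag the path they receive.
def read_parquet_stub(path: str) -> str:
    return f"parquet:{path}"

def read_dicom_stub(path: str) -> str:
    return f"dicom:{path}"

# Hypothetical data dictionary. The bulk field carries its own load_func;
# the child field only names its parent_dataframe.
DATA_DICT: Dict[str, Dict[str, Optional[object]]] = {
    "dicom_files": {
        "field_type": "bulk",
        "parent_dataframe": None,
        "load_func": read_dicom_stub,
    },
    "pixel_spacing": {
        "field_type": "float",
        "parent_dataframe": "dicom_files",
        "load_func": None,
    },
}

def resolve_by_own_entry(field: str) -> Callable[[str], str]:
    """Suspected behaviour: consult only the field's own entry."""
    entry = DATA_DICT[field]
    return entry["load_func"] or read_parquet_stub

def resolve_with_parent(field: str) -> Callable[[str], str]:
    """One possible alternative: try the parent's load_func before the default."""
    entry = DATA_DICT[field]
    if entry["load_func"] is not None:
        return entry["load_func"]
    parent = entry["parent_dataframe"]
    if parent is not None and DATA_DICT[parent]["load_func"] is not None:
        return DATA_DICT[parent]["load_func"]
    return read_parquet_stub

own = resolve_by_own_entry("pixel_spacing")("scan.dcm")
via_parent = resolve_with_parent("pixel_spacing")("scan.dcm")
print(own)         # default reader is used
print(via_parent)  # parent's reader is used
```

Whether a parent fallback like this is the right fix depends on how the real dictionary links fields to their parent_dataframe, which this sketch only assumes.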

@MariaGorodetski
Contributor

Super, both the fix and the example!
Regarding the problem, I understand.
We use the field_type of the specific field in the bulk dictionary. I need to think about an elegant solution...

@alondmnt alondmnt merged commit 30a6831 into dev_v12 Jan 1, 2024
@alondmnt alondmnt deleted the dev-fix-load_bulk branch January 21, 2024 07:38