fix load_bulk_data #13

alondmnt · 2023-12-31T14:18:50Z

annoying little bugs. see commits.

test:

pl = PhenoLoader('rna_seq')
pl.load_bulk_data('batch_metadata_parquet')

alondmnt · 2023-12-31T15:07:48Z

we may also face another potential issue (haven't dived into it, so it's just a hunch).

I think that get_function_for_field_type, when calling load_bulk_data on bulk dictionary fields, does not use the load_func of the parent_dataframe.

for instance, if I ask for a microbe, it uses the default read_parquet rather than the load_func specified for the group's bulk data.
in the future, if I ask for a field in a DICOM file, it will not use the load_func for DICOM files, but rather the default read_parquet.

I think...
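
a rough sketch of the fallback order I have in mind, in case it helps. this is not the actual get_function_for_field_type code: resolve_load_func, DEFAULT_LOAD_FUNC, the property dicts and the "read_dicom" name are all made up for illustration. the idea is only that a bulk dictionary field with no load_func of its own should inherit the one from its parent before falling back to read_parquet.

```python
# Hypothetical sketch: none of these names come from the real code base.
DEFAULT_LOAD_FUNC = "read_parquet"


def resolve_load_func(field_props, parent_props=None):
    """Return the loader name for a field.

    Order of precedence: the field's own load_func, then the parent's
    load_func, then the default.
    """
    for props in (field_props, parent_props or {}):
        load_func = props.get("load_func")
        if load_func:  # skip missing, None or empty values
            return load_func
    return DEFAULT_LOAD_FUNC


# Assumed example properties: the parent declares a loader, the child does not.
parent = {"field_type": "bulk", "load_func": "read_dicom"}
child = {"field_type": "bulk"}

without_parent = resolve_load_func(child)       # parent ignored: falls back to the default
with_parent = resolve_load_func(child, parent)  # parent consulted: inherits its loader
```

the first call mirrors the behavior I suspect we have now, and the second is what I would expect instead.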

MariaGorodetski · 2023-12-31T19:09:58Z

Super! - on the fix and the example.
Regarding the problem, I understand.
We use the field_type of the specific field in the bulk dictionary. I need to think about an elegant solution...

alondmnt added 2 commits December 31, 2023 14:14

fix: handle unspecified load_func in dict properties

9fc308e

fix: handle bulk with no common index with main table

4683d70

alondmnt requested a review from MariaGorodetski December 31, 2023 14:19

fix: load_func error handling

54146dd

MariaGorodetski approved these changes Dec 31, 2023

alondmnt merged commit 30a6831 into dev_v12 Jan 1, 2024

alondmnt deleted the dev-fix-load_bulk branch January 21, 2024 07:38
