Clarify usage of "dot notation" for sub-objects #841
Comments
{
"Name": "Human Connectome Project",
"BIDSVersion": "1.3.0",
"License": "CC0",
"Authors": ["1st author", "2nd author"],
"Funding": "P41 EB015894/EB/NIBIB NIH HHS/United States",
"Genetics": {
"Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
"Database": "https://www.ncbi.nlm.nih.gov/gap/",
"Descriptors": "doi:10.1016/j.neuroimage.2013.05.041"
},
"DatasetDOI": "doi:myimagaing data"
}
This could be changed to drop the subfields, if that's the direction we are taking; whatever is easier:
{
"Name": "Human Connectome Project",
"BIDSVersion": "1.3.0",
"License": "CC0",
"Authors": ["1st author", "2nd author"],
"Funding": "P41 EB015894/EB/NIBIB NIH HHS/United States",
"GeneticsDataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
"GeneticsDatabase": "https://www.ncbi.nlm.nih.gov/gap/",
"GeneticsDescriptors": "doi:10.1016/j.neuroimage.2013.05.041",
"DatasetDOI": "doi:myimagaing data"
} |
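To make the trade-off between the two layouts concrete, here is a minimal Python sketch that converts between the nested form and the prefixed, flattened form shown above. The function names `flatten_subobject` and `nest_subobject` are made up for this illustration and are not part of any BIDS tool; the `Genetics` prefix convention is the one proposed in this comment, not something the specification defines.

```python
import json


def flatten_subobject(description, key):
    """Return a copy where the nested object under `key` becomes prefixed top-level keys."""
    flat = {}
    for name, value in description.items():
        if name == key and isinstance(value, dict):
            for sub_name, sub_value in value.items():
                flat[f"{key}{sub_name}"] = sub_value
        else:
            flat[name] = value
    return flat


def nest_subobject(description, key, sub_names):
    """Inverse operation: gather `key`-prefixed entries back into one nested object."""
    nested = {}
    sub = {}
    for name, value in description.items():
        suffix = name[len(key):]
        if name.startswith(key) and suffix in sub_names:
            sub[suffix] = value
        else:
            nested[name] = value
    if sub:
        nested[key] = sub
    return nested


# Shortened version of the example from the comment above.
nested_description = {
    "Name": "Human Connectome Project",
    "Genetics": {
        "Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
        "Database": "https://www.ncbi.nlm.nih.gov/gap/",
        "Descriptors": "doi:10.1016/j.neuroimage.2013.05.041",
    },
}

flat_description = flatten_subobject(nested_description, "Genetics")
print(json.dumps(flat_description, indent=2))
```

Note that the inverse direction needs the list of sub-object names to be known in advance, which is one way of seeing why a flattened layout pushes a naming convention onto every tool that reads the file.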
That would be ideal but would break backward compatibility, no? |
I think we could use something like the following for bids-specification/src/schema/metadata/iEEGCoordinateSystem.yaml Lines 1 to 19 in 9a79dff
Then we'll need to update the rendering code to render sub-objects in tables, though. |
Thanks for the pointer: I will give this a try. |
Hum... There is a 🤦 OK, I will only get rid of the dot notation then. |
Genetics subobject in the schema
I actually like the dot notation because it shows that the item after the dot is a member of the object before the dot 🤔 similar to how you refer to attributes in Python objects. |
I kind of agree, though we would have to clearly formalize somewhere in the schema that this is how sub-objects should be "encoded". FYI: I raised the issue because, given how those yml files are named at the moment, BIDS-matlab complained when it tried to create a structure with a fieldname |
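The end of the comment above is cut off. Assuming the offending field name was a dotted one such as `Genetics.Dataset` (the form used in the rendered table), the problem can be sketched in Python as follows. The regular expression is only an approximation of MATLAB's rule for struct field names (a letter followed by letters, digits, or underscores, ignoring the length limit), and `expand_dotted_keys` is a hypothetical helper, not a BIDS-matlab function.

```python
import re

# Approximation of MATLAB's rule for struct field names: start with a letter,
# then only letters, digits, or underscores (the length limit is ignored here).
FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name):
    """Return True if `name` could be used directly as a struct field name."""
    return FIELD_NAME.fullmatch(name) is not None


def expand_dotted_keys(entries):
    """Turn {"Genetics.Dataset": x} into {"Genetics": {"Dataset": x}}."""
    result = {}
    for dotted, value in entries.items():
        parts = dotted.split(".")
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


# Hypothetical metadata entries keyed by dotted names (values are placeholders).
dotted_entries = {
    "Genetics.Dataset": "placeholder",
    "Genetics.Database": "placeholder",
    "Genetics.Descriptors": "placeholder",
}

print([name for name in dotted_entries if not is_valid_field_name(name)])
print(expand_dotted_keys(dotted_entries))
```

A downstream tool would have to apply some expansion of this kind itself unless the schema formalizes what the dot means.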
I see 🤔 so we need to:
(in order for downstream packages like bids-matlab to reliably work with this) Let's also ask @effigies |
to me this seems like a big decision and has much wider ramifications. in particular it takes json (and json schema), which is perfectly capable of nested structures and arrays and imposes a key convention that every tool has to implement. personally i think nesting should be left, with yaml describing "Genetics" in this case as type object. i think this specific issue is more than just the flattening of things. does the descriptor allow for multiple genetic datasets? the paper that it references does. and different omic data could sit in different databases. further this is also about consistency of terms. a genetics dataset is perhaps no different from a bids dataset, so having the terms inside be aligned with terms at the bids dataset level will allow reuse of the same terminology that's in bids without having to create a prefix (which is effectively what the dot or underscore notation is doing). |
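As a rough illustration of the "describe Genetics as type object" option, here is a small Python sketch. The layout of `GENETICS_SCHEMA` is hypothetical and simplified (every property is treated as a plain string, and it does not model multiple genetic datasets); it is not the actual content of the BIDS schema, and `validate` is a toy checker rather than a JSON Schema implementation.

```python
# Hypothetical, simplified schema entry: "Genetics" is one object with named properties.
GENETICS_SCHEMA = {
    "type": "object",
    "properties": {
        "Dataset": {"type": "string"},
        "Database": {"type": "string"},
        "Descriptors": {"type": "string"},
    },
}

# Mapping from the schema type names used above to Python types.
PYTHON_TYPES = {"string": str, "object": dict}


def validate(value, schema, path="Genetics"):
    """Return a list of problems found in `value`; the list is empty if it fits `schema`."""
    if not isinstance(value, PYTHON_TYPES[schema["type"]]):
        return [f"{path}: expected {schema['type']}"]
    problems = []
    if schema["type"] == "object":
        properties = schema.get("properties", {})
        for name, item in value.items():
            if name in properties:
                # Recurse into known keys so nested objects are checked too.
                problems += validate(item, properties[name], f"{path}.{name}")
            else:
                problems.append(f"{path}.{name}: unknown key")
    return problems


genetics = {
    "Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
    "Database": "https://www.ncbi.nlm.nih.gov/gap/",
    "Descriptors": "doi:10.1016/j.neuroimage.2013.05.041",
}
print(validate(genetics, GENETICS_SCHEMA))
print(validate({"Dataset": 42, "Extra": "x"}, GENETICS_SCHEMA))
```

With this layout the nesting lives in the schema itself, so no key prefix or dot convention is needed in the metadata files.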
😆 In my defense I was traveling yesterday, so I was operating on like 50% mental capacity.
When you say "encoded", do you mean how they're shown in the specification or how they're actually specified in metadata? Like making If it's just a matter of rendering, then I think we just need a |
+1 on all of @satra's comments. We chose attributes of the Genetics object (indeed like Python, @sappelhoff), with multiple omics possible. |
I would also expect the spec to formalize the semantics of the dot notation for tree-like structures. |
OK we talked with the maintainers about this. We agreed that this is "mostly" a rendering issue at the moment. So we'll keep the metadata yml filename and metadata name as they were. For example:
The idea is to create a new macro to render the sub-objects of an object in a separate table without the dot notation. Ideally we would want to formalize that this is how sub-object metadata should be encoded in the schema and systematize this rendering in the spec. |
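A minimal sketch of what such a rendering step could look like, written as a plain Python function. This is not the macro from the bids-specification repository: the function name `render_subobject_table`, the column headers, and the structure of `genetics_properties` are all assumptions made for illustration, and the requirement levels and descriptions are placeholders rather than text from the specification.

```python
def render_subobject_table(object_name, properties):
    """Build a Markdown table listing the keys of one object, without dot notation."""
    lines = [
        f"Keys of the {object_name} object:",
        "",
        "| Key name | Requirement level | Description |",
        "|----------|-------------------|-------------|",
    ]
    for key, info in properties.items():
        lines.append(f"| {key} | {info['requirement']} | {info['description']} |")
    return "\n".join(lines)


# Placeholder content: requirement levels and descriptions are illustrative only.
genetics_properties = {
    "Dataset": {"requirement": "placeholder", "description": "placeholder description"},
    "Database": {"requirement": "placeholder", "description": "placeholder description"},
    "Descriptors": {"requirement": "placeholder", "description": "placeholder description"},
}

table = render_subobject_table("Genetics", genetics_properties)
print(table)
```

The parent object is named once in the line above the table, so each row can use the bare key (`Dataset` rather than `Genetics.Dataset`).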
FWIW, our reasoning was that genetics was a single 'object' added to the BIDS descriptor, so anything related to it would be an 'attribute' of that object. Maybe that's a good enough approach to define how to use dot notation ... |
yup makes sense. |
Once #919 is merged and we implement a macro for generating tables for individual objects, I think we will be able to get rid of the |
The Genetics object in dataset_description has 3 sub-objects:
- Dataset
- Database
- Descriptors
https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/08-genetic-descriptor.html#dataset-description
Currently the schema has a single yaml file for each sub-object, but there is no yaml file for Genetics itself.
Moreover, the nesting of the objects is not captured by the schema.
And the way the sub-objects are mentioned in the rendered table involves a dot notation (Genetics.Dataset) that appears nowhere else in the spec.
Suggestion
- create a yaml file for Genetics
Possible issues
Not sure how to formalize the nested structure in the schema itself:
- Should the sub-objects of Genetics be defined in the yaml of Genetics?
- Should we keep the yaml files for each sub-object and mention in the README that this dot notation means they are sub-objects?
I don't think it makes sense to have pieces of metadata just called Dataset, Database, Descriptors when they only relate to Genetics.
Tagging @tsalo and @CPernet for ideas and feedback on this.