-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filesystem paths for diffusion models #32
Comments
One possibility to consider regarding option 1 is that if suffix " |
Another thought regarding option 1: With the current state of the proposal (which is option 2), there is no stringent need to separate between intrinsic and extrinsic parameters; the only difference is whether or not the data are compulsory in order to interpret the model. However if option 1 were instead adopted, one would then have to question whether it makes sense to have something like an FA parametric map stored with a file name " (P.S. "Intrinsic" vs. "Extrinsic" isn't necessarily the best nomenclature and could be changed; it's maybe something more like "model parameters" vs. "model-derived parameters") |
Will consider implementing a PR (following merge of #40) showing what option 1 might look like in reality. |
Making an attempt at reverting back to solution 1 reminded me of one of the issues faced there. With each unique model having a pre-defined suffix and set of parameters (i.e. option 2 & the current state of the draft), the data representation of each image is established in the specification, and so just a single JSON file can be used to store all metadata related to the model. With option 1, with every DWI model falling under the suffix of "
Personally I would prefer option 2. That would actually allow a specification that is agnostic to the nature of specific DWI models, and (with adequate foresight) any new DWI model could be stored in a BIDS Derivatives compliant way without ever having to modify the specification to integrate that model. The problem however is the inheritance principle. The data representation being unique for each model parameter means that there needs to be a unique JSON for each model parameter image. There is however other information that apply to the model as a whole, rather than to specific parameters; e.g. model fitting parameters, mask image, that sort of thing. Rather than duplicate this information across all model parameter JSONs, it would be preferable to instead have one JSON for the model as a whole that provides this information, and one JSON per parameter that provides information relevant to that parameter only. But that would require multiple applicable JSONs at one level of the filesystem hierarchy, which is not permitted. The way I see it, there are two possible solutions to this:
|
I originally posted about the above issue in bids-standard/bids-specification#259. I think what I need to do is first write a proposal PR for bids-standard/bids-specification#482 to reduce ambiguity about the current state of the inheritance principle, and then write a second PR proposing specific language for bids-standard/bids-specification#259, i.e. what the language would need to be in order to facilitate what I'm looking for here. |
It looks like this one is a priority. If we move on with this we are left with a single PR is that right? |
This is the primary one on which I want to get feedback from BIDS maintainers before proceeding much further. bids-standard/bids-specification#946 is ready for review; I then need to, on top of that, propose a solution for bids-standard/bids-specification#259 in order to facilitate option 1, which I think is my personal long-term preference. It was indicated in a comment here that maintainers may be open to doing so as a minor rather than major revision, though this would require dialogue with validator maintainers as well. |
With rejection of bids-standard/bids-specification#1003 and bids-standard/bids-specification#1280 and acceptance of bids-standard/bids-specification#1602, proceeding with option 1 "single file suffix" " |
This Issue addresses the fundamental question of the file names (& potentially sub-directory placement) of diffusion model derivatives should be specified.
This is highly important for DWI models since the outcomes of fitting a model to the data typically does not result in the generation of a single output file, but many files, and quite often the individual files are not very interpretable in isolation.
Single fixed file suffix
This is the strategy that was initially proposed for BEP016 before I took a major hacksaw to things in [ENH] Restructuring of DWI derivatives #5, but this particular question was not exhaustively considered due to the large sweeping changes there.
Advantage: very easy for the BIDS validator to check derivative data for conformity, since there's a single suffix across all possible diffusion models.
Disadvantage: file names do become a bit longer than for other options, since for multiple files in the directory you will have "
_model-<label> ... _diffmodel
" appearing.Unique file suffix per model
This is what is currently shown as of [ENH] Restructuring of DWI derivatives #5.
Advantage: Primary identifying data contents of files are shown at end of file basename, whereas if as per 1. above this suffix is simply "
diffmodel
" you then have to read back through the prior key-value fields to find the "_model-
" item to find out what you're looking at; fine for a machine, slightly annoying for a human.Disadvantage: Slightly harder for the validator. The current text suggests that the specified model suffixes are compulsory only if it is one of those models that is used. This means that if the validator encounters a suffix it doesn't know, it can only produce a warning, not an error. This therefore has potentially very large ramifications for BIDS derivatives more generally (assuming it is intended that such be used for BIDS derivatives).
Use filesystem hierarchy
It may be possible for a BIDS derivatives
dwi/
directory to become exceptionally full if one performs a lot of operations and generates a lot of derivative data. The ability to "group" those files corresponding to a specific model fit into a sub-directory of the modality directory would make things easier to navigate for users.Consider an example App that both performs DWI pre-processing and fits the tensor model. Currently this might look like:
The model derivatives could instead be placed into their own sub-directory. For example (noting that there are a number of possible alternatives here):
This might be beneficial in the longer term as derivatives become more widespread & complex, in order to not clog up a basic modality directory structure with large amounts of unrelated data.
As far as the validator is concerned, this would mean that it would know to give freedom to file basename suffixes where a file resides in a sub-directory of the modality directory. It could also check that the names of those sub-directories conform. E.g. Consider the following where there are multiple tensor fits performed using different methods:
The sub-directories here should not I think include "
sub-<participant_label>
", but should at least confirm to the basic key-value structure with a final identifier (please let me know if the BIDS community has settled on nomenclature for such things so that we can speak the same language).The text was updated successfully, but these errors were encountered: