I have been thinking about how bidscoin and sovabids work. One of the core ideas is collecting information from the files (bidscoin does it from the DICOM header) and returning that information as "attributes". One then populates the bids information with the collected attributes. See the following example:
In our case I mainly extract information from the path, since usually (correct me if I'm wrong) the headers of eeg files do not contain such information. In any case, if we generalize the idea to extracting information from any given source, then we could implement a rules file (or template/configuration file) that has two main parts:
heuristics
bids
The heuristics part would set up how the information is collected from the sources. A single heuristic in the heuristics part corresponds to a function in heuristics.py. These functions should return dictionaries.
heuristics:
  h0: # arbitrary name to refer to a particular instance of a heuristic (needed to refer to its results later)
    heuristic: # name of the python function (inside heuristics.py) that executes the heuristic, it must return a dictionary
    args: # a dictionary with the arguments of the function
      arg1:
      arg2:
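A minimal sketch of what one such function might look like, assuming a heuristic that extracts fields from a file path with a regular expression. The function name, its signature, and the example path are hypothetical; the only requirement taken from the proposal is that the function returns a dictionary.

```python
import re


def from_regex_pattern(path, pattern, fields):
    """Hypothetical heuristic: extract named fields from a file path.

    `pattern` is expected to contain one capture group per entry in
    `fields`. Returns a dictionary, as every heuristic should.
    """
    match = re.search(pattern, path)
    if match is None:
        return {}  # nothing could be inferred from this path
    return dict(zip(fields, match.groups()))


result = from_regex_pattern(
    "data/sub-01/ses-02/sub-01_ses-02_task-rest.vhdr",
    r"sub-([a-zA-Z0-9]+)_ses-([a-zA-Z0-9]+)_task-([a-zA-Z0-9]+)\.vhdr",
    ["subject", "session", "task"],
)
print(result)
```

The keys of the returned dictionary are what the bids part of the file would later refer to.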
Then, whenever we need the result of a heuristic, we can refer to it with <nameOfHeuristic.fieldOfTheReturn>. For example:
bids:
  subject: <h0.subject>
For tabular data the semantics could be:
bids:
  channels:
    0:
      name: <h0.someRow.someColumn>
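The reference semantics above could be resolved roughly as follows. This is only a sketch under stated assumptions: heuristic outputs are already available as nested dictionaries, and `lookup`, `resolve`, and the sample data are hypothetical names chosen for illustration.

```python
import re

# Matches one reference such as <h0.subject> or <h0.row0.name>.
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def lookup(results, reference):
    """Follow a dotted reference through nested dictionaries."""
    value = results
    for key in reference.split("."):
        value = value[key]
    return value


def resolve(template, results):
    """Recursively replace references in a template with heuristic outputs."""
    if isinstance(template, dict):
        return {k: resolve(v, results) for k, v in template.items()}
    if isinstance(template, str):
        whole = PLACEHOLDER.fullmatch(template.strip())
        if whole:
            # The whole value is one reference: keep the original type.
            return lookup(results, whole.group(1))
        # References embedded in longer text are substituted as strings.
        return PLACEHOLDER.sub(
            lambda m: str(lookup(results, m.group(1))), template
        )
    return template


results = {"h0": {"subject": "01", "row0": {"name": "FCz"}}}
template = {
    "subject": "<h0.subject>",
    "channels": {0: {"name": "<h0.row0.name>"}},
}
resolved = resolve(template, results)
print(resolved)
```

Single values and tabular values go through the same code path here; a table is simply a dictionary of rows.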
Here is what a more complete example would look like, along with some ideas and thoughts regarding the technical details of this approach:
heuristics :
  h0:
    heuristic: pattern_from_example # example-based inference
    args:
      source: source example path
      target: target example path
  h1: # each heuristic has an arbitrary name given by the user (obviously it must be unique)
    heuristic : from_regex_pattern # name of the function in heuristics.py
    args :
      pattern: some regex pattern
      fields:
        - task
        - subject
        - session
        - group
  h1b: # heuristics could take as input the outputs of another heuristic
    # this feature would be related to the idea of example-based inference, after all those would be heuristics too
    heuristic : from_regex_pattern
    args :
      pattern: <h0.pattern>
      fields: <h0.fields>
  h2:
    heuristic: from_placeholder_pattern
    args :
      pattern: some placeholder pattern
  h3:
    heuristic: from_tabular
    args:
      file: some csv file for example
      split: ,
  h4:
    heuristic: from_dictionary
    args:
      file: some json or yaml file # extracts as a dictionary
  h5:
    heuristic: from_tabular
    args:
      file: some tsv file
      split: \t
bids:
  entities : # or path, what is better? After all the entities define the path.
    session : <h1.session>
    task : <h1.task>
    acquisition :
    run :
    subject: <h1.subject>
  # Two options to refer to electrodes/channels: either by the original name in the raw file, or by the index position in the raw file. Each has its pros and cons.
  electrodes : # configures electrodes.tsv
    - nameOfElectrodeInRawFile : # Referring to the electrode by the name
        # renaming can be implemented here but it may add some technical complications if we use the name itself to identify
        # or index the electrodes. Should we drop support for that specific functionality?
        # A power user could just use the code_execution feature if they need to rename
        name : <h3.somerow.somecolumn> # New name given (renaming)
        # should somerow and somecolumn be the indexes or the name/key of the value (i.e. h3.FCz.XCoordinate)?
        x : <h5.somerow.somecolumn> + <h5.somerow.somecolumn> # should we include basic operations?
  channels:
    # maybe drop support for renaming since the name works as an index, and keep the possibility of retyping
    - $indexinRawFile : # Referring to the electrode by its position in the raw file. Assume channels are not reordered at any point
        name : <h3.somerow.somecolumn>
  sidecar:
    PowerLineFrequency : <h4.line_freq>
  # Setting up the participants tsv columns from another file
  participants:
    group: <h1.group> # or maybe <htable.<h1.subject>> (the h1 heuristic output serves as an index for the table)
So I was wondering if this was something worth exploring for the community. @civier
Comments regarding this approach
Idea: Maybe develop an API standard for heuristics. The input args could vary but the output should be a dictionary-like return that allows retrieval of tabular data, dictionary data, and single pure data types.
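One possible shape for such a standard, sketched under the assumption that heuristics are plain Python functions collected in a registry. The `heuristic` decorator, the `HEURISTICS` registry, and both example functions are hypothetical; the only rule enforced is the one stated above, that the output is dictionary-like.

```python
from collections.abc import Mapping

HEURISTICS = {}  # hypothetical registry: function name -> checked callable


def heuristic(func):
    """Register a heuristic and check that it returns a mapping."""
    def wrapper(**args):
        output = func(**args)
        if not isinstance(output, Mapping):
            raise TypeError(
                f"{func.__name__} must return a dictionary-like object"
            )
        return output
    HEURISTICS[func.__name__] = wrapper
    return wrapper


@heuristic
def from_tabular_text(text, split=","):
    """Tabular data as {row_name: {column: value}}; first column names the row."""
    header, *rows = [line.split(split) for line in text.strip().splitlines()]
    return {row[0]: dict(zip(header[1:], row[1:])) for row in rows}


@heuristic
def constant(value):
    """A single plain value, wrapped so the output is still a dictionary."""
    return {"value": value}


table = HEURISTICS["from_tabular_text"](text="name,x,y\nFCz,0.0,0.5\nCz,0.0,0.0")
single = HEURISTICS["constant"](value=50)
print(table["FCz"]["x"], single["value"])
```

With this layout the arguments differ per heuristic, while tabular data, dictionary data, and single values are all retrieved by key.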
Warning: If the information inferred by mne collides with what is obtained from the rules file (a better name for it now would be "template"), then sovabids must have a way to know that it needs to change what mne-bids wrote. Potential problems may arise with information encoded in the eeg files written by mne-bids (specifically, the information inside the vhdr, vmrk, eeg, edf, bdf, set, and fdt files).
Idea: Maybe it would be better to use $something$ as the enclosing delimiter instead of <>; it may be easier to parse.
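To compare the two delimiter choices, here is a small sketch of a reference finder parametrized by its enclosing characters. `make_finder` is a hypothetical helper and only supports single-character delimiters.

```python
import re


def make_finder(opening, closing):
    """Build a function that lists references enclosed by the delimiters."""
    # A reference body is any run of characters that are not delimiters.
    inner = f"[^{re.escape(opening)}{re.escape(closing)}]+"
    pattern = re.compile(
        re.escape(opening) + f"({inner})" + re.escape(closing)
    )
    return lambda text: pattern.findall(text)


find_angle = make_finder("<", ">")
find_dollar = make_finder("$", "$")

angle_refs = find_angle("<h5.FCz.x> + <h5.Cz.x>")
dollar_refs = find_dollar("$h5.FCz.x$ + $h5.Cz.x$")
nested_refs = find_angle("<htable.<h1.subject>>")
print(angle_refs, dollar_refs, nested_refs)
```

In this sketch both styles parse equally easily. One difference worth weighing is that distinct opening and closing characters let the finder pick out the innermost reference of a nested expression, which identical delimiters cannot express.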
Warning: The channel type counts (EEGChannelCount and similar fields) in the sidecar json should probably be removed or otherwise handled, since if the user retypes channels the counts have to be updated.
A good thing about this design is that it is extensible.
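One way to handle this, sketched under the assumption that the final channel types are available as a list after retyping, is to recompute the counts rather than trust the ones written earlier. `update_channel_counts` and the sample sidecar are hypothetical; the type-to-field mapping covers only a few of the count fields.

```python
from collections import Counter

# Assumed mapping from channel "type" values to sidecar count fields.
COUNT_FIELDS = {
    "EEG": "EEGChannelCount",
    "EOG": "EOGChannelCount",
    "ECG": "ECGChannelCount",
    "EMG": "EMGChannelCount",
    "MISC": "MiscChannelCount",
    "TRIG": "TriggerChannelCount",
}


def update_channel_counts(sidecar, channel_types):
    """Return a copy of the sidecar with counts recomputed from the types."""
    counts = Counter(channel_types)
    updated = dict(sidecar)
    for channel_type, field in COUNT_FIELDS.items():
        updated[field] = counts.get(channel_type, 0)
    return updated


# Counts as written before the user retyped the third channel to EOG.
sidecar = {"PowerLineFrequency": 50, "EEGChannelCount": 3, "EOGChannelCount": 0}
retyped = ["EEG", "EEG", "EOG"]
new_sidecar = update_channel_counts(sidecar, retyped)
print(new_sidecar["EEGChannelCount"], new_sidecar["EOGChannelCount"])
```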
An "execute heuristic" step would be needed whenever <heuristic.something> is found. Whether the heuristic is executed every time it is called, or whether we keep an old result in memory, is an open question. One approach is to add a "return" key to the dictionary of heuristics in memory. If that key does not exist we run the heuristic; otherwise we just read the stored result.
To think about: This covers having information from a single file that applies to all subjects. What if each subject has its own metadata file? How could we implement such an idea?
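The "return" key approach could look like the following sketch. It assumes the heuristics section has been loaded into a dictionary and that `functions` maps heuristic names to callables; `run_heuristic` and the stand-in `from_dictionary` are hypothetical.

```python
def run_heuristic(name, heuristics, functions):
    """Run a heuristic once and reuse its stored result afterwards."""
    entry = heuristics[name]
    if "return" not in entry:  # not executed yet
        func = functions[entry["heuristic"]]
        entry["return"] = func(**entry.get("args", {}))
    return entry["return"]


calls = []  # records every real execution, to show the caching


def from_dictionary(file):
    """Stand-in for a heuristic that would read a json or yaml file."""
    calls.append(file)
    return {"line_freq": 50}


heuristics = {
    "h4": {"heuristic": "from_dictionary", "args": {"file": "meta.json"}}
}
functions = {"from_dictionary": from_dictionary}

first = run_heuristic("h4", heuristics, functions)
second = run_heuristic("h4", heuristics, functions)
print(first["line_freq"], len(calls))
```

The second call returns the stored dictionary without running the function again, so a file is read only once no matter how many references point to it.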
One way to do this is to allow a heuristic's output to become another heuristic's input. Essentially, inferring the location of a metadata file describing a single eeg file would be a heuristic whose output goes into another heuristic that reads the metadata file. This may be too complicated for users, though.
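A sketch of such chaining, assuming arguments may contain references of the form <name.field> that are resolved by first executing the referenced heuristic. The `execute` function, both heuristics, and the in-memory `FAKE_FILES` store are hypothetical, and the sketch does not detect circular references.

```python
import re

REFERENCE = re.compile(r"<(\w+)\.(\w+)>")

# Stand-in for files on disk, so the example runs without touching the filesystem.
FAKE_FILES = {"sub-01_eeg.json": {"line_freq": 50}}


def locate_metadata(eeg_file):
    """Infer the metadata file that sits next to one eeg file."""
    return {"path": eeg_file.rsplit(".", 1)[0] + ".json"}


def from_dictionary(file):
    """Read a metadata file (here: from the in-memory stand-in)."""
    return FAKE_FILES[file]


def execute(name, heuristics, functions, cache):
    """Run a heuristic after resolving references to other heuristics."""
    if name in cache:
        return cache[name]
    entry = heuristics[name]
    args = {}
    for key, value in entry.get("args", {}).items():
        match = REFERENCE.fullmatch(value) if isinstance(value, str) else None
        if match:
            dependency, field = match.groups()
            value = execute(dependency, heuristics, functions, cache)[field]
        args[key] = value
    cache[name] = functions[entry["heuristic"]](**args)
    return cache[name]


heuristics = {
    "hloc": {"heuristic": "locate_metadata", "args": {"eeg_file": "sub-01_eeg.vhdr"}},
    "hmeta": {"heuristic": "from_dictionary", "args": {"file": "<hloc.path>"}},
}
functions = {"locate_metadata": locate_metadata, "from_dictionary": from_dictionary}
metadata = execute("hmeta", heuristics, functions, cache={})
print(metadata)
```

From the user's side the only new concept is writing a reference inside args; the ordering of executions follows from the references.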
MNE-BIDS and mappings
Should mne/mne-bids themselves be heuristics? If we concentrate on the metadata rather than the actual file, we could offer the special possibility of inferring channels.tsv and the sidecar json (and other files) from mne/mne-bids. Another way is to find everything mne-bids did (for example, by writing to an isolated temporary directory and reading all the metadata files it produced) and to put that in the mappings so the information is transparent there. It is ugly but easier to maintain than knowing all of the mne-bids logic.
In general the challenge is to identify everything mne-bids did in the output and find a way to encode that information in the mappings. The problem arises when both mne-bids and the user write to the same information.
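The "write to a temporary directory and read everything back" idea could be sketched as below. To stay self-contained, the example does not call mne-bids; it writes two small stand-in files by hand where a real run would have mne-bids write its output. `collect_metadata` is a hypothetical helper.

```python
import csv
import json
import tempfile
from pathlib import Path


def collect_metadata(root):
    """Read every json and tsv file under root into one dictionary."""
    root = Path(root)
    collected = {}
    for path in sorted(root.rglob("*")):
        key = path.relative_to(root).as_posix()
        if path.suffix == ".json":
            collected[key] = json.loads(path.read_text())
        elif path.suffix == ".tsv":
            with path.open(newline="") as handle:
                collected[key] = list(csv.DictReader(handle, delimiter="\t"))
    return collected


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for conversion output; these files are written by hand here.
    eeg_dir = Path(tmp) / "sub-01" / "eeg"
    eeg_dir.mkdir(parents=True)
    (eeg_dir / "sub-01_task-rest_eeg.json").write_text(
        json.dumps({"PowerLineFrequency": 50})
    )
    (eeg_dir / "sub-01_task-rest_channels.tsv").write_text(
        "name\ttype\nFCz\tEEG\n"
    )
    mapping = collect_metadata(tmp)

print(sorted(mapping))
```

The resulting dictionary could then be printed into the mappings, and compared against what the user's rules request in order to spot fields that both sides write.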
References to inspire us
The bidscoin template file uses attributes inferred from headers as sources of bids information. It does not generalize, though, to more sources of information (miscellaneous metadata files). Sovabids currently works around this by returning information inferred from mne as if it were attributes of a header.
I partially introduced a way to operate on different fields from the path analysis in a feature called "operation". It is described in the Rules File Schema.
The ARTEM-IS extension could be implemented as a function in heuristics.py (see Supporting the ARTEM-IS standard #12).
The idea of extracting info from arbitrary files can be seen in @civier's original configuration file proposal:
Oren's Proposal
General configuration file format
Example of one row in the table: