-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-standard _part is used within src/99-appendices/06-meg-file-formats.md example #429
Comments
ping @teonbrooks I think he might know the origins / historical background. Agree that it should be clearly described and was perhaps included a bit too hastily. But MNE now uses this convention, so changing would not be as simple as editing the specification. |
in current mne-python a file called
'split_raw_meg.fif'
would be split in
'split_raw_part-01_meg.fif'
'split_raw_part-02_meg.fif'
we must support such split files as otherwise certain MEG datasets cannot
be shared.
what do you suggest?
… |
there's a section in the appendix that explains the use of part. it was decided back as a bep and before merged in the main spec: as @agramfort mentioned, not having it would break datasets, especially raw MEG files that were saved to back story is that |
We have a competing use for If this is just an accommodation of a file format limitation, then perhaps hacking the extension would make more sense... What about GIFTI and CIFTI-2 files have double-extensions, so it would not be unique to FIF, although it would be a BIDS innovation rather than something determined by the FIF standard itself. |
I would also vote for wrapping format limitations into the format file extension. Different formats (and tools) could support different ways on how they expect split files to be annotated/specified. is every such split .fif file is still a "legit" .fif file? then I would consider git-annex behavior of not including
But bids tools and validator should explicitly account for format-appropriate extension and allow for multiple files in such cases. Hopefully none of the compressors etc would use |
The files are "legit" on their own. You can separate them and they can be read in separately. However, the full recording contains both the files. The first file contains a pointer to the second file. |
Not sure what to suggest on the git-annex side. I am not knowledgable enough
|
I think the question RE git-annex (correct me if I'm wrong @yarikoptic) is whether a tool that sees Given that it's been fine with the But how would the electrophys community (or at least MNE-Python) feel about moving |
I thought the suffix has to correspond to the modality. wouldn't we need to |
Oh sorry, I meant extension, not suffix... |
But how would the electrophys community (or at least MNE-Python) feel
about moving part* to the suffix? Would this cause any significant problems?
everything is possible. We can do changes / deprecations etc.
can you give me an example of what would be better?
|
for reference, from the Elekta Neuromag Data Acquisition manual: If the recorded raw data file size becomes larger than a predefined value (in normal configuration 2GB minus 100MB), the data file is closed and a new continuation file is created. The exact size of the file can vary somewhat depending on how the internal blocks exceed the limit. When this kind of a split measurement is saved, the first segment is given the name which user writes into the saving dialog. Rest of the segments have the same name with a dash and a running number appended (numbering starts from one). So the Neuromag software itself generates files
According to https://mne.tools/dev/generated/mne.preprocessing.maxwell_filter.html this is also what the MNE software supports. An example of this with |
I am fine switching MNE to _meg-%d.fif convention although I though that
ending with _meg.fif was mandatory.
… |
The suggestion above was |
sure why not. Just make sure you don't rename the files directly as file
names are written in each fif file to follow to the next part.
… |
yeah you'd have to IO in MNE, MNE-BIDS + any other software which writes these files + datasets which already contain these files. And update the validator (although it doesn't seem to have checks for this at the moment). It's possible but is it really necessary? Perhaps better to do it earlier than later ... |
would this be part of a major/minor version release update to BIDS? if not, this pattern will break datasets with split files. I echo what @jasmainak says. it would be OK for just reasons, and a clear plan. let's try to mitigate breakage as much as possible. |
I hope the fix one way or another should be a part of the next release. Note that the Proper solution would be to either accept
AFAIK - not . See e.g. bids-standard/bids-validator#683 . If validation is of a concern, such files could simply be ignored via |
But this is a separate issue. The entity table was not updated when merging MEG/EEG and this was fixed only afterwards: #198. If documentation is an issue, we can fix this
No, but
What do you mean? How can you simply ignore split files in the standard? This is a large fraction of MEG recordings ... |
So you know what to do, but might need to reach out to beps mentioned above to see if harmony could be reached so it wouldn't become a stumbling point for them |
I worry a bit that this issue is getting larger than it needs to be - or than efficient for handling it. I re-read https://bids-specification.readthedocs.io/en/stable/99-appendices/06-meg-file-formats.html#neuromagelektamegin On the one hand (1) there is a piece of example documentation in the appendix which appears inconsistent with the actual spec, i.e. 1 could be resolved by removing the specific lines from the appendix. Note that this might not solve 2. But see (*) 2 needs to be resolved by looking at the wider ecosystem and involving (perhaps) the company, since we follow the manufacturers format specification and we want to ensure that files in BIDS are compatible with all software, not only MNE-Python but also Curry, BESA, FIeldTrip, BrainStorm, etc. Regarding (*): it is not clear to me in the current appendix why it states "In the case of data runs exceeding 2Gb, the data is stored in two separate files ..." (which appears consistent with ther manufacturers specification) and then later the same issue seems to be solved by specifying it as "Naming convention ...". Why is the latter part needed? Would (1) and (2) not both be resolved if we remove these (apparently stray) lines:
? |
I also re-read the neuromag elekta megin section and also stumbled over this part:
So it seems to me like both of these would be possible:
where the latter is recommended. Also, looking at the regexp in the validator for MEG: https://github.com/bids-standard/bids-validator/blob/1888b87268c8f4bff26e073a96191dccca137e60/bids-validator/bids_validator/rules/file_level_rules.json#L205 ... it seems like Overall I think that And closely look at new BEPs that want to make use of "part" ... and whether that makes sense at all. |
And to answer @robertoostenveld
So as I also conclude in #429 (comment), I think improving the docs and staying with |
With all of the above, I think the way forward is just to continue with #424 which would introduce |
Considering the NIDM Terms project as mentioned by @dbkeator in #423 , I think it would be unfavorable to use The current use for MEG is to separate different parts of a long recording (that due to file size limitations is in separate files). The proposed use in #424 would be for magnitude and phase, but simultaneously acquired and each not interpretable without the other. To me those are rather different. Just for brainstorming sake and to get a feel for this... Consider that we would do this for combined MEG/EEG recordings
Or combined recordings in iEEG
Or combined video (does not exist in BIDS yet, but might be added at one point)
Or combined eye-tracking
|
I generally agree with @robertoostenveld that a modality-sensitive meaning to With #460, it seems we're rejecting the proposal to move the multi-part FIF hack into the extension and instead making From the above discussion, I didn't get the impression that the consensus was that this was irrevocable, but a bit painful for some specific tooling. So I just want to clarify before we push forward and make it definitely a breach of backwards compatibility to change the meaning of |
Hi guys, as mentioned by @robertoostenveld we want to ensure that this proposal is supported by all software. Yet, my impression is that |
Well, in case of mne support in software happened before specification in bids. edit: replaced used by mistake #460 with #424. #460 is the clarification for MEG/EEG |
not really, the |
The effect of this error was to hide this consequential aspect of the BEP from all but the most diligent reviewers. BEP-001 has been around a long time, and had |
it was only in a single example without any description in the text etc. With such a presence I could call any typoed field to be a "part of the specification" ;-) |
sounds good, +1 to clarify |
BTW, I was thinking about those while looking at #460, but those seems to would be represented as a suffix re @jasmainak
clarification: in my comment |
So currently I see these three options:
Some of my thoughts:
|
Thank you @sappelhoff for the nice summary! Let me just comment on
AFAIK - most people wouldn't care enough since would not have concrete use cases. Only MEEG folks would, biased by "it is already how we do it".
I guess they could. Any suggestions on a "word"? only Re all the pointers to #423 you reference as problematic for |
thanks for the constructive discussion and good summary of viewpoints.
I believe this is more fundamental than "MEEG folks versus fMRI folks". Based on principles we cannot have the situation that extensions to the specification invalidate previous versions of the spec (unless we switch from 1.x to 2.x). Some people shouting louder, versus other people not (visibly) caring enough, would still not justify making such changes. We will have to live with the situation that certain BIDS "terms" (like entities) of the specification are claimed first by one domain, causing them not to be available in another domain. Reuse of entities/terms is nice and desirable, replacing their definition is incompatible, making their definition context-dependent is undesirable. That being said, since Regarding solution 1: I don't see a trivial way of recommending a different way to format split other than the current one (using Perhaps the best wisdom would be to go for option 3 and specify somewhere that |
|
There is also @effigies's suggestion to move if to the suffix #429 (comment)
|
I guess the use of |
If you move the |
But actually, before we decide on this:
If (based on "2") the Neuromag users/representatives are happy with using |
BTW I don't like the idea of coding metadata in the suffix, as in |
If we are considering the general case where files might be split for space reasons, and not a format specific hack, then I agree. As currently written, this is a way to accommodate FIF limitations, which seems appropriate to the extension, IMO. |
if we were to replace |
TL;DR -> let's fix this
I think it'd be best to have an entity that splits large files into several chunks and to use that entity not only for MEG, but for all kinds of files that need to be split. Practically though, the currently most used application for this would be FIFF files. IMHO entities to split across features may be considered separately from this "bug fix" that is the topic of this issue ( At a first glance though, I don't like "overloading" the entities -> taken to the extreme, the entities lose their original purpose of being a quick to grasp and intuitive explanation of the file content. If the entities can take on several meanings depending on contexts (e.g. split over time OR features) then that'd be a step towards the extreme I mentioned.
I trust that fieldtrip or MNE-Python folks have a good grasp of the situation, so I don't think this is necessary. |
In the case of renaming From the FieldTrip-perspective, I am the only one here in the discussion. However, MNE-Python is more used by Neuromag labs, so perhaps some MNE contributors can help to decide here. Let me ping @agramfort, @dengemann, @jasmainak, @larsoner for this. |
This issue #429 has now branched out across many repos, issues, and PRs. This post is intended as a short summary.
|
you're the best @sappelhoff :) |
@robertoostenveld I'm happy with the solution and the other MNE developers have also expressed their support to this proposal. |
Very glad to see the constructive community involvement on the intersections between BIDS and analysis software. That is where the wisdom of the crowd can be found! |
and nowhere described otherwise:
$> git grep _part src/99-appendices/06-meg-file-formats.md:sub-control01_ses-001_task-rest_run-01_part-01_meg.fif src/99-appendices/06-meg-file-formats.md:sub-control01_ses-001_task-rest_run-01_part-02_meg.fif
and it is to annotate two parts of the split longer file. In #371 I mention e.g. that
_part
was (or still is) suggested in multiple BEPs (BEP001 (multiple contrasts) and BEP004 (SWI)) but likely for a different purpose.I am not sure if we have ATM in BIDS any agreement on annotation of multiple portions (split arbitrarily in temporal dimension) of a larger thing, besides a
_run
which is different semantically. That is why I was not sure how to address this issue.attn: @bids-standard/derivatives-electrophys
The text was updated successfully, but these errors were encountered: