-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Need field for specifying session start time accuracy #321
Comments
@ajtritt @bendichter what do you think about this idea? |
I'm using something like
for NWBv1 currently. And I will use the same for NWBv2. The accuracy could be just implemented by only writing out the digits you trust or? |
That would work if accuracy aligned with the segments of an ISO8601 string, but not if you know that the recording happened around 9:30am +/- 10 minutes because your acquisition computer is not connected to the internet and the clock is off by about that much. |
@rly ISO8601 supports durations so you could say that the starting time was 9:20 to 9:40. But I'm actually not sure that this is a viable route as most date parsing tools only support dates and times and not durations. E.g. https://dateutil.readthedocs.io/en/stable/parser.html does not seem to support durations. So why not have a |
@t-b Ah, I see what you mean. Sorry for misreading. Pandas supports ISO8601 durations (link), and we already import pandas, so we could use that to parse the durations. Representing an inaccuracy range of 10 minutes as either "P0DT0H10M0S" or 600.0 would be fine with me. Although we do have |
Just to be explicit, the proposal on the table is now:
- name: session_start_time
dtype: isodatetime
doc: ...
attributes:
- name: uncertainty
dtype: isoduration
doc: The uncertainty of the session start time. The actual session start time may be within
plus or minus the uncertainty value from the reported session start time. For example, if
it is known only that the session start time was sometime on July 15, 2019 in UTC-7, the
session start time should be July 15, 2019 at noon in UTC-7 ("2019-07-15T12:00:00-07:00"),
and the uncertainty should be 12 hours ("P0DT12H0M0S").
required: false |
CC @ln-vidrio |
I feel like the natural solution to the session start time accuracy issue is to let users just supply a date or a time without ms. I think the accuracy thing is overkill and will rarely be used correctly. |
I would prefer to use |
@bendichter Yeah, I also think this would rarely be used correctly. 99.9% of cases will be either:
In both cases, I think most users would not bother to enter an uncertainty value, and if they don't know the time or the second, it is most likely not important to the study. Also, acquisition systems do not know the accuracy of their clocks, so any files they create will have unknown uncertainty. I like @bendichter and @t-b 's suggestion of just supporting datetimes with reduced accuracy. ISO 8601 allows it - "For reduced accuracy, any number of values may be dropped from any of the date and time representations". You can't encode uncertainty values of +/- 3 hours, but it will cover 99.9% of use cases. Unfortunately, all datetime parsers confound missing values with zero values. We could write our own modified datetime reader/writer to differentiate those cases, so the uncertainty is documented in the file (non read/write uses of the datetimes would use 0s for missing values), but that could get messy. Thoughts? Thanks for all the feedback! I consider this a very minor issue. I say: if there is a need from users, then let's move forward with one option or the other (I lean toward creating a modified datetime string reader/writer). In the meantime, I'll close the issue. |
Hi, Is it possible to make |
@rly I don't really see this as a minor issue, as it comes us for us all the time in different ways:
In the end, these requirements force us to enter data which is often wrong. It would be so much better and simpler to allow:
If using |
This sounds good to me. @oruebel any thoughts? |
@lawrence-mbf What do you think? Do you think it would be possible to relax this on the MatNWB side as well? |
The option to allow |
I get the logic of the principle of how it could be seen as required (experiment had to take place somewhere; if I think what we take more issue with is that requiring it means the APIs like PyNWB (IDK what MatNWB does) then auto-attach it to the The syntax for attaching a proper So overall it might just be better to loosen it to be an optional detail to include along with the other loosened resolutions of the |
Thanks for the clarification @CodyCBakerPhD, that makes sense. It sounds like independent of this schema issue, there is an issue in PyNWB in that we: a) zone information should not be automatically added and b) specifying date information should be simpler. For a) we'd probably need to add/update logic in the NWBFile class in PyNWB. For b) we could look at the ZoneInfo module that was added in Python 3.9 to see if that helps simplify the code. If we decide to allow "datetime without timezone" then I would still suggest that we look at enhancing PyNWB (NeuroConv, GUIDE etc.) to make it easier and encourage the use of proper timezone information. |
This should be doable on MatNWB side. |
The
NWBFile
datasetsession_start_time
is designed to represent the real-world start time of the session. Based on discussions with @oruebel, I see the value in this field is that you can tell what time of day a dataset was collected, if it occurred at the same time as an earthquake, etc.But, most systems are not going to store times totally accurately with respect to the world clock at atomic resolution. How do we represent the inaccuracy of the session start time? Most datasets supply precision down to the second. Some datasets, for one reason or another, only know what day that the dataset was collected and thus enter 00:00:00 for the time, e.g. for this DataJoint to NWB converter:
https://github.com/vathes/DJ-NWB-Li-Daie-2015-2016/blob/ae69335a23e6a3c0c3809a54de5480baf8848712/pipeline/export/datajoint_to_nwb.py#L23
@oruebel and I think we should add a float attribute on
session_start_time
calledaccuracy
that represents the +/- amount of time to which thesession_start_time
is estimated to be inaccurate. e.g., if the user thinks the value is accurate down to the second, then they should enter 0.5 for the accuracy. If they only know the day, then they should enter noon of that day as the start time and 43200 (12x60x60) for the accuracy. It would be optional, the default value would be np.nan, and we should encourage users to specify this value in tutorials and our best practices document.This is a minor issue, but we should address it, or else, the
session_start_time
field will not have a consistent usage and meaning across files.The text was updated successfully, but these errors were encountered: