-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simple proposal for buffer_id in rawio #1544
simple proposal for buffer_id in rawio #1544
Conversation
buffer_id = stream_name | ||
buffer_name = stream_name | ||
signal_buffers.append((buffer_name, buffer_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment the mapping buffer to stream is one to one.
But after this PR we will be able to split buffer into logical stream super easily.
In your example the idea would be that we move the sync channel into a new stream right? So it will keep the buffer but you'll assign a new stream ? |
Hey, Sam, what I had in mind last time we discussed was something like this that I think should be less concentrate the disruption in one PR and then we can work piece by piece on the formats: |
In this PR the idea to have no change IOs. Only adding this buffer_id stuff in a very neutral way. The next PR will have some changes for some IOs (plexon, spikeglx, ced) to split actual stream in more streams but with the same buffer. |
Your proposal is cool and easy but I think I prefer to be brave and explicitly add this features everywhere as a necessary implementation for rawios. |
Good. We covered in the discussion today:
We decide between two approaches:
I am weary of not going with the strict apporach but this will faciliate some work on Sam and I could not find a concrete example where this could cause problems. In such a case "practicallity should beat purity" and we will stick to this. |
I think what I would add to this so we don't forget that we didn't completely settle on a rigid buffer definition so that we have some flexibility as we implement it. Buffers ought to have same dtype, sampling rate, but something like same file vs same memmap was left in the air
Thanks for summary of the discussion @h-mayorquin! I think keeping our documentation up will help us when we go back and think about our choices! |
Thanks for this summary of internal discussion @h-mayorquin and @zm711. |
@zm711 @h-mayorquin this is ready on my side. |
Looks like raw binary signal wasn't done quite right (tests failing). Want to fix that? I'll look over this in parallel. |
@@ -56,6 +56,7 @@ | |||
BaseRawIO, | |||
_signal_channel_dtype, | |||
_signal_stream_dtype, | |||
_signal_buffer_dtype, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sneaky. could've had them alphabetized, but that would be too conventional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets leave a chance to the anarchy style
@@ -399,7 +401,10 @@ def _parse_header(self): | |||
ext_header.append(d) | |||
|
|||
if len(ext_header) > 0: | |||
signal_streams.append((f"nsx{nsx_nb}", str(nsx_nb))) | |||
buffer_id = stream_id = str(nsx_nb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be honest I don't know blackrock at all. I'll trust you on this one.
neo/rawio/examplerawio.py
Outdated
@@ -111,12 +112,18 @@ def _parse_header(self): | |||
# In short `_parse_header()` can be slow but | |||
# `_get_analogsignal_chunk()` needs to be as fast as possible | |||
|
|||
# create fake signal streams information | |||
# create fake signal streams and buffer information | |||
# a buffer is a group of channels that are in the same buffer for instance hdf5 or binary : this is optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we want to weird this differently. buffer is not optional in the developer sense because it has to be a value or ""
but is optional in the sense that you can give an ""
string. So maybe we make that clearer. Let me think about this while you fix the tests.
signal_streams["name"][stream_index] = name | ||
# zach I need you help here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to say why we don't count these as buffers.
So for header-attached it is because the data is set up in blocks so there is no continous signal
and for one-file-per-channel it is because the stream is split among files.
signal_streams = np.array([]) | ||
signal_streams = [] | ||
signal_streams = np.array(signal_streams, dtype=_signal_stream_dtype) | ||
# no buffer handling in this format because one channel per file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c'est parfait! That's a perfect explanation of why not.
neo/rawio/tdtrawio.py
Outdated
|
||
if missing_sev_channels: | ||
warnings.warn(f"Could not identify sev files for channels {missing_sev_channels}.") | ||
|
||
signal_streams = np.array(signal_streams, dtype=_signal_stream_dtype) | ||
signal_channels = np.array(signal_channels, dtype=_signal_channel_dtype) | ||
# buffer concept here, data are spread per channel and paquet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Qu'est-ce que c'est un paquet la? channel are files and the paquet is for folder? or packets of info meaning like internal blocks of data like data packets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is what I meant
I will take a look once the tests are passing. |
Now plexon2 tests are failing. Heberto has done a bunch of updates. He has another PR that we need to review (but I need to check if it is plexon1 or plexon2). Do we want to get his last PR done and then you can finalize this or do you want to finish this first. There are slightly organizational changes occurring that I think will keep leading to test failures. Up to you both from my perspective. |
I think with the latest boost to plexon2 once you fix conflicts and merge from main I hope the plexon2 won't give us any troubles. |
scanning through this it looks good. Once tests pass I'll do a final read through, so let's target to merge this by end of weekend. We can ping @h-mayorquin for a read through too, hopefully today if he has time? |
@@ -12,3 +12,7 @@ class TestNeuroNexusRawIO( | |||
rawioclass = NeuroNexusRawIO | |||
entities_to_download = ["neuronexus"] | |||
entities_to_test = ["neuronexus/allego_1/allego_2__uid0701-13-04-49.xdat.json"] | |||
|
|||
|
|||
if __name__ == "__main__": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hahahahah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me. There are a couple of comment touch ups and some clean up we should do as we push forward, but I agree it is too hard to keep up the parallel structure for so long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT. I added some comments that I think will be helpful for future developers.
I still think that including the buffer in the data signal_stream might lead to problems in the future and I would prefer the concepts to be decoupled (the schema to be normalized in the database sense and make the signal channel array/table the normalized table).
But, what is important for data extraction is that we can decouple the stream and the buffer and divide the streams "logically" and this PR allows us to do this:
Fine by me.
If Sam doesn't respond I'll package up your edits into a commit and push them, then once final tests pass we can merge it. Definitely down for another call at some point to discuss your desire for decoupling more ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding a couple doc fixes to be added with Heberto's
Co-authored-by: Heberto Mayorquin <[email protected]>
merci les amis |
A very simple proposal for spikeglxrawio
@h-mayorquin
@zm711