Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simple proposal for buffer_id in rawio #1544

Merged
merged 15 commits into from
Oct 11, 2024

Conversation

samuelgarcia
Copy link
Contributor

A very simple proposal for spikeglxrawio
@h-mayorquin
@zm711

Comment on lines +151 to +153
buffer_id = stream_name
buffer_name = stream_name
signal_buffers.append((buffer_name, buffer_id))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment the mapping buffer to stream is one to one.
But after this PR we will be able to split buffer into logical stream super easily.

@zm711
Copy link
Contributor

zm711 commented Sep 3, 2024

In your example the idea would be that we move the sync channel into a new stream right? So it will keep the buffer but you'll assign a new stream ?

@h-mayorquin
Copy link
Contributor

Hey, Sam, what I had in mind last time we discussed was something like this that I think should be less concentrate the disruption in one PR and then we can work piece by piece on the formats:

#1545

@samuelgarcia
Copy link
Contributor Author

In your example the idea would be that we move the sync channel into a new stream right? So it will keep the buffer but you'll assign a new stream ?

In this PR the idea to have no change IOs. Only adding this buffer_id stuff in a very neutral way.

The next PR will have some changes for some IOs (plexon, spikeglx, ced) to split actual stream in more streams but with the same buffer.

@samuelgarcia
Copy link
Contributor Author

Hey, Sam, what I had in mind last time we discussed was something like this that I think should be less concentrate the disruption in one PR and then we can work piece by piece on the formats:

Your proposal is cool and easy but I think I prefer to be brave and explicitly add this features everywhere as a necessary implementation for rawios.

@h-mayorquin
Copy link
Contributor

Good. We covered in the discussion today:
The buffer will be undefined if:

  1. The stream is divided across multiple files. Example intan's one-file-per-channel
  2. There is an API between neo and the data which obfuscates the data layout. Example plexon2

We decide between two approaches:

  • Have a strict version where stream and buffer are orthogonal concepts.
  • Have a "nested-unless-edge-case" approach where the stream headers contain a reference fo their buffer unless the buffer is hard to define (cases 1 and 2 above).

I am weary of not going with the strict apporach but this will faciliate some work on Sam and I could not find a concrete example where this could cause problems. In such a case "practicallity should beat purity" and we will stick to this.

@zm711
Copy link
Contributor

zm711 commented Sep 4, 2024

Good. We covered in the discussion today:
The buffer will be undefined if:

I think what I would add to this so we don't forget that we didn't completely settle on a rigid buffer definition so that we have some flexibility as we implement it. Buffers ought to have same dtype, sampling rate, but something like same file vs same memmap was left in the air

  • plexon 1 for example is one file, but multiple memmaps.
  • Sam mentioned thinking about intan header-attached which is one file, one memmap but with the block structure doesn't quite fit with what he was imagining for a buffer (when you hopped off @h-mayorquin).

Thanks for summary of the discussion @h-mayorquin! I think keeping our documentation up will help us when we go back and think about our choices!

@samuelgarcia
Copy link
Contributor Author

Thanks for this summary of internal discussion @h-mayorquin and @zm711.

@samuelgarcia
Copy link
Contributor Author

@zm711 @h-mayorquin this is ready on my side.

@zm711
Copy link
Contributor

zm711 commented Sep 5, 2024

Looks like raw binary signal wasn't done quite right (tests failing). Want to fix that? I'll look over this in parallel.

@@ -56,6 +56,7 @@
BaseRawIO,
_signal_channel_dtype,
_signal_stream_dtype,
_signal_buffer_dtype,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sneaky. could've had them alphabetized, but that would be too conventional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets leave a chance to the anarchy style

@@ -399,7 +401,10 @@ def _parse_header(self):
ext_header.append(d)

if len(ext_header) > 0:
signal_streams.append((f"nsx{nsx_nb}", str(nsx_nb)))
buffer_id = stream_id = str(nsx_nb)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be honest I don't know blackrock at all. I'll trust you on this one.

@@ -111,12 +112,18 @@ def _parse_header(self):
# In short `_parse_header()` can be slow but
# `_get_analogsignal_chunk()` needs to be as fast as possible

# create fake signal streams information
# create fake signal streams and buffer information
# a buffer is a group of channels that are in the same buffer for instance hdf5 or binary : this is optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we want to weird this differently. buffer is not optional in the developer sense because it has to be a value or "" but is optional in the sense that you can give an "" string. So maybe we make that clearer. Let me think about this while you fix the tests.

signal_streams["name"][stream_index] = name
# zach I need you help here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to say why we don't count these as buffers.

So for header-attached it is because the data is set up in blocks so there is no continous signal
and for one-file-per-channel it is because the stream is split among files.

signal_streams = np.array([])
signal_streams = []
signal_streams = np.array(signal_streams, dtype=_signal_stream_dtype)
# no buffer handling in this format because one channel per file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c'est parfait! That's a perfect explanation of why not.


if missing_sev_channels:
warnings.warn(f"Could not identify sev files for channels {missing_sev_channels}.")

signal_streams = np.array(signal_streams, dtype=_signal_stream_dtype)
signal_channels = np.array(signal_channels, dtype=_signal_channel_dtype)
# buffer concept here, data are spread per channel and paquet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Qu'est-ce que c'est un paquet la? channel are files and the paquet is for folder? or packets of info meaning like internal blocks of data like data packets?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is what I meant

@h-mayorquin
Copy link
Contributor

I will take a look once the tests are passing.

@zm711
Copy link
Contributor

zm711 commented Sep 9, 2024

Now plexon2 tests are failing. Heberto has done a bunch of updates. He has another PR that we need to review (but I need to check if it is plexon1 or plexon2). Do we want to get his last PR done and then you can finalize this or do you want to finish this first. There are slightly organizational changes occurring that I think will keep leading to test failures. Up to you both from my perspective.

@zm711
Copy link
Contributor

zm711 commented Sep 13, 2024

I think with the latest boost to plexon2 once you fix conflicts and merge from main I hope the plexon2 won't give us any troubles.

@samuelgarcia
Copy link
Contributor Author

@zm711 : it would be nice to merge this soon.
So I can continue working on #1513.

@zm711
Copy link
Contributor

zm711 commented Oct 11, 2024

scanning through this it looks good. Once tests pass I'll do a final read through, so let's target to merge this by end of weekend. We can ping @h-mayorquin for a read through too, hopefully today if he has time?

@@ -12,3 +12,7 @@ class TestNeuroNexusRawIO(
rawioclass = NeuroNexusRawIO
entities_to_download = ["neuronexus"]
entities_to_test = ["neuronexus/allego_1/allego_2__uid0701-13-04-49.xdat.json"]


if __name__ == "__main__":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahahahah

Copy link
Contributor

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for me. There are a couple of comment touch ups and some clean up we should do as we push forward, but I agree it is too hard to keep up the parallel structure for so long.

Copy link
Contributor

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT. I added some comments that I think will be helpful for future developers.

I still think that including the buffer in the data signal_stream might lead to problems in the future and I would prefer the concepts to be decoupled (the schema to be normalized in the database sense and make the signal channel array/table the normalized table).

But, what is important for data extraction is that we can decouple the stream and the buffer and divide the streams "logically" and this PR allows us to do this:

Fine by me.

neo/rawio/baserawio.py Outdated Show resolved Hide resolved
neo/rawio/edfrawio.py Show resolved Hide resolved
neo/rawio/baserawio.py Outdated Show resolved Hide resolved
@zm711
Copy link
Contributor

zm711 commented Oct 11, 2024

If Sam doesn't respond I'll package up your edits into a commit and push them, then once final tests pass we can merge it.

Definitely down for another call at some point to discuss your desire for decoupling more !

Copy link
Contributor

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a couple doc fixes to be added with Heberto's

neo/rawio/plexon2rawio/plexon2rawio.py Outdated Show resolved Hide resolved
neo/rawio/examplerawio.py Outdated Show resolved Hide resolved
neo/rawio/examplerawio.py Outdated Show resolved Hide resolved
neo/rawio/examplerawio.py Outdated Show resolved Hide resolved
@zm711 zm711 merged commit 45a363d into NeuralEnsemble:master Oct 11, 2024
3 checks passed
@samuelgarcia
Copy link
Contributor Author

merci les amis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants