Make memory mapped behavior match read_samples #60

Teque5 · 2024-05-30T18:55:44Z

When reading samples from signals, the current implementation is a bit quirky: slicing the memory map gives different results than read_samples whenever the samples in the file need to be scaled.

Consider the case where we read the sigmf logo from the main repository. This is a 2-channel real-valued audio file with samples stored as 16-bit integers.

>>> logo = sigmf.sigmffile.fromfile('sigmf_logo')

>>> logo.read_samples(count=3)
array([[-3.0517578e-05,  0.0000000e+00],
       [ 6.1035156e-05,  0.0000000e+00],
       [-6.1035156e-05,  0.0000000e+00]], dtype=float32)

>>> logo[0:3]
memmap([[-1,  0],
        [ 2,  0],
        [-2,  0]], dtype=int16)

This happens because read_samples applies the scale factor, but slicing the memory map returns the raw stored integers with no scaling applied.
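The mismatch can be reproduced without sigmf at all. The snippet below is only a sketch: it assumes the int16 scale factor is 1 / 2**15 (which is consistent with the values printed above), and the file name and the scale_int16 helper are hypothetical, not part of the library.

```python
import os
import tempfile

import numpy as np

# Stand-in for the first three 2-channel int16 samples shown above.
raw = np.array([[-1, 0], [2, 0], [-2, 0]], dtype=np.int16)


def scale_int16(samples):
    """Hypothetical helper: int16 -> float32, assuming a scale of 1 / 2**15."""
    return samples.astype(np.float32) / np.float32(2**15)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.sigmf-data")  # hypothetical file name
    raw.tofile(path)
    mapped = np.memmap(path, dtype=np.int16, mode="r").reshape(-1, 2)
    unscaled = np.array(mapped[0:3])   # what slicing the memory map yields
    scaled = scale_int16(mapped[0:3])  # what a scaled read yields
    del mapped  # release the mapping before the directory is removed

print(unscaled.dtype, scaled.dtype)
```

Here unscaled keeps the stored int16 values while scaled holds float32 values, which is the same split seen between logo[0:3] and logo.read_samples(count=3).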

I'm not sure what the best solution is, but I think we should fix #15 at the same time, since it will require tinkering with the same code.

Solutions I propose:

1. Leave as-is.
2. When accessing the memory map of a file that requires scaling, return a copy of the data instead (probably by using read_samples).
3. When accessing a memory map, return a scale parameter along with the data? Or maybe emit a warning?
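To make the second and third options concrete, here is a rough sketch. SampleView, its scale argument, and copy_when_scaled are all hypothetical names made up for illustration; this is not how sigmf is implemented, and the 1 / 2**15 scale is an assumption.

```python
import warnings

import numpy as np


class SampleView:
    """Hypothetical wrapper (not part of sigmf) illustrating options 2 and 3."""

    def __init__(self, data, scale=None, copy_when_scaled=True):
        self._data = data    # raw samples; an np.memmap in practice
        self._scale = scale  # None means the file needs no scaling
        self._copy_when_scaled = copy_when_scaled

    def __getitem__(self, key):
        raw = self._data[key]
        if self._scale is None:
            return raw  # zero-copy view; nothing to reconcile
        if self._copy_when_scaled:
            # Option 2: give up the zero-copy view and return scaled floats.
            return np.asarray(raw, dtype=np.float32) * np.float32(self._scale)
        # Option 3: keep the raw view, but tell the caller about the scale.
        warnings.warn(
            f"samples are unscaled; multiply by {self._scale} to match read_samples",
            stacklevel=2,
        )
        return raw


raw = np.array([[-1, 0], [2, 0], [-2, 0]], dtype=np.int16)
as_copy = SampleView(raw, scale=1 / 2**15)[0:3]
print(as_copy.dtype)
```

The trade-off is that option 2 makes slicing agree with read_samples at the cost of a copy, while option 3 keeps the memory map cheap but leaves the scaling to the caller.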

Fixing #15 I believe requires using the offset kwarg of np.memmap.
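For reference, the offset kwarg of np.memmap is a byte count to skip at the start of the file. A minimal sketch follows; the 8-byte header and the file name are invented purely for the example and say nothing about what #15 actually needs.

```python
import os
import tempfile

import numpy as np

HEADER_BYTES = 8  # hypothetical number of leading bytes to skip
payload = np.arange(6, dtype=np.int16)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "with_header.bin")  # hypothetical file name
    with open(path, "wb") as handle:
        handle.write(b"\x00" * HEADER_BYTES)  # bytes that are not samples
        handle.write(payload.tobytes())       # the actual int16 samples
    # offset is given in bytes, not in samples.
    mapped = np.memmap(path, dtype=np.int16, mode="r", offset=HEADER_BYTES)
    skipped = np.array(mapped)  # copy out so the mapping can be released
    del mapped

print(skipped)
```

When no shape is passed, np.memmap sizes the array from the bytes remaining after the offset, so skipped contains only the six payload samples.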


liambeguin · 2024-06-06T23:28:32Z

Hi @Teque5, I've run into the same kind of problem with sigmf archives... I was hoping #42 was going to fix this, but no..

On my end the problem is that functions like read_samples_in_capture() assume that we have a data_file to access to run things like os.path.getsize(). IMO it would be really nice to rework/consolidate SigMFFile.__init__, set_data_file(), and _read_datafile() to process user inputs (either a file, a buffer, any other type, ...) into a single internal representation of the data (maybe _memmap?). Then, each accessor can use that single "representation" and return whatever is needed.
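A rough sketch of what that single internal representation could look like. The as_sample_array helper and the set of accepted input types are hypothetical; nothing here reflects the current SigMFFile code.

```python
import io
import os

import numpy as np


def as_sample_array(source, dtype=np.int16):
    """Hypothetical helper: turn several input kinds into one array-like."""
    if isinstance(source, np.ndarray):
        return source  # already in the internal form (includes np.memmap)
    if isinstance(source, (str, os.PathLike)):
        return np.memmap(source, dtype=dtype, mode="r")  # file on disk
    if isinstance(source, io.BytesIO):
        return np.frombuffer(source.getbuffer(), dtype=dtype)  # in-memory file
    if isinstance(source, (bytes, bytearray, memoryview)):
        return np.frombuffer(source, dtype=dtype)  # raw buffer
    raise TypeError(f"unsupported data source: {type(source).__name__}")


samples = np.array([1, 2, 3], dtype=np.int16)
from_bytes = as_sample_array(samples.tobytes())
from_stream = as_sample_array(io.BytesIO(samples.tobytes()))

# Accessors can ask the array for its size instead of calling os.path.getsize().
print(len(from_bytes), from_stream.nbytes)
```

With every source reduced to one array-like, accessors such as read_samples_in_capture() could derive sizes from the array itself rather than from a path on disk.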

This might also help support loading a non-conforming dataset?
Let me know what you think, I don't have a lot of time to spare on this, but I could try to help out.

Teque5 added the bug label on May 30, 2024
