_read_datafile() calls file close() on every partial dataset read #53

csylvain · 2024-03-01T16:20:30Z

Line 679 in c5d194d

fp.close()

I have been working on a project that runs wonderfully on a laptop with 16GB of RAM but has to be simplified if it is ever to run successfully on a Raspberry Pi Zero 2 W with 512MB of RAM.

I see that when a dataset is read incrementally (reducing the memory footprint) rather than all at once (a non-issue on a laptop with gobs of memory), each call of read_samples() in turn calls _read_datafile(), which performs an open(), a seek(), and a close().

Perhaps the file management should be promoted to read_samples() and the parent SigMFFile class? Then the dataset reads could be performed with a single open() and one concluding close(), with any number of seek() and read() calls in between.
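As a rough sketch of the idea (this is not sigmf-python code; the class `ChunkedSampleReader`, its `read_samples()` method, and the 4-byte float32 sample layout are all hypothetical, chosen only for illustration), a reader could hold one file handle open for the lifetime of a `with` block and only seek() and read() per chunk:

```python
import os
import struct
import tempfile


class ChunkedSampleReader:
    """Hypothetical helper, not part of sigmf-python.

    Keeps a single file handle open across any number of partial reads.
    """

    def __init__(self, path, sample_size):
        self.path = path
        self.sample_size = sample_size  # bytes per sample
        self._fp = None

    def __enter__(self):
        self._fp = open(self.path, "rb")  # the single open()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._fp.close()  # the single concluding close()
        self._fp = None
        return False

    def read_samples(self, start_index, count):
        # Only seek() and read() happen per chunk.
        self._fp.seek(start_index * self.sample_size)
        return self._fp.read(count * self.sample_size)


def demo():
    # Stand-in dataset: 8 little-endian float32 values in a temporary file.
    values = [float(i) for i in range(8)]
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(struct.pack("<8f", *values))
        path = tmp.name
    try:
        chunks = []
        with ChunkedSampleReader(path, sample_size=4) as reader:
            for start in range(0, 8, 2):  # four partial reads, one open/close
                raw = reader.read_samples(start, 2)
                chunks.extend(struct.unpack("<2f", raw))
        return chunks
    finally:
        os.remove(path)
```

Using a context manager here is one possible design; it ties the lifetime of the handle to a clearly scoped block, so the concluding close() still happens if a read raises.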

This enhancement suggestion would also make possible the use of mmap() and thus place the system memory management burden on the OS rather than the Python runtime.
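For illustration only, here is a minimal sketch of partial reads through Python's standard `mmap` module. The helper `read_chunk_mmap()` and the float32 test data are assumptions made for this example and are not taken from sigmf-python:

```python
import mmap
import os
import struct
import tempfile


def read_chunk_mmap(mm, start_index, count, sample_size=4):
    """Return the raw bytes for `count` samples starting at `start_index`."""
    begin = start_index * sample_size
    return mm[begin:begin + count * sample_size]


def demo_mmap():
    # Stand-in dataset: 8 little-endian float32 values in a temporary file.
    values = [float(i) for i in range(8)]
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(struct.pack("<8f", *values))
        path = tmp.name
    try:
        with open(path, "rb") as fp:
            # Length 0 maps the whole file; the OS pages data in on demand.
            with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                raw = read_chunk_mmap(mm, start_index=4, count=3)
                return list(struct.unpack("<3f", raw))
    finally:
        os.remove(path)
```

Slicing the mapping copies only the requested bytes into Python, so the resident footprint is governed by the chunk size and the OS page cache rather than by the size of the dataset.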

gmabey · 2024-03-01T16:44:35Z

SigMFArchiveReader already uses mmap(), check it out. There definitely are gaps in the implementation of it and SigMFFile, contributions welcome. Glen


csylvain · 2024-03-01T20:59:49Z

I hadn't looked at SigMFArchiveReader because my dataset is just one file of pre-computed IQ samples. Thanks for calling my attention to it.

I see that archive reading uses NumPy's memmap(). From the NumPy documentation: "NumPy’s memmap’s are array-like objects. This differs from Python’s mmap module, which uses file-like objects." (See the NumPy and Python documentation.)
However, read_samples() uses _read_datafile(), which looks like it is using a file-like mechanism.
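To make the array-like behaviour concrete, here is a small sketch using `numpy.memmap` on a flat file of complex64 samples. The data and the function name `demo_memmap()` are made up for this example, and it does not reproduce how SigMFArchiveReader is actually implemented:

```python
import os
import tempfile

import numpy as np


def demo_memmap():
    # Stand-in dataset: 8 complex64 IQ samples written to a temporary file.
    samples = (np.arange(8) + 1j * np.arange(8)).astype(np.complex64)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    samples.tofile(path)
    try:
        # The memmap is indexed like an ndarray; no explicit seek() or read().
        mapped = np.memmap(path, dtype=np.complex64, mode="r")
        chunk = np.array(mapped[2:5])  # copy three samples out of the mapping
        del mapped  # drop the mapping before deleting the file
        return chunk
    finally:
        os.remove(path)
```

With the file-like `mmap` object the caller slices bytes and decodes them, whereas the memmap already carries the dtype, so a slice yields typed samples directly.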

I can already report that with the existing read_samples() implementation, partial reads work smoothly on the 512MB RAM device, whereas an all-at-once read suffers from random TX underruns.
