-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --unbuffered option to sbp2json.py #1400
Conversation
Quality Gate passed for 'libsbp-c'Kudos, no new issues were introduced! 0 New issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a test of a very small message that just lived in buffer land so that it returns when --unbuffered is passed but not without that flag.
Also... I commented in line but I am unsure about this strategy of flushing the buffer by pointing it to a slice in a binary string. I don't know if the earlier memory will actually be freed that way. Do you have some explanation of python that can assuage me there or have some test to prove the buf that isn't part of the slice is garbage collected?
From what I can find online (e.g. the first answer to https://stackoverflow.com/questions/64871329/does-string-slicing-perform-copy-in-memory and https://discuss.python.org/t/string-slicing-in-python/25651) it seems that making a slice makes a copy of the original PyObject (except for the the special case where the slice corresponds to the entire string). So the |
To be clear: running with or without the |
TIL. I suppose that is why the other implementation uses memoryview. I do think using memory view should be possible here and you can see you are allocating several times per message, do you think this affects performance in a measureable way |
I ran the following tests a few times and consistently got results like this:
The pattern which I observed was as follows:
This makes sense to me, given that:
But at the end of the day this is all immaterial since previous profiling has shown that sbp2json.py spends the vast majority of its time in the call to Also, I do not think that a memoryview is relevant here since they can only be used to represent fixed sized objects (per the comment "Resizing is not allowed" in https://docs.python.org/dev/library/stdtypes.html#typememoryview). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks for walking me through the PR!
Description
Add a new
--unbuffered
command line option to sbp2json.py which enables a mode where it reads and parses messages one-at-a-time. The existing behaviour (which remains the default) waits for 4096 bytes before parsing, which means that when decoding real-time streams which only output messages intermittently, there are often large pauses where no output is produced, followed by a large batch of decoded messages at once.The rust version of sbp2json does not have this issue. However, there are various use cases (e.g. embedded platforms) where it is difficult (if not impossible) to build the rust tools, but the python version works just fine.
API compatibility
Does this change introduce a API compatibility risk?
No. The previous behaviour remains the default.
API compatibility plan
N/A
JIRA Reference
https://swift-nav.atlassian.net/browse/BOARD-XXXX