-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-117151: optimize BufferedWriter(), do not buffer writes that are the buffer size #118037
Conversation
… are equal to the buffer size. avoid extra memory copy. BufferedWriter() was buffering calls that are the exact same size as the buffer. it's a very common case to read/write in blocks of the exact buffer size. it's pointless to copy a full buffer, it's costing extra memory copy and the full buffer will have to be written in the next call anyway.
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Regarding the CLA, the CLA was signed by my organization yesterday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
It can potentially affect the performance of BufferedRandom in rare cases, but I think that they are too rare, and it worst cases it only affects performance, not the behavior. If it affects behavior, then there are other existing bugs in the code.
How much difference in speed is caused by this PR? If it is more than say 10%, it makes sense to add a short NEWS entry. |
It's single percent difference, it might not be news worthy. |
I've reconfirmed the CLA with both emails. is it okay now? |
Thank you. |
gh-117151: optimize BufferedWriter(), do not buffer writes that are the buffer size
Hello,
I am having a look at the buffering code as part of the linked discussion, #gh-117151
One thing that was puzzling me is finding the buffering code slightly slower than the non-buffering code, given the same conditions, so I started having a deeper look into it.
Could I get a second pair of eyes on this old piece of code?
BufferedWriter()
is buffering calls that are the exact same size as the buffer.It's a very common case to read/write in blocks of the exact buffer size.
I think this might actually be the primary use case, reading in chunks from disk/network and writing to disk.
I think it shouldn't buffer in that case. It's costing an extra memory copy and the buffer will be full and have to be written in the next call anyway.
Thoughts?