io: increase MAX_BUF
from 16384
to 2MiB
#5397
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
While working on Spacedrive, we were noticing some oddities during file reading in regards to encryption and decryption.
We use a stream reader where 1MiB of the source file is read, encrypted, written and this repeats until the end. We arguably need to use
read
for this, as opposed toread_exact
, as we need to know the exact amount of bytes read and guarantee that if the full 1MiB isn't read, the buffer still contains valid data.I noticed some oddities while calling
read
, and that it was only reading 16KB at a time. This would have been fine, but this results in 64read
calls until the 1MiB buffer is populated.Solution
The 64 read calls to populate a 1MiB buffer turned out to be extremely slow. By increasing the buffer size to 2MiB, we were able to bring encryption time down from 24~ seconds to 8.7~ seconds.
I have done some testing regarding the 2MiB value. 1MiB seemed to be a little too low, and gave us a time-to-encrypt of 15~ seconds. 4MiB gave us 8.7~ seconds also, but 2MiB should be cheaper and more universal.
This value could likely be configurable somewhere, but I'm not too sure how the API would work there.
On another note, 2MiB may be too large for some platforms and I'm not too sure if this is allocated on the stack or heap. In the event it's stack allocated, we may be better off increasing the default to 32/64KB and allowing it to be configurable.