Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate buffers more efficiently #2

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sporksmith
Copy link

@sporksmith sporksmith commented May 5, 2021

  • Allocate buffer with desired capacity instead of appending one byte at
    a time, which can result in repeated re-allocations.
  • Don't initialize buffer to zeros when we're going to overwrite the
    contents anyway.

Experimentally, these changes make a very substantial performance
improvement in Debug builds, though little noticeable difference in
Release builds. e.g. in an experiment with my application that uses
this crate for logging, these changes bring total run time down from
8.64s to 2.45s in Debug builds.

* Allocate buffer with desired capacity instead of appending one byte at
  a time, which can result in repeated re-allocations.
* Don't initialize buffer to zeros when we're going to overwrite the
  contents anyway.

Experimentally, these changes make a very substantial performance
improvement in Debug builds, though little noticeable difference in
Release builds. e.g. in my application that uses this crate for logging,
these changes bring total run time down from 8.64s to 2.45s in Debug
builds.
@sporksmith
Copy link
Author

Friendly ping - any chance of this getting merged?

Despite that we ultimately want to drop the NULL byte, we need to
include space for it in the buffer provided to libc's vsnprintf;
otherwise it may replace the last byte of the string with a NULL byte.

This also simplifies things by providing a zero-length buffer for the
first call to vsnprintf, and then allocating exactly the required space.
This gives only one code path to maintain (as opposed to 2-3 before
depending if the initial buffer was less than, equal to, or greater than
the initial size). While this means paying to format the string twice,
it lets us drop `shrink_to_fit` of the final string, which can be a
relatively expensive call to the allocator.

Fixes shadow/shadow#1432
vsnprintf *should* return the exact number of characters that will be
printed with the given arguments. In Shadow, I've seen at least one
instance of it wanting more space on the second call, though. This could
be a bug in Shadow (maybe another thread mutating one of the arguments
in between calls, but is difficult to debug.

Probably better to be robust here and just reallocate if needed than to
crash.
Copy link

@couchand couchand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strategy may be reasonable, however the implementation as written is not sound.

I'm not the maintainer, but I would also recommend separating out the cargo fmt changes into a separate commit or PR to make review easier.

if character_count >= buffer.len() {
let new_len = character_count + 1;
buffer.reserve_exact(new_len - buffer.len());
// SAFETY: Any bit pattern is a valid u8, and we reserved the space.
Copy link

@couchand couchand Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sound, despite what the comment claims. It is not sufficient for any bit pattern to be a valid u8, as uninitialized memory is not even guaranteed to be a fixed bit pattern. This is UB, which can be verified by trying this example with Miri: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=4fad0134c08d9f8a6272ce3c77d2c3f9

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks for pointing this out. I learned this is incorrect later.

Happy to drop this part if I can get in the fix to avoid repeated allocations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants