Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible issue shrinking BitBox #55

Closed
kulp opened this issue Mar 27, 2020 · 7 comments
Closed

Possible issue shrinking BitBox #55

kulp opened this issue Mar 27, 2020 · 7 comments

Comments

@kulp
Copy link

kulp commented Mar 27, 2020

First, thank you for bitvec.

I found a strange problem (freeing of unallocated memory) when shrinking a BitVec to a BitBox, but I can reproduce it only on macos. I have a hunch the problem may not be in bitvec at all, but possibly in something Rust-specific. Any guidance will be appreciated.

I created a reproducible minimal testcase in kulp/bitvec-debugging. Github Actions shows failure on macos, whereas Ubuntu succeeds (the Windows job tends to be canceled because it does not finish before macos fails). On my Mac OS 10.14.6 machine, test failure (with bitvec 0.17.3) looks like this :

$ cargo +1.42.0 test
    Finished test [unoptimized + debuginfo] target(s) in 1.64s
     Running target/debug/deps/debug_bitbox-776d3300cd651276

running 1 test
debug_bitbox-776d3300cd651276(79649,0x700005796000) malloc: *** error for object 0x7fb526800000: pointer being freed was not allocated
debug_bitbox-776d3300cd651276(79649,0x700005796000) malloc: *** set a breakpoint in malloc_error_break to debug
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/Users/kulp/Documents/Code/Rust/debug_bitbox/target/debug/deps/debug_bitbox-776d3300cd651276` (signal: 6, SIGABRT: process abort signal)

A backtrace from malloc_error_break shows alloc::raw_vec::shrink_to_fitoccurring during core::ptr::drop_in_place. The dynamically-nearest bitvec code is this :

frame #9: 0x000000010000579e debug_bitbox-776d3300cd651276`bitvec::boxed::ops::_$LT$impl$u20$core..ops..drop..Drop$u20$for$u20$bitvec..boxed..BitBox$LT$O$C$T$GT$$GT$::drop::ha5df38b3a7bb05c2(self=0x000070000dd2e430) at ops.rs:129:7
   126 			let ptr = self.as_mut_slice().as_mut_ptr();
   127 			let len = self.as_slice().len();
   128 			//  Run the `Box<[T]>` destructor.
-> 129 			drop(unsafe { Vec::from_raw_parts(ptr, 0, len) }.into_boxed_slice());
   130 		}
   131 	}
   132

I wrote a script that demonstrates the problem exists across a broad swathe of Rust versions and bitvec versions. All x.0 versions of Rust from 1.36.0 through 1.42.0 and all x.0 versions of bitvec from 0.11.0 through 0.17.0 (and 0.17.3 as well) demonstrate the issue.

For the real code in which this issue was first discovered, my only workaround is to switch to Linux, which is possible for the short term but rather worrying.

Do you think it plausible that this is a Rust compiler or library issue ? if so, can you suggest a plan of attack for chasing it down further, or reporting it elsewhere ?

@myrrlyn
Copy link
Collaborator

myrrlyn commented Mar 27, 2020

That's a really interesting cutoff number; I wonder what's up with that.

Anyway, it's my bug and I know exactly why it happened. Can you check if current master, commit c5587f0, fixes it?

@kulp
Copy link
Author

kulp commented Mar 27, 2020

That commit does not appear to have resolved the issue :

$ cargo test
  Downloaded funty v1.0.1
   Compiling funty v1.0.1
   Compiling bitvec v0.18.0 (https://github.com/myrrlyn/bitvec.git#c5587f0c)
   Compiling debug_bitbox v0.1.0 (/Users/kulp/Documents/Code/Rust/debug_bitbox)
    Finished test [unoptimized + debuginfo] target(s) in 18.34s
     Running target/debug/deps/debug_bitbox-f5adc7e290da82aa

running 1 test
debug_bitbox-f5adc7e290da82aa(80083,0x7000068bf000) malloc: *** error for object 0x7fdfb2800000: pointer being freed was not allocated
debug_bitbox-f5adc7e290da82aa(80083,0x7000068bf000) malloc: *** set a breakpoint in malloc_error_break to debug
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/Users/kulp/Documents/Code/Rust/debug_bitbox/target/debug/deps/debug_bitbox-f5adc7e290da82aa` (signal: 6, SIGABRT: process abort signal)

@myrrlyn
Copy link
Collaborator

myrrlyn commented Mar 27, 2020

Neat. I really wish I knew how to run ASAN, or to have a way to be sure I understand what the Vec -> Box<[]> path looks like. Anyway, if this commit doesn't fix it, I'm out of ideas personally and will have to ask someone more familiar with the allocator.

@kulp
Copy link
Author

kulp commented Mar 27, 2020

From a sample size of one, I declare the problem wholly eliminated :

$ cargo test
   Compiling bitvec v0.18.0 (https://github.com/myrrlyn/bitvec.git#3ffe9510)
   Compiling debug_bitbox v0.1.0 (/Users/kulp/Documents/Code/Rust/debug_bitbox)
    Finished test [unoptimized + debuginfo] target(s) in 13.15s
     Running target/debug/deps/debug_bitbox-f5adc7e290da82aa

running 1 test
test debug_bitbox ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests debug_bitbox

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Is there a short version of why you think this works, or at least why the old code was wrong ?

Thank you. I would close the issue, but perhaps you have your own process to follow.

@myrrlyn
Copy link
Collaborator

myrrlyn commented Mar 27, 2020

At first, I thought that my construction of a Vec in BitBox::drop was incorrect, because a zero-length Vec probably frees its allocation during Vec::into_boxed_slice, and I thought maybe a zero-length boxed slice was causing a double-free. When that commit didn't fix it, I realized that such a double-free would have been spotted way before now.

The second commit addresses a different fault: the allocator is not required to preserve the address when shrinking an allocation! The macOS allocator probably has an inclusive limit at 8064 words for its bucket size; the 8065-word vector, when shrinking, was moved to a smaller allocation and the large allocation was released back to the OS.

Line 655 used the original address, not the result of Vec::into_boxed_slice.

If you unwind a commit, I bet attempting to manipulate the BitBox will get you a use-after-free violation.

I'll close this when I release the 0.17.4 hotfix in about half an hour.

Thank you for the report and the investigation!

@kulp
Copy link
Author

kulp commented Mar 27, 2020

My current lack of nomicon-level Rust knowledge prevents me from completely assimilating the details of your explanation, but it remains plausible and soothing. Thanks for the quick turn-around.

@kulp
Copy link
Author

kulp commented Mar 28, 2020

I learned how to run Miri, and ran it on 0.17.3 out of interest. For me, Miri finds the problem only if at least 65 bits are reserved (probably because u64 is the underlying BitStore), but that is better than 8065.

More significantly, though, Miri finds the issue on my Linux virtual machine as well, where the issue had not previously manifested. Perhaps Miri could prove useful for vetting the whole of bitvec.

dingelish pushed a commit to mesalock-linux/bitvec-sgx that referenced this issue Apr 13, 2020
@myrrlyn myrrlyn closed this as completed Jun 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants