Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

oom_instead_of_bump_pointer_overflow fails on Debian armhf #128

Open
plugwash opened this issue Oct 30, 2021 · 6 comments
Open

oom_instead_of_bump_pointer_overflow fails on Debian armhf #128

plugwash opened this issue Oct 30, 2021 · 6 comments

Comments

@plugwash
Copy link
Contributor

I have discovered that in a Debian armhf environment the oom_instead_of_bump_pointer_overflow test fails. This was initially noticed on Debian QA infrastructure but I have also tested it locally on both 32-bit and 64-bit kernels and with both the Debian packaged version of bumpalo and the upstream git. All failed.

To troubleshoot the issue I added an extra print statement if the allocation unexpectedly succeeds.

     bump.alloc_layout(layout);
+    eprintln!("alloc_layout succeed unexpectly: p={} size={}", p, size);
 }

And got the following

32-bit kernel:
alloc_layout succeed unexpectly: p=3056602143 size=1238365153

64-bit kernel:
alloc_layout succeed unexpectly: p=4136635439 size=158331857

I think this is a case of a wrong assumption in the test. Specifically if the allocator choses to make the initial allocation near the top of the address space (which it is perfectly entitled to do) then the follow up allocation may be relatively small and may succeed. But I'd like an opinion from someone more familiar with the code before I go ahead and treat this as a broken test.

@fitzgen
Copy link
Owner

fitzgen commented Nov 1, 2021

Yes, this is an incorrect test.

It is true that this allocation will always cause the bump pointer to overflow inside try_alloc_layout_fast, but then we return None there and in try_alloc_layout we fallback to alloc_layout_slow which will allocate a new chunk and this can succeed.

I think we should

  • Change this test to allow either panic on OOM or successful allocation. Either is a valid behavior here, and we want to keep explicitly checking the code path where we overflow the bump pointer.
  • Add another test that explicitly allocates usize::MAX or something like that, and asserts that we get an OOM panic.

@plugwash
Copy link
Contributor Author

plugwash commented Nov 1, 2021

Yes, this is an incorrect test.

Thanks for confirming

It is true that this allocation will always cause the bump pointer to overflow inside

i'm not convinced that is true, the calculation in the test seems to assume that bumpalo will make allocations by incrementing the bump pointer, but it seems that allocations are actually made by decrementing the bump pointer.

@fitzgen
Copy link
Owner

fitzgen commented Nov 1, 2021

Ah good catch! This test is from before we made the increment-to-decrement switch and should also be updated for that.

@plugwash
Copy link
Contributor Author

plugwash commented Nov 1, 2021

I guess the remaining question is how should the test distinguish between an uncaught bump pointer overflow (bad) and a successful follow-up allocation (good).

And I think the answer to that is by inspecting the returned pointer and seeing if it is a feasible pointer for the allocation size in question.

@fitzgen
Copy link
Owner

fitzgen commented Nov 2, 2021

I don't think we need to do anything specific to detect/inspect the returned pointer. Instead, we just need to make sure we write a byte per page in the resulting allocation. We run tests under valgrind in CI and that (or even regular virtual memory protections) should catch any issues if we try to write out of bounds.

This is much easier to get 100% correct than "does this look like a valid pointer?" heuristics.

@plugwash
Copy link
Contributor Author

plugwash commented Nov 7, 2021

Writing a byte per page is not a good idea because the allocation may be HUGE and Linux overcommits memory, so the outcome of doing that may well be that your tests get killed by the OOM killer.

If you don't want to go down the route of checking the returned pointer for sanity then I think a better soloution may be to test "try_alloc_layout_fast" directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants