Don't expose guard pages to malloc_stack API consumers #54591

Merged
merged 1 commit into JuliaLang:master from the mh/guard_pages branch
Jun 9, 2024

Conversation

fingolfin
Contributor

Whether or not a guard page is in effect is an implementation detail and consumers of the malloc_stack API should not have to worry about that. In particular, if a stack of a certain size is requested, a stack of that size should be delivered, and not be reduced on some systems because we park a guard page in that range.

This also helps consumers of the gcext API that implement stack scanning (i.e., GAP.jl), as they no longer have to worry about running into those guard pages.
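To illustrate the idea, here is a minimal sketch in C of an allocator that keeps the guard page outside the range it reports to its caller. This is not Julia's actual `malloc_stack`; the names `alloc_stack_hiding_guard` and `free_stack_hiding_guard` are hypothetical, and the sketch assumes a POSIX system with `mmap`/`mprotect` and a stack that grows downward (so the guard page sits at the lowest address).

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, NOT Julia's malloc_stack.
 * On entry *size is the requested usable size; on success *size is the
 * page-rounded usable size, and the returned pointer is the lowest usable
 * address. The guard page lies just below it and is never reported. */
static void *alloc_stack_hiding_guard(size_t *size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    /* Round the request up to a whole number of pages. */
    size_t usable = (*size + page - 1) & ~(page - 1);
    if (usable == 0)
        usable = page;
    /* Reserve one extra page for the guard, on top of the request. */
    size_t total = usable + page;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    /* Make the lowest page inaccessible so an overflow faults. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    *size = usable;
    return base + page; /* caller only ever sees the usable range */
}

/* Release a stack obtained from alloc_stack_hiding_guard, including
 * the hidden guard page below it. */
static void free_stack_hiding_guard(void *stk, size_t size)
{
    assert(stk != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    munmap((char *)stk - page, size + page);
}
```

With this shape, a stack scanner that walks `[stk, stk + size)` never touches the protected page, and the caller gets at least the number of bytes it asked for.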

Ultimately it would be fantastic if this could be backported to 1.11 and perhaps even 1.10, as it helps me avoid a real-world bug (I verified that this works; the cherry-picking needs some manual work, but it's easy and I can provide it). At the same time, I can see that people might be worried about such a change in a 1.10.x release. So if a backport to one or both of these is not deemed desirable, I could provide an alternative patch for those which only modifies jl_active_task_stack to deal with the guard pages. Since that is part of gcext and not used by Julia itself, that would be less risky. In fact, that was exactly what my first implementation did, but I eventually realized that instead of working around a leaky abstraction, it would probably be better to fix the abstraction.

CC @benlorenz who may have additional insights

(This PR overlaps very slightly with #54568 so merging either one will require the other to be adjusted; of course I am happy to take care of that).

@fingolfin fingolfin requested a review from vtjnash May 28, 2024 07:39
@fingolfin fingolfin added the backport 1.11 Change should be backported to release-1.11 label May 31, 2024
@KristofferC KristofferC mentioned this pull request Jun 4, 2024
60 tasks
Member

@Keno Keno left a comment

Makes sense to me

@fingolfin fingolfin merged commit 5dfd57d into JuliaLang:master Jun 9, 2024
8 checks passed
@fingolfin fingolfin deleted the mh/guard_pages branch June 9, 2024 22:45
KristofferC pushed a commit that referenced this pull request Jun 13, 2024
(cherry picked from commit 5dfd57d)
KristofferC added a commit that referenced this pull request Jun 25, 2024
Backported PRs:
- [x] #54361 <!-- [LBT] Upgrade to v5.9.0 -->
- [x] #54474 <!-- Unalias source from dest in copytrito -->
- [x] #54548 <!-- Fixes for bitcast bugs with LLVM 17 / opaque pointers
-->
- [x] #54191 <!-- make `AbstractPipe` public -->
- [x] #53402 <!-- Add `jl_getaffinity` and `jl_setaffinity` -->
- [x] #53356 <!-- Rename at-scriptdir project argument to at-script and
search upwards for Project.toml -->
- [x] #54545 <!-- typeintersect: fix incorrect innervar handling under
circular env -->
- [x] #54586 <!-- Set storage class of julia globals to dllimport on
windows to avoid auto-import weirdness. Forward port of #54572 -->
- [x] #54587 <!-- Accomodate for rectangular matrices in `copytrito!`
-->
- [x] #54617 <!-- CLI: Use `GetModuleHandleExW` to locate libjulia.dll
-->
- [x] #54605 <!-- Allow libquadmath to also fail as it is not available
on all systems -->
- [x] #54634 <!-- Fix trampoline assembly for build on clang 18 on apple
silicon -->
- [x] #54635 <!-- Aggressive constprop in trevc! to stabilize triangular
eigvec -->
- [x] #54645 <!-- ensure we set the right value to gc_first_tid -->
- [x] #54554 <!-- make elsize public -->
- [x] #54648 <!-- Construct LazyString in error paths for tridiag -->
- [x] #54658 <!-- fix missing uuid check on extension when finding the
location of an extension -->
- [x] #54594 <!-- Switch to Pkg mode prompt immediately and load Pkg in
the background -->
- [x] #54669 <!-- Improve error message in inplace transpose -->
- [x] #54671 <!-- Add boundscheck in bindingkey_eq to avoid OOB access
due to data race -->
- [x] #54672 <!-- make: Fix `sed` command for LLVM libraries with no
symbol versioning -->
- [x] #54624 <!-- more precise aliasing checks for SubArray -->
- [x] #54679 <!-- 🤖 [master] Bump the Distributed stdlib from 6a07d98 to
6c7cdb5 -->
- [x] #54604 <!-- Fix tbaa annotation on union selector bytes inside of
structs -->
- [x] #54690 <!-- Fix assertion/crash when optimizing function with dead
basic block -->
- [x] #54704 <!-- LazyString in reinterpretarray error messages -->
- [x] #54718 <!-- fix prepend StackOverflow issue -->
- [x] #54674 <!-- Reimplement dummy pkg prompt as standard prompt -->
- [x] #54737 <!-- LazyString in interpolated error messages involving
types -->
- [x] #54642 <!-- Document GenericMemory and AtomicMemory -->
- [x] #54713 <!-- make: use `readelf` for LLVM symbol version detection
-->
- [x] #54760 <!-- REPL: improve prompt! async function handler -->
- [x] #54606 <!-- fix double-counting and non-deterministic results in
`summarysize` -->
- [x] #54759 <!-- REPL: Fully populate the dummy Pkg prompt -->
- [x] #54702 <!-- lowering: Recognize argument destructuring inside
macro hygiene -->
- [x] #54678 <!-- Don't let setglobal! implicitly create bindings -->
- [x] #54730 <!-- Fix uuidkey of exts in fast path of `require_stdlib`
-->
- [x] #54765 <!-- Handle no-postdominator case in finalizer pass -->
- [x] #54591 <!-- Don't expose guard pages to malloc_stack API consumers
-->
- [x] #54755 <!-- [TOML] remove Dates hack, replace with explicit usage
-->
- [x] #54721 <!-- add try/catch around scheduler to reset sleep state
-->
- [x] #54631 <!-- Avoid concatenating LazyString in setindex! for
triangular matrices -->
- [x] #54322 <!-- effects: add new `@consistent_overlay` macro -->
- [x] #54785
- [x] #54865
- [x] #54815
- [x] #54795
- [x] #54779
- [x] #54837 

Contains multiple commits, manual intervention needed:
- [ ] #52694 <!-- Reinstate similar for AbstractQ for backward
compatibility -->
- [ ] #54649 <!-- Less restrictive copyto! signature for triangular
matrices -->

Non-merged PRs with backport label:
- [ ] #54779 <!-- make recommendation clearer on manifest version
mismatch -->
- [ ] #54739 <!-- finish implementation of upgradable stdlibs -->
- [ ] #54738 <!-- serialization: fix relocatability bug -->
- [ ] #54574 <!-- Make ScopedValues public -->
- [ ] #54457 <!-- Make `String(::Memory)` copy -->
- [ ] #53957 <!-- tweak how filtering is done for what packages should
be precompiled -->
- [ ] #53452 <!-- RFC: allow Tuple{Union{}}, returning Union{} -->
- [ ] #53286 <!-- Raise an error when using `include_dependency` with
non-existent file or directory -->
- [ ] #51479 <!-- prevent code loading from lookin in the versioned
environment when building Julia -->
@KristofferC KristofferC removed the backport 1.11 Change should be backported to release-1.11 label Jun 25, 2024
vtjnash added a commit that referenced this pull request Aug 21, 2024
vtjnash added a commit that referenced this pull request Aug 29, 2024
Reverts #54591

This caused the runtime to misbehave and crash, since all of the
consumers of this information in the runtime assumed that the guard
pages are accounted for correctly as part of the reserved allocation.
Nothing in the runtime ever promised that it is valid to access the
pages beyond the current redzone (indeed, ASAN would forbid it as well).
@giordano giordano added the reverted This PR has since been reverted label Aug 29, 2024
KristofferC pushed a commit that referenced this pull request Sep 9, 2024
KristofferC pushed a commit that referenced this pull request Sep 12, 2024
Reverts #54591