Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-add Stack Overflow handling in NativeAOT with larger alternate stack #95808

Closed
wants to merge 10 commits into from

Conversation

jtschuster
Copy link
Member

@jtschuster jtschuster commented Dec 8, 2023

It looks like the issue with the previous handling of stack overflow was that the alternate stack was too small. The guard page was included in the size of the stack when it should have been added to the size of the stack. This allocates an additional page for the alternate stack and seems to fix the issue for me.

I wasn't able to reproduce the issues in local builds, but I downloaded the exes from the failing CI tests and they segfaulted every 10 runs or so. After editing the binary to allocate more for the signal stack, I didn't see any more segfaults.

It looked like the issue was that we would hit a segfault and enter the SIGSEGV handler right as a GC pause was sending SIG34. Control would go to the GC signal handler first and would mess with the stack (I couldn't figure out exactly what it did though) and when control returned to the SIGSEGV handler, the first pushq in the method prelude caused another segfault and the program crashed.

@ghost
Copy link

ghost commented Dec 8, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Want to test this in CI with a larger alternate stack

Author: jtschuster
Assignees: jtschuster
Labels:

area-NativeAOT-coreclr

Milestone: -

@jtschuster
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jtschuster
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jtschuster
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jtschuster
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jtschuster jtschuster marked this pull request as ready for review December 14, 2023 04:42
@janvorli janvorli closed this Jan 2, 2024
@janvorli janvorli reopened this Jan 2, 2024
@jtschuster
Copy link
Member Author

@MichalStrehovsky @jkotas @janvorli @VSadov Could you take a look when you have a chance?

@janvorli
Copy link
Member

I think that we cannot just run the hardware exception handler on the alternate stack like we do now. In coreclr, we switch off the alternate stack as soon as we figure out it is not a stack overflow. Here we don't, so if the catch handler ends up calling a deep call chain, we would get stack overflow.

@jtschuster jtschuster marked this pull request as draft January 13, 2024 02:07
@ghost ghost closed this Feb 12, 2024
@ghost
Copy link

ghost commented Feb 12, 2024

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 13, 2024
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants