-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
null-check the redirect context before using. #65910
Conversation
} | ||
|
||
// We may not have a preallocated context. Could be short on memory when we tried to preallocate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The crash was hit in the CI by a lot of tests. It sounds very unlikely we would be running out of memory that often.
Isn't the actual problem a race condition somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern as well. The situation should not be fatal, but should not be common either. These tests do not look like something that should run close to OOM.
I am not sure if this was always happening and handled or it is something new.
I could not reproduce this on purpose. I've spent quite some time trying. Maybe having fewer resources on lab machines is a factor.
I think the fix makes sense one way or another, but why we need it more than I'd expect is still a question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unable to reproduce this so far. Did we see this on anything other than Windows7/x86?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have seen it on Win7/x86 only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call InitializeContext
with 0 for context buffer size to get the actual buffer size. The expected result is to get ERROR_INSUFFICIENT_BUFFER
and to get the size required.
Rarely, for unknown reasons we get a different error. We have disabled an assert because of this.
runtime/src/coreclr/vm/threadsuspend.cpp
Line 1986 in 1857d04
// The following assert is valid, but gets triggered in some Win7 runs with no impact on functionality. |
I wonder if in such case we also get invaid size (like 0) and thus the allocation fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec apparently mentions that:
If the function fails with an error other than ERROR_INSUFFICIENT_BUFFER, the contents of ContextLength are undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just do not know what can possibly go wrong with a call like InitializeContext(NULL, context, NULL, &contextSize);
- that is the same use as in the example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not even try to allocate when InitializeContext
returns unexpected results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps that is the root cause, but the reason why the call may fail on Win7/x86 is still unclear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks!! |
* null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results.
* null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results.
…#66120) * Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
* Use CopyContext to restore saved context on X86 (#65490) * Use CopyContext to restore saved context on X86 * PR feedback * more PR feedback * Do not copy XState other than AVX when redirecting for GC stress (#65825) * Do not copy XState other than AVX * #if defined(TARGET_X86) || defined(TARGET_AMD64) * mask XState unconditionally * Ensure XSTATE_MASK_AVX is set before calling EEGetThreadContext * redundant supportsAVX, more clear comment * PR feedback * null-check the redirect context before using. (#65910) * null-check the redirect context before using. * tweak the comment * do not allocate context if InitializeContext has unexpected results. * EE Suspension on x86 should use RtlRestoreContext when available (#65878) * RestoreContextSimulated * probe for RtlRestoreContext * ntdll.dll * restore self-trap sequence * PR feedback * Clarify CopyContext in RedirectedHandledJITCaseExceptionFilter * simpler indentation. * restore last error on the legacy path. * Update src/coreclr/vm/threads.h Co-authored-by: Dan Moseley <[email protected]> Co-authored-by: Vladimir Sadov <[email protected]> Co-authored-by: Dan Moseley <[email protected]>
Fixes: #65890
In rare cases (typically on x86) the redirect context can be
NULL
. It is supposed to be preallocated before redirecting a thread, but we could be short on memory at the time when we tried to allocate.The condition should not be fatal. It should just cause the current redirection attempt to fail and cause a retry.
On the next attempt we may be able to allocate or stop the thread via other mechanisms.
We used to indirectly rely on
EEGetThreadContext
to fail when the context isNULL
. Now the first use isSetXStateFeaturesMask
which crashes with NRE.We should just check
pCtx
forNULL
before using it.