Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix x86 stackwalk for UnmanagedCallersOnly #61075

Closed

Conversation

janvorli
Copy link
Member

@janvorli janvorli commented Nov 1, 2021

Yet another similar issue to the recently fixed x86 EH for methods
marked with UnmanagedCallersOnly, but called from a managed method using
a function pointer. The stack walker was asserting in
VerifyValidTransitionFromManagedCode. The REGDISPLAY values that it used
for checking the code location were not set for this case, so EIP and ESP
were both set to 0.
This change fixes it by adding UpdateRegDisplay method to
UMThkCallFrame.

Close #60648
Close #62728

Yet another similar issue to the recently fixed x86 EH for methods
marked with UnmanagedCallersOnly, but called from a managed method using
a function pointer. The stack walker was asserting in
VerifyValidTransitionFromManagedCode. The REGDISPLAY values that it used
for checking the code location were not set for this case, so EIP and ESP
were both set to 0.
This change fixes it by adding UpdateRegDisplay method to
UMThkCallFrame.
@janvorli janvorli added this to the 7.0.0 milestone Nov 1, 2021
@janvorli janvorli requested a review from jkotas November 1, 2021 23:25
@janvorli janvorli self-assigned this Nov 1, 2021
@jkotas
Copy link
Member

jkotas commented Nov 2, 2021

Is this a problem from class ComMethodFrame as well?

@janvorli
Copy link
Member Author

janvorli commented Nov 2, 2021

If it is possible to call an UnmanagedCallersOnly method via COM from a managed method so that there are no intermediate native frames, then it would probably suffer from the same issue. @AaronRobinsonMSFT can such a case exist?

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Nov 2, 2021

If it is possible to call an UnmanagedCallersOnly method via COM from a managed method so that there are no intermediate native frames, then it would probably suffer from the same issue. @AaronRobinsonMSFT can such a case exist?

@janvorli How about an example of what I think you are saying. Is the following the question? If so, this is completely reasonable code.

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

unsafe class Program
{
    struct Vtable
    {
        public delegate* unmanaged[Stdcall]<Inst*, int> Method1;
    }
    
    struct Inst
    {
        public Vtable* vptr;
    }
    
    [UnmanagedCallersOnly(CallConvs = new[] {typeof(CallConvStdcall)})]
    static int MethodImpl(Inst* p) => throw new Exception();
    
    static void Main()
    {
        var vptr = (Vtable*)RuntimeHelpers.AllocateTypeAssociatedMemory(typeof(Program), sizeof(Vtable));
        vptr->Method1 = &MethodImpl;
        var inst = new Inst() {vptr = vptr};
        
        // Imagine inst was ambiguous - could be a real COM object
        // or a managed COM wrapper we don't know about.
        inst.vptr->Method1(&inst);
    }
}

@janvorli
Copy link
Member Author

janvorli commented Nov 2, 2021

@AaronRobinsonMSFT my question was about real COM calls that use ComMethodFrame. I assume your example doesn't end up using that frame, does it?
I'll try to rephrase my question a bit. The issue that I am fixing here occurs only when a managed method directly calls another managed method marked with UnmanagedCallersOnly and it does that via a function pointer. Then there is no native frame between the caller and callee frames. @jkotas asked me if it is possible that the same case (no native frame between the caller and callee frames) happens when a managed method directly calls another managed method via COM. That would mean that the COMMethodFrame would be created inside of the managed caller.

@jkotas
Copy link
Member

jkotas commented Nov 2, 2021

Hmmm, I am actually wondering how the fix in UMThkCallFrame is related to UnmanagedCallersOnly methods. UMThkCallFrame should not be involved at all in UnmanagedCallersOnly methods. UMThkCallFrame should be only involved in marshaled reverse-PInvoke delegates.

What is the exact set of conditions that triggers the bug?

@jkotas
Copy link
Member

jkotas commented Nov 2, 2021

Here is the callstack that triggers the problem:

...
coreclr!WKS::GCHeap::WaitUntilGCComplete+0x4f [D:\a\_work\1\s\src\coreclr\gc\gcee.cpp @ 287] 
...
coreclr!JitILStub+0x9e [D:\a\_work\1\s\src\coreclr\vm\dllimport.cpp @ 5598] 
coreclr!UMThunkMarshInfo::RunTimeInit+0xd2 [D:\a\_work\1\s\src\coreclr\vm\dllimportcallback.cpp @ 480] 
coreclr!UMEntryThunk::RunTimeInit+0xab [D:\a\_work\1\s\src\coreclr\vm\dllimportcallback.h @ 182] 
coreclr!UMEntryThunk::DoRunTimeInit+0xde [D:\a\_work\1\s\src\coreclr\vm\dllimportcallback.cpp @ 282] 
0x3233087 // UMEntryPrestub
Microsoft_Interop_Tests_NativeExports!NativeExports.DelegatesAndFunctionPointers.InvokeCallbackAfterGCCollect(Void ())+0x6d
...

As expected, there is no UnmanagedCallersOnly method. It is a prestub for marshaled delegate that triggers the problem.

I think we will get into the same situation with COM. UMEntryPrestub is going to be GenericComCallStub, UMThkCallFrame is going to be ComMethodFrame, and the rest of the structure is going to be similar.

@janvorli
Copy link
Member Author

janvorli commented Nov 2, 2021

I am sorry for confusing the explanation. But I believe the UnmanagedCallersOnly is what's triggering the problem. The InvokeCallbackAfterGCCollect is marked with this attribute and the method it calls is invoked via an unmanaged function pointer. When the function pointer is called, the UMEntryPrestub is invoked, the UMTkhCallFrame is created in that prestub and we go down the way of jitting a related IL stub.
That's the point where the stack walk walks the stack and hits the problem. It gets to the UMTkhCallFrame and then unwinds to the caller, which is a managed method. The fact that it is a managed method triggers the issue.

@janvorli
Copy link
Member Author

janvorli commented Nov 4, 2021

@jkotas actually, let me correct myself. After sorting this in my head again, I have found I've confused it in my previous comment and I was right in my original description. The issue is really caused by the fact that we call a managed function marked using the UnmanagedCallersOnly via a native function pointer. If the target of the pointer was a native method, there would be no UMThkCallFrame. And it doesn't matter whether the managed caller of the function pointer is marked with UnmanagedCallersOnly or not.
But I think you are right with the ComMethodFrame having the same problem.

#undef CALLEE_SAVED_REGISTER

pRD->ControlPC = *PTR_PCODE(pRD->PCTAddr);
UINT cbStackPop = GetMethod()->CbStackPop();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am reading the code correctly, GetMethod is going to return the managed method that the UMThunk or ComMethodFrame is wrapping.

Is it the managed method to ask for CbStackPop? I suspect we need to CbStackPop for the unmanaged signature with unmanaged view for the arguments here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, you are right, I have forgotten that the unmanaged view can be different.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Do we have a test for this? If you can offer a simple strawman example the Interop team can help with adding the appropriate testing assets.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronRobinsonMSFT to test this fully, we would need:

  • A COM test that exposes a managed method via COM and calls it from other managed method
  • That method would need to have arguments passed on stack
  • The unmanaged signature would need to differ from the managed one in the size of arguments passed on stack.

It would be great if someone from the interop team can help me with creating that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronRobinsonMSFT I also have a question about getting the unmanaged signature for COM method. Would the right one be the one returned by the ComCallMethodDesc::GetCallMethodDesc?

@mangod9
Copy link
Member

mangod9 commented Jan 10, 2022

@janvorli is this ok to merge?

@janvorli
Copy link
Member Author

@mangod9 no, as @jkotas pointed out above, the MethodDescs that I was using for the signatures need to reflect the caller ones, not the callee ones that I am using there now. I need to figure out where to get the proper ones yet.

@AaronRobinsonMSFT
Copy link
Member

@mangod9 This is sort of blocked on me. I am investigating now so I can provide the clarity for @janvorli.

@mangod9
Copy link
Member

mangod9 commented Jan 10, 2022

ok, thanks for the update.

@janvorli
Copy link
Member Author

@jkotas @AaronRobinsonMSFT helped me to look more into the ComMethodFrame. There is only a narrow case of COM implementation where this frame is used and we have found that it cannot occur in case the target is marked with UnmanagedCallersOnly. So I am going to revert the ComMethodFrame related part.

@janvorli janvorli force-pushed the fix-x86-stackwalk-unmanagedcallersonly branch from 38bf7d3 to 4e50993 Compare January 12, 2022 20:30
It would be difficult and useless to attempt to correctly handle the
case when the caller is a native code. Getting the stack arguments
size for the caller seems to be difficult, since in some cases,
we reach the UpdateRegDisplay before the reverse pinvoke IL stub and
even its MethodDesc are created.

The case that we care about is simpler, as the caller side and callee
side signatures are the same, so we can extract the stack arguments size
using the callee MethodDesc.
@janvorli
Copy link
Member Author

@jkotas I was trying to find a way to get the callee side signature for getting the correct stack arguments size. I have realized that for our problematic case, the caller and callee view of the signatures are the same and we don't need to use the UpdateRegDisplay for the other cases. It would be difficult to attempt to correctly handle the case when the caller
is a native code. Getting the stack arguments size for the caller in that case seems to be difficult, since in some cases,
we reach the UpdateRegDisplay before the reverse pinvoke IL stub and even its MethodDesc are created. And the callee
side MethodDesc is determined by quite an involved code.

So I have modified my change to let the UpdateRegDisplay do its work only for the case when the caller is a managed method.

@jkotas
Copy link
Member

jkotas commented Jan 15, 2022

the caller and callee view of the signatures are the same

It is just a pure accident that the callback has delegate void VoidVoid() signature and that the managed and unmanaged CbStackPop is zero for both cases.

If the signature of the callback was delegate void VoidInt(int x), the managed and unmanaged CbStackPop values would not match. The CbStackPop computed by the code in this PR would be 0, but the correct CbStackPop to use would be 4 - assuming the callback has stdcall calling convention.

Also, I am puzzled by the check for managed caller.

Getting the stack arguments size for the caller in that case seems to be difficult, since in some cases,
we reach the UpdateRegDisplay before the reverse pinvoke IL stub

Yes, I agree. I think we would have to compute ahead of time in LoadTimeInit to make this work.

@janvorli
Copy link
Member Author

If the signature of the callback was delegate void VoidInt(int x), the managed and unmanaged CbStackPop values would not match. The CbStackPop computed by the code in this PR would be 0, but the correct CbStackPop to use would be 4 - assuming the callback has stdcall calling convention.

When a managed caller calls a managed method marked by the UnmanagedCallersOnly attribute using an unmanaged function pointer, I don't see how the signatures and calling conventions could be different.

Also, I am puzzled by the check for managed caller.

That's the only case when the problem can occur. The stack walker walks to the UMThkCallFrame and before my change, it worked only if the caller was a native code in such a case, so the ProcessIp(adr) doesn't end up setting m_crawl.isFrameless.

@jkotas
Copy link
Member

jkotas commented Jan 15, 2022

When a managed caller calls a managed method marked by the UnmanagedCallersOnly attribute using an unmanaged function pointer, I don't see how the signatures and calling conventions could be different.

MethodDesc::CbStackPop does not return the correct amount for methods marked with UnmanagedCallersOnly attribute. MethodDesc::CbStackPop assumes managed calling convention. It does not look at the unmanaged method calling convention.

@jkotas
Copy link
Member

jkotas commented Jan 15, 2022

The stack walker walks to the UMThkCallFrame

UMThkCallFrame is only used for marshaled delegates. It is not used for plain vanilla UnmanagedCallersOnly methods. Delayed initialization of plain vanilla UnmanagedCallersOnly methods goes through PreStubWorker_Preemptive that does not have any Frame.

That makes me think: Can this problem be fixed by deleting UMThkCallFrame?
It is only used on x86 and only for marshaled delegates. Deleting UMThkCallFrame would switch the marshaled delegates to the same plan as plain vanilla UnmanagedCallersOnly methods that seem to work just fine.

@jkotas
Copy link
Member

jkotas commented Jan 15, 2022

Can this problem be fixed by deleting UMThkCallFrame?

#63826 . Let's see what the CI says about it.

@janvorli
Copy link
Member Author

That's great! The CI seems to be happy with it.

@jkotas
Copy link
Member

jkotas commented Jan 17, 2022

Replaced by #63826

@jkotas jkotas closed this Jan 17, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Feb 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
5 participants