VB default ByVal marshalling behavior is actually VBByRefStr and it truncates MBCS input #9944
Comments
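Updated repro in C#:
using System;
using System.Runtime.InteropServices;

public class Program
{
    // Dumps the UTF-16 code units of the string as hex bytes.
    static void PrintBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        foreach (var b in bytes)
        {
            Console.Write($"{b:x} ");
        }
        Console.WriteLine();
    }

    // EntryPoint is set on the assumption that the native export is the
    // Test function shown below; the managed name stays Test2.
    [DllImport(@"NativeLibrary.dll", EntryPoint = "Test")]
    extern static void Test2([MarshalAs(UnmanagedType.VBByRefStr)] ref string ff);

    static void Main(string[] args)
    {
        string ff = "あいうえお";
        PrintBytes(ff);   // bytes before the call
        Test2(ref ff);    // native side does nothing, yet the string changes
        PrintBytes(ff);   // bytes after the round trip
    }
}
Native side:
extern "C" void Test(char* s)
{
    return;
}
Bad output: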
The underlying issue here isn't length but the lossy nature of character encoding conversions.
There are options to fix this, but I am dubious there is much value in fixing this issue after so many years.
@jeffschwMSFT @jkoritzinsky @elinor-fung Any thoughts here? My preference would be to simply update the documentation about the nuances in VB.NET and mention MBCS strings so we can point people at something. I believe this kind of change should be pushed into the source generator approach for P/Invokes since it has so little impact given the number of reported issues.
Now that the LibraryImport generator (i.e., source generated Support for marshalling customized types can be found in the new API described at #66121.
VB defaults ByVal marshalling to VBByRefStr, which will truncate inputs that are MBCS. Here is the repro:
Suppose a DLL contains a function that takes a char* argument, as below.
If you call this function via P/Invoke with multi-byte characters as below, the argument is corrupted.
Instead of using the declaration above, any of the following avoids the corruption.
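As a hedged illustration only, here is a C# sketch of two declarations that avoid the VBByRefStr round trip. These are assumptions and not necessarily the alternatives the reporter listed: the library name, the `Test` entry point, and the class and method names `SaferDeclarations`, `TestIn`, and `TestInOut` are hypothetical.

```csharp
using System.Runtime.InteropServices;
using System.Text;

public static class SaferDeclarations
{
    // Input-only: the string is converted to ANSI for the call and
    // nothing is copied back, so no length assumption is involved.
    [DllImport("NativeLibrary.dll", EntryPoint = "Test")]
    public static extern void TestIn([MarshalAs(UnmanagedType.LPStr)] string s);

    // In/out: StringBuilder hands the native side a writable buffer, and
    // the result is read back up to the null terminator.
    [DllImport("NativeLibrary.dll", EntryPoint = "Test", CharSet = CharSet.Ansi)]
    public static extern void TestInOut(StringBuilder s);
}
```

Which form fits depends on whether the native function writes to the buffer. If it only reads the string, the input-only declaration is the simpler choice.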
Here is the analysis:
ByValStr marshaling is conceptually flawed when it comes to multi-byte characters, such as Japanese and Chinese. It marshals the string out to native code correctly: it allocates a large enough buffer and marshals the string as ANSI characters. However, when it marshals the string back into managed code, it assumes the native string has the same length as the managed string, which is 5 in this case. As a result, only part of the string is read and converted back into managed, and you see a truncation in the middle of the string. This is a bug in the CLR.
Refer to 270066 for more details. Moving to .NET Core due to the potentially breaking nature of this request.
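The length mismatch described here can be imitated in pure managed code, without any native library. The sketch below is not the CLR's marshaling code. It encodes the string to a multi-byte encoding, then decodes only as many bytes as the managed string has characters. UTF-8 is used as a stand-in for the ANSI code page so the example needs no extra encoding provider; that substitution and the names `TruncationSketch` and `RoundTrip` are assumptions made for illustration.

```csharp
using System;
using System.Text;

public static class TruncationSketch
{
    // Encodes the string, then reads back only managed.Length bytes,
    // mimicking a marshaler that confuses character count with byte count.
    public static string RoundTrip(string managed, Encoding nativeEncoding)
    {
        byte[] nativeBytes = nativeEncoding.GetBytes(managed);
        int assumedLength = Math.Min(managed.Length, nativeBytes.Length);
        return nativeEncoding.GetString(nativeBytes, 0, assumedLength);
    }

    public static void Main()
    {
        string input = "あいうえお";
        string output = RoundTrip(input, Encoding.UTF8);

        // The managed length is 5 characters, but the encoded form is longer.
        Console.WriteLine($"managed chars: {input.Length}");
        Console.WriteLine($"encoded bytes: {Encoding.UTF8.GetByteCount(input)}");
        Console.WriteLine($"round trip preserved the string: {output == input}");
    }
}
```

Single-byte input such as "abc" survives this round trip unchanged, which is consistent with the problem only showing up for MBCS input.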