Sub-optimal codegen with C# switch statement handling for some simple cases
#10634
Comments
That's a side effect of using |
Also, it looks like the presence of |
Yep, IL |
I've tried with |
Inlining methods with switches is not supported by the default inlining policy, but other policies allow it, so some level of support is plumbed through. It would be simple enough to also enable it by default, perhaps core-only for now. Given how conservative the default inline policy is, it would also likely require aggressive inlining attribution, though very small switches might make it past the profitability screen. Just need to split the Would also be nice to track when args/constants reach switches and give a boost to the profitability metrics. |
Related question - is it somehow possible to switch to another inlining policy through, say, app.config or the like? Would be nice for my hi-perf code. |
IMO that's perfectly reasonable as switches can be relatively large. What's not reasonable is that methods containing switches do not inline even if aggressive inlining is specified, that seems like a completely artificial limitation.
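To make the complaint concrete, here is a minimal sketch of the kind of method being discussed: a tiny switch explicitly marked for aggressive inlining and called with a constant. The names `SwitchInlineDemo`, `BytesPerElement` and `SizeOfFourInt32s` are hypothetical and not taken from the issue. Going by the comments above, the JIT at the time of this thread would still refuse to inline such a method because it contains a switch; whether a given runtime version does so is an assumption to verify with a JIT disassembly.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class SwitchInlineDemo
{
    // Hypothetical example method: four contiguous cases, so the C# compiler
    // is expected to emit an IL switch instruction rather than a chain of ifs.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int BytesPerElement(int kind)
    {
        switch (kind)
        {
            case 0: return 1;
            case 1: return 2;
            case 2: return 4;
            case 3: return 8;
            default: return 0;
        }
    }

    // The caller passes a constant. If the callee were inlined, the whole
    // switch could fold away and this method would reduce to "return 16".
    public static int SizeOfFourInt32s() => 4 * BytesPerElement(2);
}
```

Calling `SwitchInlineDemo.SizeOfFourInt32s()` returns 16 either way; the only difference is whether the generated code still contains a call and a jump table.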
I did a quick test - I simply removed the observation. This resulted in one method being inlined in corelib -
cmp esi, 3
ja SHORT G_M61114_IG11
mov eax, esi
lea rdx, [reloc @RWD00]
mov edx, dword ptr [rdx+4*rax]
lea rcx, G_M61114_IG02
add rdx, rcx
jmp rdx
G_M61114_IG08:
mov eax, dword ptr [rdi+4]
shr eax, 16
movzx rax, ax
test eax, eax
setne al
movzx rax, al
test eax, eax
jne SHORT G_M61114_IG12
jmp SHORT G_M61114_IG11
G_M61114_IG09:
cmp word ptr [rdi+4], 0
setne al
movzx rax, al
test eax, eax
jne SHORT G_M61114_IG12
jmp SHORT G_M61114_IG11
G_M61114_IG10:
cmp word ptr [rdi+4], 1
seta al
movzx rax, al
test eax, eax
jne SHORT G_M61114_IG12
G_M61114_IG11: |
@voinokin nothing officially supported, no. You can experiment with an alternate policy I created called the There are other policies available if you do a custom build of the jit, but they are mainly there to enable stress testing or experimental studies. |
@mikedn Roslyn's switch handling aims for well-packed jump tables; I think it will always pick a switch over ifs when it can get three+ contiguous entries. I experimented with a number of different configurations trying to see if I could find a way to write this that the JIT would inline without telling it to be aggressive:
public static bool TryParseBaseline(ReadOnlySpan<byte> source, out int value, out int bytesConsumed, char standardFormat = default)
{
switch (standardFormat)
{
case default(char):
case 'g':
case 'G':
case 'd':
case 'D':
return TryParseInt32D(source, out value, out bytesConsumed);
case 'n':
case 'N':
return TryParseInt32N(source, out value, out bytesConsumed);
case 'x':
case 'X':
value = default;
return TryParseUInt32X(source, out Unsafe.As<int, uint>(ref value), out bytesConsumed);
default:
return ThrowHelper.TryParseThrowFormatException(out value, out bytesConsumed);
}
}
It was always rejected for too many IL bytes or too many basic blocks, even when I coerced Roslyn into emitting switch IL for parts of it (and other variations were considered unprofitable), so banning switches from inlining based on expected size seems to be pretty redundant. It also seems somewhat counter-productive - I suspect a significant portion of switch statements (at least at the C# syntax level - in IL iterator blocks probably significantly influence the frequency) are like this one where the common case by far is the caller providing a constant for an argument being switched on. |
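The "caller provides a constant" case can be sketched as follows. This is a simplified stand-in, not the code from the comment above: `FormatDispatch` and its members are hypothetical, and it parses strings with `int.TryParse` instead of spans with the internal `TryParseInt32D`/`TryParseUInt32X` helpers. It shows a dispatcher that switches on a format character, a caller that always passes `'X'`, and the hand-specialized form that an inliner with constant propagation could in principle reduce that caller to.

```csharp
using System;
using System.Globalization;

public static class FormatDispatch
{
    // Dispatcher: switches on a format character, like the method quoted above.
    public static bool TryParse(string text, out int value, char format = default)
    {
        switch (format)
        {
            case default(char):
            case 'd':
            case 'D':
                return TryParseDecimal(text, out value);
            case 'x':
            case 'X':
                return TryParseHex(text, out value);
            default:
                throw new FormatException("Unsupported format: " + format);
        }
    }

    public static bool TryParseDecimal(string text, out int value) =>
        int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);

    public static bool TryParseHex(string text, out int value) =>
        int.TryParse(text, NumberStyles.HexNumber, CultureInfo.InvariantCulture, out value);

    // Typical call site: the switched-on argument is a compile-time constant.
    public static bool CallerWithConstant(string text, out int value) =>
        TryParse(text, out value, 'X');

    // What the call site above is equivalent to once the switch is resolved.
    public static bool CallerSpecialized(string text, out int value) =>
        TryParseHex(text, out value);
}
```

Both callers produce the same result for the same input, for example 255 for `"ff"`. The point of the comment is that only inlining lets the JIT see the constant and drop the dispatch.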
Something like that, it looks like it goes after > 50% switch table density - https://github.com/dotnet/roslyn/blob/d4dab355b96955aca5b4b0ebf6282575fad78ba8/src/Compilers/Core/Portable/CodeGen/SwitchIntegralJumpTableEmitter.cs#L65-L94 It's debatable, in some cases it should probably go for less, especially if the resulting jump table wouldn't be too large. It would be easier for the JIT to look at a jump table and decide which switch strategy to use than to analyze the binary search tree code that is otherwise produced.
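As a rough illustration of what "switch table density" means here, the sketch below computes the fraction of slots between the smallest and largest case label that actually have a label, and compares it with the 50% figure mentioned above. This is not Roslyn's code: `SwitchDensity`, `Density` and `LooksDense` are hypothetical names, and the real emitter linked above works on buckets of labels rather than on the whole label set at once.

```csharp
using System;
using System.Linq;

public static class SwitchDensity
{
    // Hypothetical helper: occupied slots divided by the size of the
    // value range [min, max] that a single jump table would have to cover.
    public static double Density(params int[] labels)
    {
        if (labels.Length == 0)
            throw new ArgumentException("At least one label is required.", nameof(labels));

        int[] distinct = labels.Distinct().ToArray();
        long range = (long)distinct.Max() - distinct.Min() + 1;
        return (double)distinct.Length / range;
    }

    // Assumed threshold, taken from the "> 50%" figure quoted in the thread.
    public static bool LooksDense(params int[] labels) => Density(labels) > 0.5;
}
```

For example, `SwitchDensity.LooksDense(0, 1, 2, 3)` is true, while the nine labels of `TryParseBaseline` (`0`, `'D'`, `'G'`, `'N'`, `'X'`, `'d'`, `'g'`, `'n'`, `'x'`) span 121 values and give a density of 9/121 when treated as a single table, which is consistent with the compiler not emitting one switch for the whole method.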
Yeah, it's a bit arbitrary. It could be that there was (and perhaps still is) a technical limitation in the JIT importer around inserting switch basic blocks back into the inliner method. But it seems more likely that someone simply decided that switches tend to be large so it's not worth the trouble. And when |
The check for switch statement input value range is duplicated in JIT for some simple cases. Consider the following repro code:
IL produced:
Disasm:
category:cq
theme:inlining
skill-level:intermediate
cost:medium