JIT Hints- Parameter conditional inlining directives, or, improve call-site inlining of funcs with constant args feeding structs #10658
Comments
The JIT inline heuristic already looks at constant arguments but only in connection to branches, not switches. Basically |
The body of that function doesn't have any actual switch statements in it (since the cases don't line up to anything remotely close to a decent jump table); as written currently it gets excluded for too many IL bytes, but it also has too many basic blocks. I was able to prevent it from being excluded from inlining outright by breaking it up into two functions (as in the linked issue), but the inner function still wasn't getting inlined (deemed unprofitable) - I don't know if the constant argument value wasn't propagated to the consideration of the inner function, or even with the constant the heuristic wasn't scoring it high enough. Regardless, while the ideal certainly would be the JIT always figuring out the right choice in cases like this (and at no runtime cost, and also it should do my taxes and give me a pony while it's at it), we're definitely not there yet. |
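One possible shape of such a two-function split is sketched below. This is not the code from the linked issue: the class name, the helper names, and the set of supported format characters are all hypothetical, and nothing here guarantees that the JIT will fold the test or inline the inner method.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical sketch: all names below are illustrative, not from corefx.
public static class FormatDispatchSketch
{
    // Thin outer method: only the format test lives here, so it stays cheap to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool TryParse(ReadOnlySpan<byte> source, out int value, char standardFormat = default)
    {
        // With a constant standardFormat at the call site, the hope is that this
        // test folds away after inlining and only one of the two calls remains.
        if (standardFormat == default(char))
            return TryParseDecimal(source, out value);

        return TryParseWithFormat(source, out value, standardFormat);
    }

    // Inner method: carries the bulky dispatch and is left to the normal heuristics.
    private static bool TryParseWithFormat(ReadOnlySpan<byte> source, out int value, char standardFormat)
    {
        switch (standardFormat)
        {
            case 'D': case 'd': case 'G': case 'g':
                return TryParseDecimal(source, out value);
            case 'X': case 'x':
                return TryParseHex(source, out value);
            default:
                throw new FormatException("Unsupported format: " + standardFormat);
        }
    }

    private static bool TryParseDecimal(ReadOnlySpan<byte> source, out int value)
    {
        value = 0;
        if (source.IsEmpty) return false;
        long acc = 0;
        foreach (byte b in source)
        {
            int digit = b - '0';
            if (digit < 0 || digit > 9) return false;
            acc = acc * 10 + digit;
            if (acc > int.MaxValue) return false; // reject overflow instead of wrapping
        }
        value = (int)acc;
        return true;
    }

    private static bool TryParseHex(ReadOnlySpan<byte> source, out int value)
    {
        value = 0;
        if (source.IsEmpty) return false;
        long acc = 0;
        foreach (byte b in source)
        {
            int digit = HexDigit(b);
            if (digit < 0) return false;
            acc = acc * 16 + digit;
            if (acc > int.MaxValue) return false; // reject overflow instead of wrapping
        }
        value = (int)acc;
        return true;
    }

    // Returns the value of an ASCII hex digit, or -1 if the byte is not one.
    private static int HexDigit(byte b)
    {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        return -1;
    }
}
```

A call such as `FormatDispatchSketch.TryParse(bytes, out int v)` takes the default-format path, while `FormatDispatchSketch.TryParse(bytes, out int v, 'X')` goes through the inner dispatch. Whether either call site ends up cheaper than a plain call depends on what the inliner actually decides.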
Yes, the JIT does take constant arguments into consideration, but it's not a carte blanche for inlining, so other factors such as method size still contribute to the final score. Also, the JIT does not attempt to do any actual optimizations on either the inliner or the inlinee at the time of inlining. It's just "hey, I see a constant argument used in an |
That's not entirely true -- we do a bit of optimization on the inlinee. The jit will retype arguments and forward substitute argument expressions into the inlinee, to make propagated constants available to nested calls and to enable things like devirtualization. As I've noted elsewhere, the initial analysis of the inlinee as an inline candidate is very crude and based on a simple pass over the inlinee's IL. Because of this it is hard to project how much size savings might arise from particulars at a call site (doing this for switches is fairly tricky business anyways, even with more involved modelling). And the stack model we use when analyzing the IL is quite crude and will both miss cases and overstate cases. The simple modelling and the underlying conservatism of the inliner are hallmarks of the historical effort to balance jit throughput versus code quality. When you have one shot at jitting a method you can't afford to go overboard in either direction. The hope is that tiering will eventually allow us to relax some of the throughput constraints and that upper tier jitting can afford more expansive modelling, but we're not quite there yet. The remainder of the jit isn't well set up for larger method bodies yet either. So we have to work at improving it bit by bit. |
And for every function like this example which can be entirely jitted away, there are probably two dozen others that are just checking for valid input. I'm quite excited by the rather significant progress I see being made in the right direction, but for where we are right now, better annotations seem like the more reasonable ask (and better use of resources) than figuring out the perfect balance of detecting the right choice. |
Here's a prototype change to unblock inlining methods with switches, and a bit of work on a simple profitability estimate: master...AndyAyersMS:Explore18863. Running PMI diffs over the framework assemblies jits around 170K methods. Over all the calls in these methods, the jit sees about 180 call sites where constant caller arguments feed a switch in the callee and inlines at maybe 50 of them:
So at least in framework code this pattern is really not very common. If you know of any code where it might show up more frequently, please send me pointers. As an example of what can happen: in |
Running a search against a large codebase I work on (~1GB of assemblies)... Roughly 1/3 of C# In that 3%, every one I looked at was effectively a lookup table for an enum, e.g.

    public bool NeedsFrobulating(MyEnum value)
    {
        switch (value)
        {
            case MyEnum.ValueA: return true;
            case MyEnum.ValueC: return true;
            default: return false;
        }
    }

Don't know how many of them have callsites with constants yet, but a lot of them would likely benefit from inlining regardless. Aside from that, there were a large number of generated |
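To make the constant-call-site case concrete, here is a minimal sketch built around the lookup method above. The `MyEnum.ValueB` member, the `Frobulator` class, and the two call-site methods are assumptions added for illustration, and the method is made static to keep the sketch self-contained.

```csharp
// Hypothetical enum: ValueB is an assumed member, not shown in the original snippet.
public enum MyEnum { ValueA, ValueB, ValueC }

public static class Frobulator
{
    // The enum lookup-table pattern from the comment above.
    public static bool NeedsFrobulating(MyEnum value)
    {
        switch (value)
        {
            case MyEnum.ValueA: return true;
            case MyEnum.ValueC: return true;
            default: return false;
        }
    }

    // Constant argument feeds the switch: if the JIT inlined the callee and
    // folded the switch, this whole method could reduce to "return true".
    public static bool ConstantCallSite() => NeedsFrobulating(MyEnum.ValueC);

    // Non-constant argument: inlining copies the whole switch into the caller,
    // which may still pay off for a method this small.
    public static bool VariableCallSite(MyEnum value) => NeedsFrobulating(value);
}
```

Only the first call site is the "constant feeds switch" case that the prototype counts; the second is the more common situation where any benefit would come from the callee simply being small.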
If you can run on a locally built CoreCLR, feel free to use my fork above; it prints "@@@@ CONSTANT FEEDS SWITCH @@@@" to stdout if the jit sees the constant case. You can get full jit coverage over your assemblies via PMI (see the jitutils repo) with something like this:
For example:
If you find this useful I can augment this with the names of the root method, inline parent, and inline candidate. |
I went and took another look at this, and things are a bit more complicated now.
So to see the potential size savings a constant argument offers, the jit would need to realize the input was indeed constant (currently there is no real concept in the jit of a "constant struct") and then aggregate the potential benefit of the constituent constant fields as they feed a series of tests against other known constants in the inlinee, and get all of the accounting right (including cases that rejoin or branch to one another, etc). That is well beyond what we can do today. And my prototype changes above don't help as there is no switch in the IL. We currently aggressively inline For non-default constant cases the IR growth before the winning path becomes clear is probably even worse than for the default case, as the jit would start aggressively inlining down through the default case code before realizing (one hopes; I have not looked) all that inlined code is unreachable and can be tossed out. There's always a risk that pulling in a bunch of extra IR like this may confuse some analysis in the jit or cause it to hit one of its internal tripwires. From the jit's standpoint things would have worked out more smoothly if there were two separate int or byte option fields as constants; these would feed early branch folding in the importer and guide the jit to only import the code that is reachable. If these kinds of "constant" struct option bundles become popular we may need to look for ways to recognize them earlier and act on them sooner. |
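The contrast between a struct option bundle and separate scalar options can be sketched as follows. The `ParseOptions` struct, its fields, and both methods are hypothetical stand-ins, not the types discussed in the issue; the comments restate the behavior described in the comment above rather than anything measured.

```csharp
// Hypothetical option bundle: two byte-sized options wrapped in a struct.
public readonly struct ParseOptions
{
    public readonly byte Format;
    public readonly byte Precision;

    public ParseOptions(byte format, byte precision)
    {
        Format = format;
        Precision = precision;
    }
}

public static class OptionShapes
{
    // Bundled form: the tests read struct fields. Per the comment above, the jit
    // has no real notion of a "constant struct", so a constant bundle at the call
    // site is not recognized early enough to prune these branches during import.
    public static int SelectPathBundled(ParseOptions options)
    {
        if (options.Format == 0) return 0;                            // default path
        if (options.Format == 1 && options.Precision == 0) return 1;  // simple path
        return 2;                                                     // general path
    }

    // Separate form: the same tests on plain byte arguments. Constant arguments
    // here are the shape said to feed early branch folding in the importer.
    public static int SelectPathSeparate(byte format, byte precision)
    {
        if (format == 0) return 0;                    // default path
        if (format == 1 && precision == 0) return 1;  // simple path
        return 2;                                     // general path
    }
}
```

Both methods compute the same result, for example `SelectPathBundled(new ParseOptions(1, 0))` and `SelectPathSeparate(1, 0)` both select the simple path; the difference is only in how visible the constants are to the jit at inline time.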
In dotnet/corefx#30934 we have this code:
If the standardFormat parameter has a constant value at the callsite, we definitely want this function inlined; it can be entirely evaluated by the JIT and inlined to a call slightly smaller than the original. But if it's not, it's not a particularly small function to be inlining; it's going to be the wrong choice in at least some cases. [MethodImpl(MethodImplOptions.AggressiveInlining)] is a rather blunt tool for this kind of scenario. It would be much better if we could do something like

    public static bool TryParse(ReadOnlySpan<byte> source, out int value, out int bytesConsumed, [ParamImpl(ParamImplOptions.InlineMethodWhenConstant)] char standardFormat = default)

to give more nuance to the JIT hints.

category:cq
theme:inlining
skill-level:expert
cost:extra-large