Compile-Time method inlining #1413
Replies: 61 comments 6 replies
-
What are those cases?
-
@svick Inlining in general helps reduce call overhead in performance-critical code. Performance-critical methods are frequently very complicated and contain a lot of code duplication. Smart inlining helps reduce that duplication and avoid call overhead. You can find 100500+ examples in CoreFX where inlining could be useful. (You can also open any C++ project and find the "inline" directive and code injected through "#define".)
-
The team hasn't seemed terribly interested in taking up language proposals that involve specific optimizations at the compilation step, and I believe this proposal falls under that umbrella. This is probably better suited to a post-compilation link/optimization phase, where any language would benefit immediately rather than just C#. I'm going to assume that your proposal isn't just to inline "DoLock" but also to inline any potential delegate being passed to "DoLock". This feels like something that would quickly approach compile-time templates.
-
I'll just mention the slightly related issues: #1028. They are more concerned with compile-time expressions and constants and less with inlining, but they go in the same general direction, so I am mentioning them solely for completeness.
-
@HaloFour, I agree that code templating can be performed as a file post-processing step or at the IL level. Every approach has its pros and cons.
In any case, the debugging experience would probably need to be improved for this sort of optimization.
-
In general, I do think the CLR is best equipped for this kind of optimization, even if it can't do it today (like delegate inlining). I think what you're proposing would require a significant amount of effort, and a lot of it would be just duplicating something the CLR can already do. That being said, I think that if you want this to happen, you would need to demonstrate really well why Roslyn is the right place to do it. What makes adding all of this to Roslyn easier than improving the CLR? Also:
According to this Stack Overflow answer, C++ compilers pretty much ignore
-
Doing this in Roslyn seems strange to me. If you added this attribute to an F# method, it would not be inlined, which seems nonsensical.
-
A little compile-time inlining here, a few binding redirects there... what could possibly go wrong? 😇
-
@CyrusNajmabadi I would expect that if something can be inlined, it cannot be part of the assembly's public API surface. That said... doesn't the compiler already try to inline Local Functions? Would it be a huge amount of work to extend this to other
-
The C# compiler doesn't attempt to inline local functions. Even if all they do is return a constant, the compiler still fully emits the static method and invokes it. The CLR can inline them, though.
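To make the point above concrete, here is a minimal sketch; the class and method names are made up for illustration. As described above, the C# compiler keeps the local function as a separate compiler-generated method and emits a call to it, and whether that call is removed at run time is left to the JIT.

```csharp
public static class LocalFunctionDemo
{
    public static int AddAnswer(int x)
    {
        // The call below stays a call in the emitted IL: the compiler does not
        // paste the body of GetAnswer into AddAnswer. Only the JIT may inline it.
        return x + GetAnswer();

        // Local function that does nothing but return a constant.
        int GetAnswer() => 42;
    }
}
```

Inspecting the compiled assembly with a decompiler or an IL viewer is one way to check this for a particular compiler version.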
-
I don’t understand why you aren’t requesting that the JIT lift whatever restrictions you find excessive. The JIT already knows how to inline.
-
What would be the language semantics of the proposed attributes? If there are none, then this is a compiler feature request and does not belong in this repo.
-
@gafter This proposal should not extend the syntax or modify any semantics of the language, so yes, it can probably be a compiler feature. But I think some parts of the C# spec should still be improved.
-
Sounds a bit more like macros than inlining.
Evidence?
Wow, CoreFX must have millions and millions of lines of code if 100500+ examples of anything can be found in it.
Until it turns out that it isn't that smart and ends up producing code bloat that slows down the application.
As already pointed out,
-
FWIW here are the jit changes to unblock inlining a method that invokes a delegate: master...AndyAyersMS:UnblockInliningDelegateCallers. Didn't test these extensively but they seem to work. Feel free to grab them and build your own jit to see if it helps your scenario. Looking at the impact on things I can measure easily shows that this doesn't kick in very often, by itself doesn't lead to any downstream optimization opportunities, and causes some modest code growth as mainly we just end up duplicating the delegate invoke sites. So probably not something I would merge without some evidence that it actually helps performance.
-
@AndyAyersMS Thank you for the tip about building a custom JIT.
So "move the issue to the JIT" is not an obvious answer. I will post issues to the JIT (CoreCLR) repo, but I will also link to this issue as an alternative approach.
-
What everybody actually wants is to keep optimizations that don't belong in the C# compiler out of the C# compiler. That's the way it works in pretty much all other compilers: the "parser" (usually known as the front-end in the compiler world) doesn't do optimizations; it's the code generator (usually known as the back-end) that does most of them. Front-ends may do some trivial optimizations (constant folding) or high-level, language-specific optimizations (e.g. I would expect Roslyn to produce reasonable IL code out of pattern matching and not offload that onto the JIT).
Existing AOT compilers do that already. CoreRT uses the CoreCLR JIT compiler and .NET Native uses the VC++ backend. Both do inlining and have done so for many years. You're suggesting doubling the work by adding inlining capabilities to the C# compiler.
Inlining logic is complex because it requires heuristics that are difficult to get right. Otherwise it would be relatively trivial compared to other optimizations that the JIT does.
Generally speaking, optimizations require a different abstraction than what the C# compiler uses now to represent code. So I fail to see the problem(s) you are seeing.
-
As far as I can tell, your modification changes the code from "never inline methods with delegate invocations" to "freely inline methods with delegate invocations". How hard would it be to change it instead to "only inline methods with delegate invocations if we're aggressively inlining"? That way, it doesn't hurt the common case where inlining such methods is not worth it, but I can also apply I'm assuming doing this would be just changing a few lines, similar to the change you showed. If it was more effort, this is probably not worth doing without further evidence.
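For readers following along, a minimal sketch of what "aggressively inlining" a method with a delegate invocation could look like from the C# side. It assumes the existing `MethodImplOptions.AggressiveInlining` hint is the opt-in being discussed; `ApplyTwice` and `Caller` are hypothetical names used only here. The attribute is a request, not a guarantee, so the JIT can still decline.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class InliningHints
{
    // Hypothetical helper: a small method whose body invokes a delegate.
    // The attribute asks the JIT to inline it; the JIT may still refuse.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ApplyTwice(Func<int, int> step, int seed)
    {
        return step(step(seed));
    }

    public static int Caller()
    {
        // Even if ApplyTwice is inlined here, the two delegate invocations
        // remain indirect calls unless the JIT can also resolve the target.
        return ApplyTwice(x => x + 1, 40);
    }
}
```

Under the proposed policy, only methods marked like `ApplyTwice` would become inline candidates, and unmarked methods with delegate calls would keep today's behaviour.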
-
That is certainly doable, yes.
-
How would this be solved by C#? Why do you expect C# to be better at inlining anything than the JIT? We have literally zero experience with it. It's not something defined at all in the language. It's something incredibly complex given the enormous size of the C# language. It isn't even defined (as far as I can tell) for totally normal scenarios like "I am referencing a dll". Saying that you don't like how well the JIT does inlining in no way constitutes an argument for why C# should have any involvement with inlining.
-
Can you please drop this childish "OMG" attitude? It doesn't solve any problems. There are plenty of issues in that search that have nothing to do with JIT inlining, that only mention it in passing, or that are plainly bogus. You're not going to convince anyone that there is some issue of cosmic proportions that needs to be addressed in the specific manner you think it can be addressed.
-
Sorry, I was wrong. I have changed my comment.
-
I have created an issue with the delegate invocation inlining problem: https://github.com/dotnet/coreclr/issues/17270
-
What would be more useful is to try to understand the problem before proposing solutions or drawing conclusions. Just an example: the second issue in your screenshot is not exactly an inlining problem. That dispatch stub isn't an IL method that the JIT could inline like any other method, it's just some code automatically generated by the runtime for interface calls. There's nothing that the C# compiler or the JIT inliner proper can do about that.
-
My suggestion on the same topic: when developing performant code, you now often have to work with structures located on the stack. They cannot be created and initialized in a helper method and then returned to a regular method, so all of this high-performance scaffolding has to be written in the body of the regular method. This makes high-performance methods hard to develop and destroys the purity and beauty of performant code. For example, I can't write the following helper method and use it in a regular method:

    // typical bytes initialization that can be used in many places
    public static inline Span<byte> GetBytesOnStack(this Encoding encoding, string str) // inline keyword
    {
        Span<byte> bytes = stackalloc byte[encoding.GetByteCount(str)];
        encoding.GetBytes(str, bytes);
        return bytes; // Cannot use variable 'bytes' in this context because it may expose referenced variables outside of their declaration scope
    }

    bool SomeOptimizedMethod(string str)
    {
        Span<byte> bytes = Encoding.UTF8.GetBytesOnStack(str);
        // work with bytes
    }

    // It will be compiled to
    bool SomeOptimizedMethod(string str)
    {
        // Compiler inlined code
        Encoding _generated_name1 = Encoding.UTF8;
        Span<byte> _generated_name2 = stackalloc byte[_generated_name1.GetByteCount(str)];
        _generated_name1.GetBytes(str, _generated_name2);
        Span<byte> bytes = _generated_name2;
        // End compiler inlined code
        // work with bytes
    }

I suggest adding a new

The suggestion can be extended so that helper methods generate both initialization code and finalization code, if needed. That would be a broader suggestion, though, and can be discussed later if the initial suggestion is well received.

F# already supports inlining; see this example of my code: https://sharplab.io/#v2:DYLgZgzgNAJiDUAfAsAKAPYAcCmA7ABALICWAxgE7oTpgAuAdAGIDKAFgIbmb0By7txAG7YAkrlrZKmNFjz5mATwgSAtvQAq2AB60ZOAouXYVaNMGy18xXMGvZ8AcwsAhBRIgB5XM1rtSAa3wACjxSdBhrBxB8AFFcMIjcBwBKYOVyaPTI1IBeNHwC/HNLcyTaVnwc/FDwyPoAcRc3bABhdABXcXx0/MLi/EF0YhgABVpySt7Cwr4BYTHyemU/f3ZgYHRSAB4AI2aAPiK8B3Kp6fxEQ9mhbAX6WnQANSHR8bPmTHZcXYOgweGFlAjmVWMlTKh+tQVNgPJgBCpiAAvbAwQgWVjhbrjSaoab9PbuSqOJruLw+FaxeK1JL0ACq6kYAA4seQzkEwaggA

    open System
    open System.Text
    open Microsoft.FSharp.NativeInterop

    let inline getBytesOnStack (encoding: Encoding) (str: string) =
        let length = encoding.GetByteCount str
        let voidPtr =
            NativePtr.stackalloc<byte> length
            |> NativePtr.toVoidPtr
        Span<byte>(voidPtr, length)

    let someOptimizedMethod str =
        let bytes = getBytesOnStack Encoding.UTF8 str
        ()

will generate:

    public unsafe static void someOptimizedMethod(string str)
    {
        Encoding uTF = Encoding.UTF8;
        int byteCount = uTF.GetByteCount(str);
        Span<byte> span = new Span<byte>(stackalloc byte[byteCount], byteCount);
    }
-
I think this can be inlined during JIT (if that isn't already done); it's just a matter of effort. If they inline
-
Note with Dynamic PGO (enabled by default in .NET 8 Preview 5) some delegates will become eligible for inlining, in particular at sites where there is just one possible (or one frequent) target method.
-
Maybe one more use case for compile-time inlining: LINQ queryable providers analyze the expression tree and convert it to underlying database queries. I would like to define some query expressions in separate methods and reuse them, but the provider that I use is not able to properly convert method call expressions that return the actual expression that is meant to be translated. Simplified pseudo example:
Maybe this is something for the particular LINQ provider to improve, but if the language offered a way to forcefully inline the QueryB(A) method call, I would be able to work around this provider limitation.
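A rough sketch of the shape of this problem, using only `System.Linq.Expressions` and hypothetical names (`IsAdult` and `ExpressionShapes` are invented for illustration and are not the `QueryB(A)` case from this comment). When a lambda calls a helper, the expression tree contains a method-call node that a provider has to know how to translate. When the same logic is written inline, the tree contains the comparison itself.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionShapes
{
    // Hypothetical helper that a query author would like to reuse.
    public static bool IsAdult(int age) => age >= 18;

    public static ExpressionType CallingHelper()
    {
        // The body is a MethodCallExpression: the provider only sees "call IsAdult".
        Expression<Func<int, bool>> viaHelper = age => IsAdult(age);
        return viaHelper.Body.NodeType;
    }

    public static ExpressionType WrittenInline()
    {
        // The body is a BinaryExpression that a provider can inspect directly.
        Expression<Func<int, bool>> inlined = age => age >= 18;
        return inlined.Body.NodeType;
    }
}
```

A compile-time inlining feature of the kind requested here would, in effect, turn the first shape into the second before the provider ever sees the tree.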
-
The problem:
Currently inlining is only available at the JIT level. This approach has benefits (cross-module inlining), but this type of inlining is very restricted (method size, control-flow restrictions, etc.).
In many cases inlining is needed within the same module or even the same class. This type of inlining can be performed at compile time and has fewer restrictions than JIT inlining.
The solution:
Compile-time, attribute-based inlining.
Example:
P.S. We can close this issue only with proof that the "Type Classes" proposal (#110) can easily be used to perform this kind of inlining.
P.P.S.
C# moves closer and closer to C++, and this capability is a "must-have" feature for a high-performance language.
Otherwise some projects may choose C++ over C# solely because C# is overly verbose for certain kinds of projects. (While C++ metaprogramming can be overkill, adding some post-processing capabilities to C# would be great.)