Support fast tailcalls in R2R #56669
Partially addresses dotnet#5857
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
CI is running at https://dev.azure.com/dnceng/public/_build/results?buildId=1272307&view=results (normal) and https://dev.azure.com/dnceng/public/_build/results?buildId=1272336&view=results (outerloop) |
It would be nice to submit this one in separate PR. |
Will do. @MichalStrehovsky said there was a 6.0 issue for GS cookies somewhere, but I haven't been able to find it. Can anyone point me to it? |
I meant that not respecting UnsafeValueType meets the 6.0 bar (it's security) and needs to be fixed independently of anything else. I'm not aware of an existing issue tracking this. |
Sample ARM64 diffs:
; ReadyToRun compilation
; optimized code
; fp based frame
-; partially interruptible
+; fully interruptible
; No PGO data
; 0 inlinees with PGO data; 2 single block inlinees; 0 inlinees without PGO data
; Final local variable assignments
@@ -6705,14 +6692,13 @@ G_M62437_IG05:
adrp x11, [HIGH RELOC #0xd1ffab1e] // function address
add x11, x11, [LOW RELOC #0xd1ffab1e]
ldr x4, [x11]
- blr x4
- ;; bbWeight=0.50 PerfScore 8.00
+ ;; bbWeight=0.50 PerfScore 7.50
G_M62437_IG06:
ldp fp, lr, [sp],#16
- ret lr
+ br x4
;; bbWeight=0.50 PerfScore 1.00
-; Total bytes of code 72, prolog size 8, PerfScore 19.95, instruction count 18, allocated bytes for code 72 (MethodHash=96c50c1a) for method OrdinalComparer:GetHashCode(System.String):int:this
+; Total bytes of code 68, prolog size 8, PerfScore 19.05, instruction count 17, allocated bytes for code 68 (MethodHash=96c50c1a) for method OrdinalComparer:GetHashCode(System.String):int:this
; Assembly listing for method System.Collections.Generic.List`1:TrimExcess():this
; Emitting BLENDED_CODE for generic ARM64 CPU - Unix
; ReadyToRun compilation
; optimized code
; fp based frame
-; partially interruptible
+; fully interruptible
; No PGO data
; Final local variable assignments
;
@@ -11353,38 +11341,44 @@ G_M54312_IG02:
fcvtzs w1, d16
ldr w11, [x0,#16]
cmp w11, w1
- bge G_M54312_IG04
+ bge G_M54312_IG05
;; bbWeight=1 PerfScore 21.50
G_M54312_IG03:
mov w1, w11
adrp x11, [HIGH RELOC #0xd1ffab1e] // function address
add x11, x11, [LOW RELOC #0xd1ffab1e]
ldr x2, [x11]
- blr x2
- ;; bbWeight=0.50 PerfScore 2.75
+ ;; bbWeight=0.50 PerfScore 2.25
G_M54312_IG04:
+ ldp fp, lr, [sp],#16
+ br x2
+ ;; bbWeight=0.50 PerfScore 1.00
+G_M54312_IG05:
ldp fp, lr, [sp],#16
ret lr
- ;; bbWeight=1 PerfScore 2.00
+ ;; bbWeight=0.50 PerfScore 1.00
RWD00 dq 3FECCCCCCCCCCCCDh ; 0.9
-; Total bytes of code 72, prolog size 8, PerfScore 34.95, instruction count 18, allocated bytes for code 72 (MethodHash=d6d32bd7) for method System.Collections.Generic.List`1:TrimExcess():this
+; Total bytes of code 76, prolog size 8, PerfScore 34.85, instruction count 19, allocated bytes for code 76 (MethodHash=d6d32bd7) for method System.Collections.Generic.List`1:TrimExcess():this
Summary (crossgen2 of frameworks + SPC):
Detail diffs
|
FWIW, the FSharp.Core.dll diffs look like the following. It indicates that ~1100 more functions were crossgenned.
Summary of Code Size diffs:
(Lower is better)
Total bytes of base: 794036
Total bytes of diff: 977634
Total bytes of delta: 183598 (23.12% of base)
Total relative delta: -44.35
diff is a regression.
relative diff is an improvement.
Total byte diff includes 181609 bytes from reconciling methods
Base had 0 unique methods, 0 unique bytes
Diff had 1088 unique methods, 181609 unique bytes
Top file regressions (bytes):
183598 : FSharp.Core.dasm (23.12% of base)
1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.
Top method regressions (bytes):
11879 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Quotations.FSharpExpr:GetLayout(bool):Microsoft.FSharp.Text.StructuredPrintfImpl.Layout:this (0 base, 1 diff methods)
9060 ( ∞ of base) : FSharp.Core.dasm - HashCompare:GenericEqualityObj$cont@1336(bool,System.Collections.IEqualityComparer,System.Object,System.Object,System.Array,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
6959 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Linq.QueryModule:EvalNonNestedOuter(int,Microsoft.FSharp.Quotations.FSharpExpr):System.Object (0 base, 1 diff methods)
5828 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Linq.QueryModule:TransNestedOuter(int,Microsoft.FSharp.Quotations.FSharpExpr):Microsoft.FSharp.Quotations.FSharpExpr (0 base, 1 diff methods)
4026 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:ToString():System.String:this (0 base, 3 diff methods)
2353 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:eq@197(Microsoft.FSharp.Quotations.Tree,Microsoft.FSharp.Quotations.Tree):bool (0 base, 1 diff methods)
2047 ( ∞ of base) : FSharp.Core.dasm - MakeGroupJoin@955:Invoke(System.Tuple`8[System.Boolean, System.Type, System.Type, System.Type, System.Type, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpExpr, System.Tuple`7[Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr]]):Microsoft.FSharp.Quotations.FSharpExpr:this (0 base, 1 diff methods)
1961 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpList`1:ToString():System.String:this (0 base, 2 diff methods)
1960 ( ∞ of base) : FSharp.Core.dasm - MakeJoin@939:Invoke(System.Tuple`8[System.Boolean, System.Type, System.Type, System.Type, System.Type, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpExpr, System.Tuple`7[Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr]]):Microsoft.FSharp.Quotations.FSharpExpr:this (0 base, 1 diff methods)
1882 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Core.FSharpOption`1:Equals(System.Object,System.Collections.IEqualityComparer):bool:this (0 base, 14 diff methods)
1862 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Core.FSharpOption`1:CompareTo(System.Object,System.Collections.IComparer):int:this (0 base, 14 diff methods)
1779 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@145-6(Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Quotations.ExprConstInfo,System.Collections.IEqualityComparer,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
1659 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Quotations.PatternsModule:typeOf(System.__Canon):System.Type (0 base, 1 diff methods)
1656 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@145-8(Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
1655 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Linq.RuntimeHelpers.Adapters:ConvImmutableTypeToMutableType(Microsoft.FSharp.Linq.RuntimeHelpers.Adapters+ConversionDescription,System.Type):System.Type (0 base, 1 diff methods)
1461 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Linq.QueryModule:ConvMutableToImmutable(Microsoft.FSharp.Linq.RuntimeHelpers.Adapters+ConversionDescription,Microsoft.FSharp.Quotations.FSharpExpr):Microsoft.FSharp.Quotations.FSharpExpr (0 base, 1 diff methods)
1435 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Reflection.Impl:mkTupleType(bool,System.Reflection.Assembly,System.Type[]):System.Type (0 base, 1 diff methods)
1362 ( ∞ of base) : FSharp.Core.dasm - Microsoft.FSharp.Linq.QueryModule:Make@575-3(bool,Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], 
Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`2[Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Core.FSharpFunc`2[System.Tuple`3[Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Collections.FSharpList`1[System.Type], Microsoft.FSharp.Collections.FSharpList`1[Microsoft.FSharp.Quotations.FSharpExpr]], Microsoft.FSharp.Quotations.FSharpExpr],Microsoft.FSharp.Quotations.FSharpExpr,bool,Microsoft.FSharp.Quotations.FSharpExpr,Microsoft.FSharp.Quotations.FSharpVar,Microsoft.FSharp.Quotations.FSharpExpr):Microsoft.FSharp.Quotations.FSharpExpr (0 base, 1 diff methods)
1312 ( ∞ of base) : FSharp.Core.dasm - MakeSelectMany@793:Invoke(System.Tuple`7[System.Boolean, System.Type, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr, Microsoft.FSharp.Quotations.FSharpVar, Microsoft.FSharp.Quotations.FSharpExpr]):Microsoft.FSharp.Quotations.FSharpExpr:this (0 base, 1 diff methods)
1240 ( ∞ of base) : FSharp.Core.dasm - PrintfEnv`3:RunSteps(System.Object[],System.Type[],Microsoft.FSharp.Core.PrintfImpl+Step[]):System.__Canon:this (0 base, 1 diff methods)
Top method improvements (bytes):
-24 (-23.53% of base) : FSharp.Core.dasm - Microsoft.FSharp.Control.AsyncActivation`1:get_IsCancellationRequested():bool:this (3 methods)
-20 (-32.26% of base) : FSharp.Core.dasm - mkSeq@132:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this (2 methods)
-18 (-19.78% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:ContainsKey(int):bool:this (2 methods)
-18 (-33.33% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:GetHashCode():int:this (3 methods)
-18 (-19.78% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IDictionary<'Key, 'Value>.ContainsKey(int):bool:this (2 methods)
-18 (-17.31% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IDictionary<'Key, 'Value>.TryGetValue(int,byref):bool:this (2 methods)
-18 (-19.78% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IReadOnlyDictionary<'Key, 'Value>.ContainsKey(int):bool:this (2 methods)
-18 (-17.31% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IReadOnlyDictionary<'Key, 'Value>.TryGetValue(int,byref):bool:this (2 methods)
-18 (-17.31% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:TryGetValue(int,byref):bool:this (2 methods)
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:get_Item(int):int:this
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IDictionary<'Key, 'Value>.get_Item(int):int:this
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IReadOnlyDictionary<'Key, 'Value>.get_Item(int):int:this
-16 (-41.03% of base) : FSharp.Core.dasm - Microsoft.FSharp.Reflection.UnionCaseInfo:GetFields():System.Reflection.PropertyInfo[]:this
-16 (-41.03% of base) : FSharp.Core.dasm - Microsoft.FSharp.Reflection.UnionCaseInfo:getMethInfo():System.Reflection.MethodInfo:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeByte@5429-1:System.Collections.Generic.IEnumerable<System.Byte>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Byte]:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeByte@5429-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-41.03% of base) : FSharp.Core.dasm - RangeInt16@5426-1:System.Collections.Generic.IEnumerable<System.Int16>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Int16]:this
-16 (-41.03% of base) : FSharp.Core.dasm - RangeInt16@5426-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeInt32@5420-1:System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Int32]:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeInt32@5420-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
Top method regressions (percentages):
21 ( ∞ of base) : FSharp.Core.dasm - |RecordFieldGetSimplification|_|@170:Invoke(System.Reflection.PropertyInfo):bool:this (0 base, 1 diff methods)
2353 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:eq@197(Microsoft.FSharp.Quotations.Tree,Microsoft.FSharp.Quotations.Tree):bool (0 base, 1 diff methods)
360 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@137-5(Microsoft.FSharp.Quotations.Tree,Microsoft.FSharp.Quotations.Tree,System.Collections.IEqualityComparer,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
366 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@137-7(Microsoft.FSharp.Quotations.Tree,Microsoft.FSharp.Quotations.Tree,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
1779 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@145-6(Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Quotations.ExprConstInfo,System.Collections.IEqualityComparer,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
1656 ( ∞ of base) : FSharp.Core.dasm - <StartupCode$FSharp-Core>.$Quotations:Equals$cont@145-8(Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Quotations.ExprConstInfo,Microsoft.FSharp.Core.Unit):bool (0 base, 1 diff methods)
17 ( ∞ of base) : FSharp.Core.dasm - a@1498:Invoke(Microsoft.FSharp.Quotations.FSharpExpr):System.Type:this (0 base, 1 diff methods)
17 ( ∞ of base) : FSharp.Core.dasm - a@1498-1:Invoke(Microsoft.FSharp.Quotations.FSharpExpr):System.Type:this (0 base, 1 diff methods)
16 ( ∞ of base) : FSharp.Core.dasm - aboveListL@284:Invoke(Microsoft.FSharp.Text.StructuredPrintfImpl.Layout,Microsoft.FSharp.Text.StructuredPrintfImpl.Layout):Microsoft.FSharp.Text.StructuredPrintfImpl.Layout:this (0 base, 1 diff methods)
44 ( ∞ of base) : FSharp.Core.dasm - action@647-12:Invoke(Microsoft.FSharp.Core.Unit):Microsoft.FSharp.Control.AsyncReturn:this (0 base, 1 diff methods)
52 ( ∞ of base) : FSharp.Core.dasm - Adapt@3242:Invoke(int,int):bool:this (0 base, 1 diff methods)
53 ( ∞ of base) : FSharp.Core.dasm - Adapt@3242:Invoke(int,ushort):ushort:this (0 base, 1 diff methods)
53 ( ∞ of base) : FSharp.Core.dasm - Adapt@3242:Invoke(System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
53 ( ∞ of base) : FSharp.Core.dasm - Adapt@3257-1:Invoke(System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
93 ( ∞ of base) : FSharp.Core.dasm - Adapt@3260-2:Invoke(System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
50 ( ∞ of base) : FSharp.Core.dasm - Adapt@3274-3:Invoke(System.__Canon,System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
101 ( ∞ of base) : FSharp.Core.dasm - Adapt@3279-4:Invoke(System.__Canon,System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
109 ( ∞ of base) : FSharp.Core.dasm - Adapt@3282-5:Invoke(System.__Canon,System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
60 ( ∞ of base) : FSharp.Core.dasm - Adapt@3299-6:Invoke(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
109 ( ∞ of base) : FSharp.Core.dasm - Adapt@3304-7:Invoke(System.__Canon,System.__Canon,System.__Canon,System.__Canon,System.__Canon):System.__Canon:this (0 base, 1 diff methods)
Top method improvements (percentages):
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:get_Item(int):int:this
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IDictionary<'Key, 'Value>.get_Item(int):int:this
-16 (-47.06% of base) : FSharp.Core.dasm - Microsoft.FSharp.Collections.FSharpMap`2:System.Collections.Generic.IReadOnlyDictionary<'Key, 'Value>.get_Item(int):int:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeInt32@5420-1:System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Int32]:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeInt32@5420-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeUInt32@5423-1:System.Collections.Generic.IEnumerable<System.UInt32>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.UInt32]:this
-16 (-45.71% of base) : FSharp.Core.dasm - RangeUInt32@5423-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-13 (-44.83% of base) : FSharp.Core.dasm - -cctor@6137-17:Invoke(System.Decimal):System.Decimal:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeByte@5429-1:System.Collections.Generic.IEnumerable<System.Byte>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Byte]:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeByte@5429-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeInt64@5421-1:System.Collections.Generic.IEnumerable<System.Int64>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Int64]:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeInt64@5421-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeUInt16@5427-1:System.Collections.Generic.IEnumerable<System.UInt16>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.UInt16]:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeUInt16@5427-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeUInt64@5422-1:System.Collections.Generic.IEnumerable<System.UInt64>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.UInt64]:this
-16 (-43.24% of base) : FSharp.Core.dasm - RangeUInt64@5422-1:System.Collections.IEnumerable.GetEnumerator():System.Collections.IEnumerator:this
-16 (-43.24% of base) : FSharp.Core.dasm - substargs@1760:Invoke(Microsoft.FSharp.Quotations.FSharpExpr):Microsoft.FSharp.Quotations.FSharpExpr:this
-16 (-41.03% of base) : FSharp.Core.dasm - Microsoft.FSharp.Reflection.UnionCaseInfo:GetFields():System.Reflection.PropertyInfo[]:this
-16 (-41.03% of base) : FSharp.Core.dasm - Microsoft.FSharp.Reflection.UnionCaseInfo:getMethInfo():System.Reflection.MethodInfo:this
-16 (-41.03% of base) : FSharp.Core.dasm - RangeInt16@5426-1:System.Collections.Generic.IEnumerable<System.Int16>.GetEnumerator():System.Collections.Generic.IEnumerator`1[System.Int16]:this
2309 total methods with Code Size differences (288 improved, 2021 regressed), 5307 unchanged. |
That's great! Any idea what proportion remain non-cross-gen'd? And what's a typical proportion in other DLLs? |
For Windows x64 the data looks as follows. These results will be different for ARM64 or Linux x64. FSharp.Core.dll before this change:
FSharp.Core.dll after this change:
Hopefully these results can be approximately extrapolated to the full F# compiler stack. For reference here is System.Private.CoreLib.dll:
|
Thanks, Jakob, for making these improvements and for sharing the data; that indeed looks promising! For the SPC breakdown, by "always throws an exception" do you mean that the compilation fails with some error bucket other than those summarized below, or that the original MSIL function contains just a single
The latter (or rather, the check used is the one at runtime/src/coreclr/tools/aot/ILCompiler.ReadyToRun/JitInterface/CorInfoImpl.ReadyToRun.cs, lines 472 to 487 in ea274b4).
|
I see, thanks for the detailed explanation. I'm wondering whether that's what we want to keep long-term; in other words, whether we have any metrics to assess to what extent the executable size savings compensate for the huge code graph and energy consumption incurred by spinning up the JIT to generate a couple dozen bytes of assembly.
I think we are doing the right trade-off:
|
@jakobbotsch I see there are some |
@NinoFloris They are:
I also tried Microsoft.CodeAnalysis.CSharp.dll and it has a couple too, so it's not totally unique to F#:
|
@jakobbotsch This needs to bump the R2R minor version number. The number is duplicated in both src/coreclr/tools/Common/Internal/Runtime/ModuleHeaders.cs and src/coreclr/inc/readytorun.h |
I do not think it is required. This change does not add anything new to the format. Everything that is needed is already supported by the runtime; there are no runtime changes.
JIT changes LGTM.
Just one small question.
else
{
    // Register where we save call address in should not be overridden by epilog.
    assert((tmpReg & (RBM_INT_CALLEE_TRASH & ~RBM_LR)) == tmpReg);
I'm confused about the role of tmpReg in the fast tail call case. How does it end up having the right contents?
It happens in genCall here: https://github.com/jakobbotsch/runtime/blob/9f70cc42d73467c15a458ee9774589271d721536/src/coreclr/jit/codegenarmarch.cpp#L2340-L2375
For normal calls we call genCall which then calls genCallInstruction that takes care to generate the code to load the call target and do the call.
For tailcalls we instead generate the code to load the call target in genCall, but this is the last thing that happens when we see the GenTreeCall node. The remaining work happens during epilog generation which also calls genCallInstruction.
So you could pass REG_NA for the tail call case?
I guess it's GetSingleTempReg that is confusing me. It seems odd to "allocate" a temp reg without altering state.
It seems odd to "allocate" a temp reg without altering state.
Not sure I understand. The temp reg is allocated during RA for this specific optimization we do on ARM/ARM64, where we have no target node and need an extra register to store the call target loaded from the indirection cell. We still need it for the tail call case.
Right, not that it's not ultimately needed, but that the value of tmpReg passed here to GenCall is not used, instead we go call GetSingleTempReg again.
It's not that important, I just found it confusing to follow how the value gets from one place to another.
This code here in genCallInstruction is called last. genCall also calls GetSingleTempReg, but it adds the register back to gtRsvdRegs so that this call will succeed:
https://github.com/jakobbotsch/runtime/blob/9f70cc42d73467c15a458ee9774589271d721536/src/coreclr/jit/codegenarmarch.cpp#L2369-L2370
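The two-phase use of the temp register described in this thread can be modeled with a small standalone sketch. This is not the JIT's code: CallNode, takeSingleTempReg, loadCallTarget and emitTailJump are hypothetical names, and the sketch assumes a lookup that consumes the reserved register, which is what the comment above implies about GetSingleTempReg.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the JIT's register mask type.
using RegMask = std::uint64_t;

// Hypothetical call node carrying the registers reserved for it by register allocation.
struct CallNode
{
    RegMask rsvdRegs = 0;

    // Returns the single reserved register and removes it from the set.
    // Asserts that exactly one register is reserved (assumed behavior, see lead-in).
    int takeSingleTempReg()
    {
        assert(rsvdRegs != 0 && (rsvdRegs & (rsvdRegs - 1)) == 0);
        int reg = 0;
        while (((rsvdRegs >> reg) & 1) == 0)
        {
            reg++;
        }
        rsvdRegs = 0;
        return reg;
    }
};

// Phase 1 (models the tail call path of genCall): pick the register that will
// hold the call target, then put it back so the later phase can find it again.
int loadCallTarget(CallNode& call)
{
    int reg = call.takeSingleTempReg();
    call.rsvdRegs |= (RegMask(1) << reg);
    return reg;
}

// Phase 2 (models genCallInstruction during epilog generation): look up the
// same register again and emit the jump through it.
int emitTailJump(CallNode& call)
{
    return call.takeSingleTempReg();
}
```

With a node whose reserved set holds only register 4, both phases return 4. Without the put-back in phase 1, the assertion in phase 2 would fire because the set would already be empty.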
Thanks, makes more sense now.
}

return false;
return true;
This cannot be just true. It must respect the same set of rules that are implemented in canTailCall in jitinterface.cpp. Notably, if there is no tail prefix, the following situations are forbidden from tail call optimization:
- If the Caller method is NoInline
- If the Caller method is the entrypoint to the application
- If the Callee method has the RequireSecObject bit set on the MethodDef
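As a rough illustration of the three rules listed above, here is a minimal sketch of such a check. It is not the code from jitinterface.cpp or crossgen2: MethodInfo and mayTailCall are hypothetical names, and the real canTailCall may apply further conditions that are not modeled here.

```cpp
#include <cassert>

// Hypothetical, simplified view of the method facts the three rules need.
struct MethodInfo
{
    bool isNoInline;             // method is marked NoInline
    bool isEntryPoint;           // method is the application's entry point
    bool requiresSecurityObject; // RequireSecObject bit is set on the MethodDef
};

// Returns true if the call may be turned into a tail call under the three
// rules above. In this sketch the rules only restrict implicit tail calls,
// that is, calls without the explicit tail prefix.
bool mayTailCall(const MethodInfo& caller, const MethodInfo& callee, bool hasTailPrefix)
{
    if (hasTailPrefix)
    {
        return true;
    }
    if (caller.isNoInline)
    {
        return false;
    }
    if (caller.isEntryPoint)
    {
        return false;
    }
    if (callee.requiresSecurityObject)
    {
        return false;
    }
    return true;
}
```

The point of the review comment is that returning true unconditionally skips all three of the early exits shown here.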
Addressed this, PTAL.
🕐
Do not do implicit tailcalls when
* The caller is the entry point
* The caller is marked NoInline
* The callee requires security object
will this be in .NET 6? |
No, we currently do not plan to backport this work to .NET 6. |
// However, for tail calls, the call target is always computed in RBM_FASTTAILCALL_TARGET
// and so do not optimize virtual stub calls for such cases.
shouldOptimizeVirtualStubCall = !call->IsTailCall();
shouldOptimizeVirtualStubCall = true;
Nice!
so what's the release train? thx |
It will be part of .NET 7, which is scheduled for the end of next year. There are (unsupported) SDKs at https://github.com/dotnet/installer#installers-and-binaries that should already contain this change in case you wish to play around with it.
Implement a delay load helper that supports fast tailcalls on x64. The JIT loads the indirection cell into rax and emits jmp [rax]. The new helper uses the indirection from rax.
Also add support for containing immediate indirs in the tailcalling. We cannot contain any indir since they may need values in registers that have been cleaned up.
ARM64 almost supported this out of the box since there the helper already gets the indirection cell from a register. The only change needed was to properly load the call target.
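The mechanism described here, a helper that finds the indirection cell through a register and binds it on first use, can be sketched in portable C++. This is only a model under stated assumptions: IndirectionCell, g_cellRegister, delayLoadHelper, callThroughCell and realCallee are hypothetical names, a global variable stands in for the register that carries the cell address, and the actual helper is runtime assembly that is not reproduced here.

```cpp
#include <cassert>

using Target = int (*)(int);

// An indirection cell: one pointer-sized slot holding the current call target.
struct IndirectionCell
{
    Target target;
};

// Stand-in for the register (rax in the description) that holds the cell address.
IndirectionCell* g_cellRegister = nullptr;

// Counts how often the helper had to resolve a target.
int g_resolveCount = 0;

// The method the cell is eventually bound to.
int realCallee(int x)
{
    return x * 2;
}

// Delay load helper: finds the cell through the "register", patches it with
// the resolved target, then forwards the original call.
int delayLoadHelper(int x)
{
    IndirectionCell* cell = g_cellRegister;
    ++g_resolveCount;
    cell->target = realCallee;
    return cell->target(x);
}

// Caller side: load the cell address into the "register", then transfer
// control through the cell (the model's equivalent of jmp [rax]).
int callThroughCell(IndirectionCell& cell, int x)
{
    g_cellRegister = &cell;
    return cell.target(x);
}
```

A cell that initially points at delayLoadHelper is resolved on the first call and goes straight to realCallee on later calls. The helper has no return address pointing back at the call site in the tail call case, which is why the cell address has to reach it some other way, here through the register stand-in.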
I also took the JIT GUID update as an opportunity to clean up the mcPackets enum: I have sorted the entries by their actual values and commented out the unused ones.
Partially addresses #5857