-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable folding of implementation-defined casts #53782
Disable folding of implementation-defined casts #53782
Conversation
It is not very effective right now because the "bad" combinations of types are morphed into helpers too early, but it will be helpful when we enable folding outside of the importer for those cases too.
It now fails on Windows x86, where many of the previous similar failures were observed.
d47687b
to
d13b563
Compare
CC. @dotnet/jit-contrib, community PR |
@kunalspathak PTAL for community PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some questions. Could you share some of the sample diffs that you see in the test (just to understand how this works underlying). Also, did you check asmdiffs on benchmarks/libraries?
Yep, the PR's header is for results of running SPMI over all available collections, so, no diffs outside tests.
Will recollect the diffs and share the samples. |
Case 1: G_M64081_IG01:
sub rsp, 40
-
+ vzeroupper
+
G_M64081_IG02:
movsx rcx, cl
xor ecx, 5
@@ -25,22 +26,41 @@ G_M64081_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, isz
add rcx, rax
mov rax, rcx
shr rax, 32
- jne SHORT G_M64081_IG03
+ jne SHORT G_M64081_IG04
xor ecx, ecx
call CORINFO_HELP_OVERFLOW
; gcr arg pop 0
- xor ecx, ecx
- call CORINFO_HELP_OVERFLOW
+ vmovsd xmm0, qword ptr [reloc @RWD00]
+ call CORINFO_HELP_DBL2ULNG
; gcr arg pop 0
- int3
- ;; bbWeight=1 PerfScore 5.75
-G_M64081_IG03: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
+ vxorps xmm0, xmm0
+ vcvtsi2ss xmm0, rax
+ vcvtss2sd xmm0, xmm0
+ call CORINFO_HELP_DBL2ULNG
+ ; gcr arg pop 0
+ mov rdx, 0xD1FFAB1E
+ imul rax, rdx
+ mov rdx, 0xD1FFAB1E
+ add rax, rdx
+ jo SHORT G_M64081_IG04
+ cmp rax, 0xD1FFAB1E
+ jg SHORT G_M64081_IG04
+ cmp rax, 0xD1FFAB1E
+ jl SHORT G_M64081_IG04
+ ;; bbWeight=1 PerfScore 23.83
+G_M64081_IG03: ; , epilog, nogc, extend
+ add rsp, 40
+ ret
+ ;; bbWeight=1 PerfScore 1.25
+G_M64081_IG04: ; gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
call CORINFO_HELP_OVERFLOW
; gcr arg pop 0
int3
;; bbWeight=0 PerfScore 0.00
+RWD00 dq 4D95800D5C7F3082h ; 5.66058935e+65 Here the early importer folding is blocked, due to the new checks in [ 1] 100 (0x064) nop
[ 1] 101 (0x065) ldc.r8 5.6605893542014790e+65
[ 2] 110 (0x06e) conv.u
-Folding long operator with constant nodes into a constant:
- [000025] ------------ * CAST long <- ulong <- double
- [000024] ------------ \--* CNS_DBL double 5.6605893542014790e+65
-Bashed to long constant:
- [000025] ------------ * CNS_INT long 0
-
[ 2] 111 (0x06f) conv.r4
-Folding fp operator with constant nodes into a fp constant:
- [000026] ------------ * CAST float <- long
- [000025] ------------ \--* CNS_INT long 0
-Bashed to fp constant:
- [000026] ------------ * CNS_DBL float 0.0000000000000000
-
[ 2] 112 (0x070) conv.u8
-Folding long operator with constant nodes into a constant:
- [000027] ------------ * CAST long <- ulong <- float
- [000026] ------------ \--* CNS_DBL float 0.0000000000000000
-Bashed to long constant:
- [000027] ------------ * CNS_INT long 0
-
[ 2] 113 (0x071) mul
[ 1] 114 (0x072) ldc.r4 2281166340096.0000
[ 2] 119 (0x077) ldc.r8 -1.4862075407321284e+46
@@ -475,7 +457,10 @@ STMT00009 (IL ???... ???)
[000095] ---X-------- \--* ADD_ovfl long
[000028] ------------ +--* MUL long
[000018] ------------ | +--* CNS_INT long -0xf925aa1447c2972
- [000027] ------------ | \--* CNS_INT long 0
+ [000027] ------------ | \--* CAST long <- ulong <- float
+ [000026] ------------ | \--* CAST float <- long
+ [000025] ------------ | \--* CAST long <- ulong <- double
+ [000024] ------------ | \--* CNS_DBL double 5.6605893542014790e+65
[000094] ---X-------- \--* SUB_ovfl long Case 2: - mov eax, 100
- ;; bbWeight=1 PerfScore 0.25
+ vcvttss2si rax, qword ptr [reloc @RWD00]
+ cmp rax, 0xD1FFAB1E
+ setl dl
+ movzx rdx, dl
+ movsxd rdx, edx
+ cmp rdx, rax
+ setg al
+ movzx rax, al
+ cmp eax, 0xD1FFAB1E
+ seta al
+ movzx rax, al
+ add eax, 100
+ ;; bbWeight=1 PerfScore 12.00
G_M58196_IG03: ; , epilog, nogc, extend
ret
;; bbWeight=1 PerfScore 1.00
+RWD00 dd 63E756A3h ; 8.53488e+21 This time, local assertion propagation does not result in @@ -518,56 +512,35 @@ Assertion prop in BB01:
Constant Assertion: V00 == 8.53488e+21 index=#01, mask=0000000000000001
[000020] ------------ * CNS_DBL float 8.5348814290707441e+21
-Folding long operator with constant nodes into a constant:
- [000021] ------------ * CAST long <- float
- [000020] -----+------ \--* CNS_DBL float 8.5348814290707441e+21
-Bashed to long constant:
- [000021] ------------ * CNS_INT long -0x8000000000000000
-
-Folding operator with constant nodes into a constant:
- [000023] ------------ * GT int
- [000022] -----+------ +--* CNS_INT long 0x45E23077
- [000021] -----+------ \--* CNS_INT long -0x8000000000000000
-Bashed to int constant:
- [000023] ------------ * CNS_INT int 1
-
-Folding long operator with constant nodes into a constant:
- [000025] ------------ * CAST long <- int
- [000023] -----+------ \--* CNS_INT int 1
-Bashed to long constant:
- [000025] ------------ * CNS_INT long 1 Case 3: G_M30804_IG01: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
- sub rsp, 40
- ;; bbWeight=1 PerfScore 0.25
+ push rsi
+ sub rsp, 32
+ ;; bbWeight=1 PerfScore 1.25
G_M30804_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov rcx, 0xD1FFAB1E
mov rcx, gword ptr [rcx]
@@ -21,17 +22,22 @@ G_M30804_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
call hackishModuleName:hackishMethodName()
; gcrRegs -[rcx]
; gcr arg pop 0
- xor ecx, ecx
+ vcvttsd2si ecx, dword ptr [reloc @RWD00]
+ movsx rsi, cl
+ mov ecx, esi
call hackishModuleName:hackishMethodName()
; gcr arg pop 0
- xor eax, eax
- ;; bbWeight=1 PerfScore 4.75
+ movsx rax, sil
+ ;; bbWeight=1 PerfScore 12.00
G_M30804_IG03: ; , epilog, nogc, extend
- add rsp, 40
+ add rsp, 32
+ pop rsi
ret
- ;; bbWeight=1 PerfScore 1.25
+ ;; bbWeight=1 PerfScore 1.75
+RWD00 dq FFF8000000000000h ; -nan(ind) Here, VN block takes hold: N001 [000002] CNS_DBL -nan(ind) => $140 {DblCns[-nan(ind)]}
VNForCastOper(int) is $41
-N002 [000013] CAST => $42 {IntCns 0x80000000}
- VNForCastOper(byte) is $43
-N003 [000003] CAST => $44 {IntCns 0}
-N004 [000004] LCL_VAR V01 tmp1 d:1 => $44 {IntCns 0}
-N005 [000005] ASG => $44 {IntCns 0}
+N002 [000013] CAST => $180 {Cast($140, $41)}
+ VNForCastOper(byte) is $42
+N003 [000003] CAST => $181 {Cast($180, $42)}
+N004 [000004] LCL_VAR V01 tmp1 d:1 => $181 {Cast($180, $42)}
+N005 [000005] ASG => $181 {Cast($180, $42)}
***** BB01, STMT00001(after)
-N005 ( 10, 12) [000005] -A------R--- * ASG int $44
-N004 ( 1, 1) [000004] D------N---- +--* LCL_VAR int V01 tmp1 d:1 $44
-N003 ( 10, 12) [000003] ------------ \--* CAST int <- byte <- int $44
-N002 ( 9, 10) [000013] ------------ \--* CAST int <- double $42
+N005 ( 10, 12) [000005] -A------R--- * ASG int $181
+N004 ( 1, 1) [000004] D------N---- +--* LCL_VAR int V01 tmp1 d:1 $181
+N003 ( 10, 12) [000003] ------------ \--* CAST int <- byte <- int $181
+N002 ( 9, 10) [000013] ------------ \--* CAST int <- double $180
N001 ( 3, 4) [000002] ------------ \--* CNS_DBL double -nan(ind) $140 This means no constant propagation in assertion prop: -After constant propagation on [000003]:
-STMT00001 (IL 0x00A... ???)
-N005 ( 10, 12) [000005] -A------R--- * ASG int $44
-N004 ( 1, 1) [000004] D------N---- +--* LCL_VAR int V01 tmp1 d:1 $44
- [000016] ------------ \--* CNS_INT int 0 $44
-optVNAssertionPropCurStmt morphed tree:
-N003 ( 1, 3) [000005] -A------R--- * ASG int $44
-N002 ( 1, 1) [000004] D------N---- +--* LCL_VAR int V01 tmp1 d:1 $44
-N001 ( 1, 1) [000016] ------------ \--* CNS_INT int 0 $44
-
-After constant propagation on [000007]:
-STMT00002 (IL ???... ???)
-N003 ( 15, 7) [000008] --CXG------- * CALL void hackishModuleName.hackishMethodName $VN.Void
- [000017] ------------ arg0 in rcx \--* CNS_INT int 0 $44
-ReMorphing args for 8.CALL:
-argSlots=1, preallocatedArgCount=4, nextSlotNum=4, nextSlotByteOffset=32, outgoingArgSpaceSize=32
-ArgTable for 8.CALL after fgMorphArgs:
-fgArgTabEntry[arg 0 17.CNS_INT int (By ref), 1 reg: rcx, byteAlignment=8, lateArgInx=0, processed]
-
-optVNAssertionPropCurStmt morphed tree:
-N003 ( 15, 7) [000008] --CXG------- * CALL void hackishModuleName.hackishMethodName $VN.Void
-N002 ( 1, 1) [000017] ------------ arg0 in rcx \--* CNS_INT int 0 $44
-
-After constant propagation on [000015]:
-STMT00003 (IL 0x01A... ???)
-N003 ( 3, 4) [000009] ------------ * RETURN int $182
- [000018] ------------ \--* CNS_INT int 0 $44
-optVNAssertionPropCurStmt morphed tree:
-N002 ( 2, 2) [000009] ------------ * RETURN int $182
-N001 ( 1, 1) [000018] ------------ \--* CNS_INT int 0 $44 Case 4: -; Lcl frame size = 40
+; Lcl frame size = 32
G_M60593_IG01: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
- sub rsp, 40
- ;; bbWeight=1 PerfScore 0.25
+ push rsi
+ sub rsp, 32
+ ;; bbWeight=1 PerfScore 1.25
G_M60593_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
mov rcx, 0xD1FFAB1E
mov rcx, gword ptr [rcx]
@@ -21,17 +22,21 @@ G_M60593_IG02: ; gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
call hackishModuleName:hackishMethodName()
; gcrRegs -[rcx]
; gcr arg pop 0
- mov rcx, 0xD1FFAB1E
+ vcvttsd2si rsi, qword ptr [reloc @RWD00]
+ mov rcx, rsi
call hackishModuleName:hackishMethodName()
; gcr arg pop 0
- mov rax, 0xD1FFAB1E
- ;; bbWeight=1 PerfScore 4.75
+ mov rax, rsi
+ ;; bbWeight=1 PerfScore 11.75
G_M60593_IG03: ; , epilog, nogc, extend
- add rsp, 40
+ add rsp, 32
+ pop rsi
ret
- ;; bbWeight=1 PerfScore 1.25
+ ;; bbWeight=1 PerfScore 1.75
+RWD00 dq FFF8000000000000h ; -nan(ind)
+
-; Total bytes of code 52, prolog size 4, PerfScore 11.45, instruction count 9, allocated bytes for code 52 (MethodHash=1dad134e) for method ConvTest:test4():long
+; Total bytes of code 49, prolog size 5, PerfScore 19.65, instruction count 12, allocated bytes for code 49 (MethodHash=1dad134e) for method ConvTest:test4():long This is another case of no folding in VN: -N002 [000003] CAST => $180 {LngCns: 0x8000000000000000}
-N003 [000004] LCL_VAR V01 tmp1 d:1 => $180 {LngCns: 0x8000000000000000}
-N004 [000005] ASG => $180 {LngCns: 0x8000000000000000}
+N002 [000003] CAST => $180 {Cast($140, $41)}
+N003 [000004] LCL_VAR V01 tmp1 d:1 => $180 {Cast($140, $41)}
+N004 [000005] ASG => $180 {Cast($140, $41)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for your contribution!
Where they originally were.
@SingleAccretion - do you mind taking a look into failures to make sure they are not related to your changes? |
@kunalspathak yes of course. Unfortunately the CI is not in the greatest of shapes right now :(.
This looks like #47444. Seeing as this is a rare failure, maybe we can extract some useful information from this run...?
A rich assortment to be sure... "Real" failures seem to be HTTP one, and the interpreter one, though I find it hard to see how they could be caused by this change (note as well that it passed CI on first run, and no code changes have been committed since). |
With #51440 merged, it is now easy to do the intended second part of that PR and disable folding of all implementation-defined
float/double -> integer
casts. This is a temporary solution to #47478, but without it I will not be able to move forward with the casts work as a lot of this has so far been masked by the fact that problematic conversions were being morphed into calls and thus not getting folded.For the diffs, the expectation is that there will be some regressions in
coreclr_tests
.Windows x64
Windows x86
Linux x64
Linux ARM64
Linux ARM
Which the replays confirm.
Closes #51346.
Closes #47374.