-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RyuJIT's loop cloning optimization has questionable CQ #4929
Labels
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
enhancement
Product code improvement that does NOT require public API changes/additions
optimization
tenet-performance
Performance related issue
Milestone
Comments
Reposting the original example (with updated assembly) to make this easier to read. static void Copy(int[] src, int[] dst, int length)
{
for (int i = 0; i < length; i++)
dst[i] = src[i];
} generates G_M1752_IG01:
4883EC28 sub rsp, 40
G_M1752_IG02:
33C0 xor eax, eax
; check if length is > 0, ok
4585C0 test r8d, r8d
7E7A jle SHORT G_M1752_IG06
; check if src and dst are non-null
; not needed because we already know that at least one loop iteration will be executed
; update: of course, this is not need in this particular example. In the general case these
; null checks are needed and it may be difficult to get rid of them.
4885C9 test rcx, rcx
410F95C1 setne r9b
450FB6C9 movzx r9, r9b
4885D2 test rdx, rdx
410F95C2 setne r10b
450FB6D2 movzx r10, r10b
4585CA test r9d, r10d
7441 je SHORT G_M1752_IG05
G_M1752_IG03:
; hoisted bound check, ok
44394108 cmp dword ptr [rcx+8], r8d
410F9DC1 setge r9b
450FB6C9 movzx r9, r9b
; check if length is >= 0, dubious
; the length was already checked at the start of the method
; previously this was `test r8, r8` because loop cloning tends to mix up integer types
; a change done in LowerCompare hides this issue
4585C0 test r8d, r8d
410F9DC2 setge r10b
450FB6D2 movzx r10, r10b
4523CA and r9d, r10d
; hoisted bound check, ok
44394208 cmp dword ptr [rdx+8], r8d
410F9DC2 setge r10b
450FB6D2 movzx r10, r10b
4585CA test r9d, r10d
7416 je SHORT G_M1752_IG05
; finally, the loop without range checks
G_M1752_IG04:
4C63C8 movsxd r9, eax
468B548910 mov r10d, dword ptr [rcx+4*r9+16]
4689548A10 mov dword ptr [rdx+4*r9+16], r10d
FFC0 inc eax
413BC0 cmp eax, r8d
7CEC jl SHORT G_M1752_IG04
; it's unfortunate that the loop that is normally executed is the one
; requiring a jmp, IG04 and IG05 should be reversed and IG05
; should probably be "run rarely"
EB1E jmp SHORT G_M1752_IG06
; and the loop with range checks
G_M1752_IG05:
3B4108 cmp eax, dword ptr [rcx+8]
731E jae SHORT G_M1752_IG07
4C63C8 movsxd r9, eax
468B548910 mov r10d, dword ptr [rcx+4*r9+16]
3B4208 cmp eax, dword ptr [rdx+8]
7311 jae SHORT G_M1752_IG07
4689548A10 mov dword ptr [rdx+4*r9+16], r10d
FFC0 inc eax
413BC0 cmp eax, r8d
7CE2 jl SHORT G_M1752_IG05
G_M1752_IG06:
4883C428 add rsp, 40
C3 ret
G_M1752_IG07:
E8D1FD395F call CORINFO_HELP_RNGCHKFAIL
CC int3 |
A few more things I noticed since creating (or didn't mention in) the original issue:
|
cc @AndyAyersMS You asked about the loop cloning issue, I added a few notes. |
25 tasks
JulieLeeMSFT
added
the
needs-further-triage
Issue has been initially triaged, but needs deeper consideration or reconsideration
label
Mar 23, 2021
JulieLeeMSFT
removed
the
needs-further-triage
Issue has been initially triaged, but needs deeper consideration or reconsideration
label
Jun 7, 2021
5 tasks
@BruceForstall should this be moved to future? Or are the issues above mostly handled now? |
They're mostly handled, so I'll go ahead and close this. |
ghost
locked as resolved and limited conversation to collaborators
May 28, 2022
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
area-CodeGen-coreclr
CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
enhancement
Product code improvement that does NOT require public API changes/additions
optimization
tenet-performance
Performance related issue
See CQ comments at the top of #4922.
category:cq
theme:loop-opt
skill-level:expert
cost:large
The text was updated successfully, but these errors were encountered: