-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Remove unneeded unconditional jumps #68119
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue Detailsfixes #4326 When compilng some simple methods a jmp instruction is generated that ends up jumping to the next instruction which is not needed. The example provided is: public static int Main(string[] args)
{
// first test case
return Math.Max(args.Length, 42);
} This produces the assembly: G_M22359_IG01: ;; offset=0000H
;; size=0 bbWeight=1 PerfScore 0.00
G_M22359_IG02: ;; offset=0000H
8B4108 mov eax, dword ptr [rcx+8]
83F82A cmp eax, 42
7D07 jge SHORT G_M22359_IG04
;; size=8 bbWeight=1 PerfScore 3.25
G_M22359_IG03: ;; offset=0008H
B82A000000 mov eax, 42
+ EB00 jmp SHORT G_M22359_IG04
;; size=7 bbWeight=0.18 PerfScore 0.41
G_M22359_IG04: ;; offset=000FH
C3 ret where you can see that the jmp to The jump is added in the emitter phase and cannot be seen before that so the changes i have made are all very late in the compilation process. The jump exists because a basic block between the blocks that create instruction groups 3 and 4 generates no instructions. Emission of instructions is linear so at the time when IG03 is generated there is a jump present from BB03 to BB05 and this causes an unconditional jump to be generated at the end of IG03. When BB04 is processed it generates no instructions because there is no need anything and the process moves on to BB05 (which will generate IG04) but we're left with a weird looking jump. The unneeded jumps can be identified once instruction groups have been generated but this must be done before label sizes and alignment are calculated which gives a small window of opportunity. I chose to integrate this process into the
if these conditions are met the jump is removed for the emitter jump list and the instruction group then the instruction groups is flagged as needing a size recalculation and as having an update instruction count. Once this is done the remaining jumps are sized and updated as normal. The second change needs to be done at the codegen level and it involves updating any live ranges which include the instruction which has been removed. I have added a new function Once all that is done the unconditional jumps are removed. It looks like they get removed from some very basic and commonly used functions which end up bieng inlined almost everywhere. The spmi replays are clean and the asm diffs are surprising.
all regressions are caused by alignment changes. benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.crossgen2.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
/cc @jit-contrib
|
The OSX failures looks related but I have no way to debug on that platform, not sure about the other failures, |
@@ -277,6 +278,9 @@ struct insGroup | |||
// and the emitter should continue to track GC info as if there was no new block. | |||
#define IGF_HAS_ALIGN 0x0400 // this group contains an alignment instruction(s) at the end to align either the next | |||
// IG, or, if this IG contains with an unconditional branch, some subsequent IG. | |||
#define IGF_UPD_ICOUNT \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More generally, this will invalidate any and all emitLocation
s captured for the IG before the update. They're also used for USING_SCOPE_INFO
and unwinding, I wonder if the latter could explain the CI failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be strange if that were the case only on osx which is the only pattern I can see, the other failures are weird things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, the CI is red everywhere, e. g. in coreclr Pri0 Runtime Tests Run windows x64 checked failed
we see:
JIT\opt\Inline\tests\Inline_DetectChanges\Inline_DetectChanges.cmd [FAIL]
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
at Program.Foo(Object o)
at Program.Main()
Which does look a lot like an unwinding failure (-- EH code getting confused and not catching the exception). But this is just a guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I tried to change things I'd be guessing too. What I've done is what's is needed to achieve the desired result and satisfy the existing sanity checking assertions inside the runtime. If more needs to be done around liveness tracking then I'll need some direction from a person that knows how it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done some experimenting and I can't see a cause or change the outcome. I've tried excluding prolog and epilog from removal and it has no effect. In discussions on discord we talked about whether the unwinder is being messed up somehow but I don't see how it can be when the unwind info is generated after all the assembly has been generated. Similarly any error handling has been generated into funclet instruction groups so there are no ranges to worry about. The jumps which are being removed are added after each basic block has been emitted into an IG so there shouldn't be any scope for liveness failures or registers being changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried taking a spare bit in the instrDesc
structure and setting it only on jmps emitted at the end of always jump marked basic blocks. Then used that to select if the jump can be removed so that nothing can be accidentally removed in finally blocks etc. It still breaks. I'm unable to continue, I have no nowhere to go. The exception handling seems to be broken by this change somehow but I don't know how or where without informed guidance. It's clear that there are benefits to this imo but I can't complete it from where it is currently.
Closing because I can't get exception handling to function with this. |
fixes #4326
When compilng some simple methods a jmp instruction is generated that ends up jumping to the next instruction which is not needed. The example provided is:
This produces the assembly:
G_M22359_IG01: ;; offset=0000H ;; size=0 bbWeight=1 PerfScore 0.00 G_M22359_IG02: ;; offset=0000H 8B4108 mov eax, dword ptr [rcx+8] 83F82A cmp eax, 42 7D07 jge SHORT G_M22359_IG04 ;; size=8 bbWeight=1 PerfScore 3.25 G_M22359_IG03: ;; offset=0008H B82A000000 mov eax, 42 + EB00 jmp SHORT G_M22359_IG04 ;; size=7 bbWeight=0.18 PerfScore 0.41 G_M22359_IG04: ;; offset=000FH C3 ret
where you can see that the jmp to
SHORT G_M22359_IG04
is not needed because that is the next instruction, we will save 2 bytes and possibly some cpu jump tracking resources by not including the jump.The jump is added in the emitter phase and cannot be seen before that so the changes i have made are all very late in the compilation process. The jump exists because a basic block between the blocks that create instruction groups 3 and 4 generates no instructions. Emission of instructions is linear so at the time when IG03 is generated there is a jump present from BB03 to BB05 and this causes an unconditional jump to be generated at the end of IG03. When BB04 is processed it generates no instructions because there is no need anything and the process moves on to BB05 (which will generate IG04) but we're left with a weird looking jump.
The unneeded jumps can be identified once instruction groups have been generated but this must be done before label sizes and alignment are calculated which gives a small window of opportunity. I chose to integrate this process into the
emitter::emitJumpDistBind()
function because it was correctly placed to have all the information needed and already contained logic for looping around the jumps and adjusting the size of instructions (but not groups). I have added a loop around each jump in the emitter jump list which attempts to identify if the jump isif these conditions are met the jump is removed for the emitter jump list and the instruction group then the instruction groups is flagged as needing a size recalculation and as having an update instruction count. Once this is done the remaining jumps are sized and updated as normal.
The second change needs to be done at the codegen level and it involves updating any live ranges which include the instruction which has been removed. I have added a new function
genUpdateLiveRangesForTruncatedIGs
which is run afteremitJumpDistBind
where we know that all instructions that are going to be removed have been and beforeemitLoopAlignAdjustments
which will check alignments and may need to emit additional nops into the instruction groups. The live range checks need a new count accessor function because they haven't been finalized yet at the point when the count is needed.Once all that is done the unconditional jumps are removed. It looks like they get removed from some very basic and commonly used functions which end up bieng inlined almost everywhere. The spmi replays are clean and the asm diffs are surprising.
all regressions are caused by alignment changes.
benchmarks.run.windows.x64.checked.mch:
Detail diffs
coreclr_tests.pmi.windows.x64.checked.mch:
Detail diffs
libraries.crossgen2.windows.x64.checked.mch:
Detail diffs
libraries.pmi.windows.x64.checked.mch:
Detail diffs
libraries_tests.pmi.windows.x64.checked.mch:
Detail diffs
/cc @jit-contrib