Enable fake hot/cold splitting on ARM64 #70708
- Remove NYIs/control flow preventing code splitting on ARM64.
- Update `emitter::emitIns_J()` to keep jumps between hot/cold sections long.
- Update `emitter::emitOutputLJ()` to emit long jumps for both conditional and unconditional branches between hot/cold sections, and report relocations to the runtime.
- Update long `ldr` pseudoinstruction to instead use `ld1` instruction when loading 16-byte constants into vector registers; `ldr` implementation temporarily loads the constant into a general integer register, which does not support 16-byte values.
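For readers unfamiliar with why branch length matters here: on ARM64, `B.cond` encodes a signed 19-bit instruction offset (about ±1 MiB) and `B` encodes a signed 26-bit instruction offset (about ±128 MiB). The sketch below only illustrates that range check. The helper names and the `mustStayLong` policy are hypothetical and are not taken from the emitter; the stated reason for keeping cross-section jumps long (the final distance is not known when the jump is sized) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `value` fits in a signed two's-complement field of `bits` bits.
inline bool fitsInSignedBits(int64_t value, unsigned bits)
{
    const int64_t limit = int64_t{1} << (bits - 1);
    return value >= -limit && value < limit;
}

// ARM64 branch immediates count 4-byte instructions, so the byte distance
// must be a multiple of 4 and (distance / 4) must fit in the immediate field.
inline bool fitsCondBranch(int64_t byteDistance) // B.cond: 19-bit immediate
{
    return (byteDistance % 4 == 0) && fitsInSignedBits(byteDistance / 4, 19);
}

inline bool fitsUncondBranch(int64_t byteDistance) // B: 26-bit immediate
{
    return (byteDistance % 4 == 0) && fitsInSignedBits(byteDistance / 4, 26);
}

// Hypothetical policy (an assumption, not the emitter's code): a jump that
// crosses the hot/cold boundary stays long regardless of its apparent
// distance; other jumps stay long only if the short form cannot reach.
inline bool mustStayLong(bool crossesHotCold, int64_t byteDistance)
{
    return crossesHotCold || !fitsCondBranch(byteDistance);
}
```

With this model, an 8-byte hop inside one section can use the short form, while the same 8-byte hop across the hot/cold boundary is still treated as long.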
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details: This PR enables fake hot/cold splitting on ARM64 via `COMPlus_JitFakeProcedureSplitting`, and fixes various bugs exposed by testing with fake-splitting.
Fake hot/cold splitting requires we fake unwind info by treating each split function as one hot section. A more architecture-agnostic approach for this has been applied. However, unwind info generated for cold sections has yet to be tested, as this depends on VM support for hot/cold splitting.
/azp run runtime-jit-experimental
Azure Pipelines successfully started running 1 pipeline(s).
This commit contains fixes for various bugs exposed by enabling fake hot/cold splitting on ARM64:
- Branches between hot/cold sections are now always long.
- The pseudoinstruction for loading a constant from the cold section did not support loading 16-byte data into vector registers, as it temporarily loaded the constant into an 8-byte integer register. Now, 16-byte constants are loaded directly into vector registers via an `ld1` instruction.
- Tests involving loading 16-byte constants exposed the data section is not always aligned to its largest constant. Now, the data section is always aligned to `emitConsDsc.alignment` when calling `eeAllocMem`.
- Asserts/NYIs blocking hot/cold splitting on ARM64 have been removed.

Fake hot/cold splitting requires we fake unwind info by treating each split function as one hot section. A more architecture-agnostic approach for this has been applied.
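To make the alignment bullet concrete, here is a minimal sketch of the invariant involved: the data section needs the alignment of its most strictly aligned constant, and the allocated address has to satisfy it. The helper names are hypothetical; this is not the code behind `emitConsDsc.alignment` or `eeAllocMem`, only an illustration of the check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper: the section's alignment is the largest alignment
// required by any constant stored in it (1 if the list is empty).
inline size_t requiredSectionAlignment(std::initializer_list<size_t> constantAlignments)
{
    size_t result = 1;
    for (size_t a : constantAlignments)
    {
        if (a > result)
        {
            result = a;
        }
    }
    return result;
}

// Returns true if `address` is a multiple of `alignment` (a power of two).
inline bool isAligned(uintptr_t address, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (alignment - 1)) == 0;
}
```

For example, a section holding 4-, 8-, and 16-byte constants needs 16-byte alignment, so a section base such as `0x1008` would be acceptable for the 8-byte constant but not for the 16-byte one.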
de27dac to d2bbed8
/azp run runtime-jit-experimental
Azure Pipelines successfully started running 1 pipeline(s).
Closing to prevent re-running checks.
…d/runtime into code-splitting-arm
The newly-introduced `emitRemoveJumpToNextInst` optimization caused a regression when hot/cold-splitting, where jumps from the last hot instruction to the first cold instruction were erroneously removed. This is fixed by disabling the `isRemovableJmpCandidate` flag for branches between hot/cold sections. On an unrelated note, a JIT dump message has been added to indicate stress-splitting is occurring.
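The regression can be pictured with a toy model. The sketch below is not the JIT's `emitRemoveJumpToNextInst`; the `Instr` type and `removableJumps` function are hypothetical and only illustrate the rule that a jump to the next instruction is removable unless it crosses the hot/cold boundary (where, with real splitting, the two sections would not be adjacent in memory, so falling through would be wrong).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: either a jump to an instruction index, or anything else.
struct Instr
{
    bool   isJump;
    size_t target; // index of the jump target; ignored unless isJump
};

// Returns the indices of jumps that can be deleted.
// `firstColdIndex` is the index of the first instruction in the cold section
// (pass code.size() when the method is not split).
inline std::vector<size_t> removableJumps(const std::vector<Instr>& code, size_t firstColdIndex)
{
    std::vector<size_t> result;
    for (size_t i = 0; i < code.size(); i++)
    {
        if (!code[i].isJump)
        {
            continue;
        }
        assert(code[i].target < code.size());

        const bool jumpsToNext    = (code[i].target == i + 1);
        const bool crossesHotCold = (i >= firstColdIndex) != (code[i].target >= firstColdIndex);

        // A jump from the last hot instruction to the first cold instruction
        // looks like a jump-to-next, but it must be kept.
        if (jumpsToNext && !crossesHotCold)
        {
            result.push_back(i);
        }
    }
    return result;
}
```

In a four-instruction method where instructions 1 and 2 both jump to their successor, both are removable when the method is not split, but only instruction 1 is removable when the cold section starts at index 3.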
…d/runtime into code-splitting-arm
I've added commits that update the fake-splitting implementation to place the read-only data section after the cold section on ARM64. This allows the hot/cold sections to be truly contiguous -- previously, the read-only data section was placed after the hot section. This placement should better facilitate generating fake unwind info, thus fixing stack walks when fake-splitting.
/azp run runtime-jit-experimental
Azure Pipelines successfully started running 1 pipeline(s).
c793739 to eae7ee5
I realized my implementation for aligning the data section when fake-splitting is overzealous, and simply tweaking the current implementation to align to @BruceForstall PTAL. I noticed
/azp run runtime-jit-experimental
Edit: Checks didn't trigger for some reason... I'll wait to restart them after addressing feedback.
Looks like a (bad) recent regression. I opened #71023.
/azp run runtime
No pipelines are associated with this pull request.
Azure Pipelines successfully started running 1 pipeline(s).
To facilitate generating unwind info, fake-splitting now places the read-only data section after the cold section. This allows the hot/cold code sections to be truly contiguous.
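As a rough picture of the layout this commit describes, the sketch below computes offsets for a single allocation in which the cold code immediately follows the hot code and the read-only data comes last. The struct, the function name, and the padding of the read-only data to its own alignment are assumptions made for illustration; they are not the JIT's allocation code.

```cpp
#include <cassert>
#include <cstddef>

// Offsets of each section within one contiguous allocation.
struct FakeSplitLayout
{
    size_t hotOffset;
    size_t coldOffset;
    size_t roDataOffset;
    size_t totalSize;
};

// Hypothetical layout: hot code, then cold code with no gap, then read-only
// data rounded up to `roDataAlign` (which must be a power of two).
inline FakeSplitLayout layoutFakeSplit(size_t hotSize, size_t coldSize, size_t roDataSize, size_t roDataAlign)
{
    assert(roDataAlign != 0 && (roDataAlign & (roDataAlign - 1)) == 0);

    FakeSplitLayout layout;
    layout.hotOffset    = 0;
    layout.coldOffset   = hotSize; // cold follows hot directly, so the code is contiguous
    layout.roDataOffset = (hotSize + coldSize + roDataAlign - 1) & ~(roDataAlign - 1);
    layout.totalSize    = layout.roDataOffset + roDataSize;
    return layout;
}
```

Because nothing sits between the hot and cold code in this arrangement, the pair can be described as one code region, which is the property the fake unwind info relies on.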
7be19a6 to 8ed9046
@BruceForstall PTAL at
I like this direction. Unfortunately, and I apologize, this change is going to clash with my change #71044. Can we wait until that is merged (today), and then review this? You'll need to pick up its changes into this.
No worries!
@BruceForstall I merged your changes from #71044 in, and added a guard to disable splitting on LoongArch64. Thanks for the ARM64 alignment work -- I recall hitting some related asserts when trying to load 16-byte constants a few weeks ago...
/azp run runtime-jit-experimental
Azure Pipelines successfully started running 1 pipeline(s).
LGTM
Merging this PR for now. Will submit a follow-up PR disabling hot/cold splitting when
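This PR enables fake hot/cold splitting on ARM64 via `COMPlus_JitFakeProcedureSplitting`, and fixes various bugs exposed by testing with fake-splitting:
- Branches between hot/cold sections are now always long.
- The pseudoinstruction for loading a constant from the cold section did not support loading 16-byte data into vector registers, as it temporarily loaded the constant into an 8-byte integer register. Now, 16-byte constants are loaded directly into vector registers via an `ld1` instruction.
- Tests involving loading 16-byte constants exposed that the data section is not always aligned to its largest constant. Now, the data section is always aligned to `emitConsDsc.alignment` when calling `eeAllocMem`.
- Asserts/NYIs blocking hot/cold splitting on ARM64 have been removed.

Fake hot/cold splitting requires we fake unwind info by treating each split function as one hot section. A more architecture-agnostic approach for this has been applied. However, unwind info generated for cold sections has yet to be tested, as this depends on VM support for hot/cold splitting.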