Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle more than 64 registers - Part 3 #102592

Merged
merged 44 commits into from
May 24, 2024

Conversation

kunalspathak
Copy link
Member

@kunalspathak kunalspathak commented May 23, 2024

  • This PR basically adds a high field in regMaskTP, but do not use it.
    EDIT: I will do this as a separate PR to isolate TP impact.
  • Introduce a new type SingleTypeRegSet that denotes a 8-byte register set to represent registers from a single type i.e. gpr or float and in future predicated registers.
  • Updated various places in LSRA that handles transition from regMaskTP and SingleTypeRegSet. Basically, all the fields in RefPosition, Interval, RegisterSelection are updated to be SingleTypeRegSet because they can never have mixed register set.

@kunalspathak kunalspathak marked this pull request as ready for review May 24, 2024 17:51
@kunalspathak
Copy link
Member Author

@dotnet/jit-contrib

@kunalspathak
Copy link
Member Author

Diffs

image
image

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me from my limitted understanding

@kunalspathak
Copy link
Member Author

@jakobbotsch - I am building things on top of this PR, so will go ahead and merge. Feel free to give a review once you are back. Thanks @EgorBo !

@kunalspathak kunalspathak merged commit ce1477b into dotnet:main May 24, 2024
105 of 107 checks passed
Comment on lines +359 to +365
static regMaskTP operator<<(regMaskTP& first, const int b)
{
regMaskTP result(first.getLow() << b);
return result;
}

static regMaskTP operator>>(regMaskTP& first, const int b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These shouldn't be taking first by (mutable) reference since they aren't the compound assignment versions. I'd suggest we just make it by value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I changed bunch of these but forgot to change for << and >>. Will fix in next PR.

@@ -2493,7 +2498,7 @@ class RefPosition
// Prior to the allocation pass, registerAssignment captures the valid registers
// for this RefPosition.
// After the allocation pass, this contains the actual assignment
regMaskTP registerAssignment;
SingleTypeRegSet registerAssignment;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can contain multiple different register types for RefTypeKill. I think there are plenty of unused fields in RefPosition for RefTypeKill to fit a union that has a full regMaskTP.

Is it possible to double check that the other changed register sets in this PR don't also suffer from this problem? For example, some asserts somewhere to validate that SingleTypeRegSet don't have both float and int registers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can contain multiple different register types for RefTypeKill

I realized that while working on this PR, but wanted to have minimal changes to keep things unblocked. I will run into this problem anyway when I add predicate registers, because now, the registerAssignment for RefTypeKill need to technically hold more than 64 registers. One of the option as you suggested is to union some of the unused fields for RefTypeKill or convert SingleTypeRegSet to 64RegsSet data type (same concept but different name to identify that former just had register set of one type vs. the newer one has 64 registers in a set of mixed type), but with the later approach, we might end up creating 2 RefTypeKill RefPosition - first for gpr/float and second for predicate. I will see which one is better approach and go with the minimalistic change needed at this point to unblock the predicate register implementation.

Is it possible to double check that the other changed register sets in this PR don't also suffer from this problem

There may be and the plan is to fix them up when I add predicate register, because then we will have to differentiate between which register type we are touching.

@MichalStrehovsky
Copy link
Member

This change broke nativeaot outerloop runs on arm32: #102715.

Comment on lines -11259 to +11261
regMaskTP rbmAllFloat;
regMaskTP rbmFltCalleeTrash;
unsigned cntCalleeTrashFloat;
SingleTypeRegSet rbmAllFloat;
SingleTypeRegSet rbmFltCalleeTrash;
unsigned cntCalleeTrashFloat;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these changes? I am not sure I'm sold that these cannot just stay as regMaskTP.

@@ -742,15 +739,15 @@ inline bool floatRegCanHoldType(regNumber reg, var_types type)

extern const regMaskSmall regMasks[REG_COUNT];

inline regMaskTP genRegMask(regNumber reg)
inline SingleTypeRegSet genRegMask(regNumber reg)
Copy link
Member

@jakobbotsch jakobbotsch May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sold on changing this from regMaskTP to SingleTypeRegSet. It is going to introduce ambiguity in the future about what register is being referred to.

As we discussed, I think we should isolate this handling to LSRA. LSRA can have its own context-dependent register sets where the first GP, float or predicate register all get mapped to 0, while the rest of the JIT can keep using the large register sets. LSRA can handle mapping these back to regNumber or regMaskTP (in the rare cases) to interact with the rest of the JIT.

For instrDesc we can stop storing regMaskTP in instrDescCGCA and just have an accessor method that maps the internal storage back to regMaskTP.

In my opinion, SingleTypeRegSet should be defined inside LSRA code (e.g. lsra.h or lsra.cpp) and its use should be isolated to inside LSRA.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, addressed in #102783

Ruihan-Yin pushed a commit to Ruihan-Yin/runtime that referenced this pull request May 30, 2024
* Add `high` in regMaskTP

* Introduce SingleTypeRegSet

* Use SingleTypeRegSet in few places

* Delete some methods in regMaskTP

* Delete some more methods of regMaskTP

* Fix actualRegistersMask

* Use SingleTypeRegSet in consecutive register code

* Use SingleTypeRegSet in consecutive registers codebase

* Change genRegMask*() method to return SingleTypeRegSet

* wip

* another wip

* Everything except newRefPosition/killMask

* refactor code around buildkill

* fix build errors

* some more errors

* jit format

* fixed build error for arm64

* REVERT: temporary add #ifdef TARGET_ARM64 for accessing regMaskTP methods

* forgot to add the new file

* make addRegsForKill only on low

* jit format

* Revert "REVERT: temporary add #ifdef TARGET_ARM64 for accessing regMaskTP methods"

This reverts commit 325bc6e.

* Various fixes after merge

* passing arm64 build

* clrjit build works

* clrjit_universal_arm_x64 build works

* clrjit_unix_x64_x64 build works

* clrjit_win_x86_x64 build works

* fix a bug in size

* delete unwanted method

* jit format

* Remove high

* Continue using regMaskTP for NodeInternalRegisters

* Pass regType to getConstrainedRegMask()

* jit format

* fix a wrong parameter for consecutive register

* fix riscv64 build errors

* jit format
@github-actions github-actions bot locked and limited conversation to collaborators Jun 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants