-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AArch64: Fix P register save/restore on 128-bit vector length systems #6760
Comments
AssadHashmi
changed the title
AArch64: Use smaller data types for SVE predicate and FFR registers
AArch64: use smaller data types for SVE predicate and FFR registers
Apr 5, 2024
jackgallagher-arm
added a commit
that referenced
this issue
Apr 12, 2024
PR #6757 fixed the way we read/write SVE register slots but unfortunately it is now broken on systems with 128-bit vector length. Both SVE vector and predicate registers use dr_simd_t slots which is a 64-byte type meant to store up to 512-bit vector registers. SVE predicate registers are always 1/8 the size of the vector register so for 512-bit vector length systems we only really need 64 / 8 = 8 bytes to store predicate registers. The ldr/str instructions we use to read and write the predicate register slots have a base+offset memory operand where the offset is a value in the range [-256, 255] scaled based by predicate register length. We read and write the registers by setting the base address to the address of the first slot, and setting the offset to n * sizeof(dr_simd_t) for each register Pn. For systems with 128-bit vector length, this means the predicate registers are 16 / 8 = 2 bytes so the maximum offset we can reach is 2 * 255 = 510 bytes. This means on 128-bit VL systems we can only reach the first 8 predicate registers (8 * sizeof(dr_simd_t) = 512). By changing the predicate register and FFR slots to use a new type dr_svep_t which is 1/8 the size of dr_simd_t we can fix this bug and save space. dr_svep_t is currently 8 bytes to correspond to 64 byte vectors, but even if we extend DynamoRIO to support the maximum SVE vector length of 2048-bits (256 bytes) dr_svep_t will only need to be increased to 256 / 8 = 32 bytes so the maximum offset (15 * 32 = 480 bytes) will always be in range. As this changes the size of the predicate register and FFR slots, this changes the size of the dr_mcontext_t structure and breaks backwards compatibility with earlier versions of DynamoRIO so the version number is increased to 10.90. Issues: #6760, #5365 Fixes: #6760
jackgallagher-arm
changed the title
AArch64: use smaller data types for SVE predicate and FFR registers
AArch64: Fix P register save/restore on 128-bit vector length systems
Apr 15, 2024
jackgallagher-arm
added a commit
that referenced
this issue
Apr 15, 2024
…6774) PR #6757 fixed the way we read/write SVE register slots but unfortunately it is now broken on systems with 128-bit vector length. Both SVE vector and predicate registers use dr_simd_t slots which is a 64-byte type meant to store up to 512-bit vector registers. SVE predicate registers are always 1/8 the size of the vector register so for 512-bit vector length systems we only really need 64 / 8 = 8 bytes to store predicate registers. The ldr/str instructions we use to read and write the predicate register slots have a base+offset memory operand where the offset is a value in the range [-256, 255] scaled based by predicate register length. We read and write the registers by setting the base address to the address of the first slot, and setting the offset to n * sizeof(dr_simd_t) for each register Pn. For systems with 128-bit vector length, this means the predicate registers are 16 / 8 = 2 bytes so the maximum offset we can reach is 2 * 255 = 510 bytes. This means on 128-bit VL systems we can only reach the first 8 predicate registers (8 * sizeof(dr_simd_t) = 512). By changing the predicate register and FFR slots to use a new type dr_svep_t which is 1/8 the size of dr_simd_t we can fix this bug and save space. dr_svep_t is currently 8 bytes to correspond to 64 byte vectors, but even if we extend DynamoRIO to support the maximum SVE vector length of 2048-bits (256 bytes) dr_svep_t will only need to be increased to 256 / 8 = 32 bytes so the maximum offset (15 * 32 = 480 bytes) will always be in range. As this changes the size of the predicate register and FFR slots, this changes the size of the dr_mcontext_t structure and breaks backwards compatibility with earlier versions of DynamoRIO so the version number is increased to 10.90. Issues: #6760, #5365 Fixes: #6760
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Currently predicate and FFR state is stored in
dr_simd_t
.This wastes space as predicates and FFR are an 1/8th of the size of the implementation's scalable vector register length.
Issue raised during review of #6757.
SVE master issue #5365.
The text was updated successfully, but these errors were encountered: