i#3044: Add a type to represent aarch64 vectors #5681

joshua-warburton · 2022-10-11T09:16:27Z

Previously, the IR represented vectors as a plain H, S, D or Q register combined with a faux immediate, added after the last vector operand, to hint at its size.

This patch uses the .size field of the opnd struct to hold the element size, similar to how partial registers are represented for x86. Element Vector registers are distinguished from partial registers by setting the DR_OPND_IS_VECTOR (0x40) bit in the .flag field. The same bit is also used by the DR_OPND_IS_EXTEND flag for immediates; it was chosen so as not to extend the size of the flag field while remaining unambiguous.

The new Element Vector operand is printed like z0.b via a check in disassemble_shared.c that sets a suffix. Also added are the utility functions opnd_is_element_vector_reg, opnd_create_reg_element_vector and opnd_get_vector_element_size.

issue: #3044
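The scheme described above (a size field that holds the element size, a flag bit that marks the operand as an Element Vector, and a suffix chosen at print time) can be sketched roughly as follows. This is a standalone model, not DynamoRIO code: the `sketch_*` types, size codes, and functions are hypothetical stand-ins, and only the 0x40 flag value and the `z0.b` output format come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins only: NOT DynamoRIO's real opnd_t or constants. */
#define SKETCH_OPND_IS_VECTOR 0x40 /* flag bit value quoted in the description */

/* Hypothetical size codes; the real IR uses its own operand-size constants. */
typedef enum {
    SKETCH_SZ_B,
    SKETCH_SZ_H,
    SKETCH_SZ_S,
    SKETCH_SZ_D,
    SKETCH_SZ_FULL
} sketch_size_t;

typedef struct {
    const char *reg_name; /* e.g. "z0" */
    unsigned char flags;
    sketch_size_t size; /* element size when the vector flag is set */
} sketch_opnd_t;

/* Plain register operand: no element-size qualification. */
static sketch_opnd_t
sketch_create_reg(const char *reg_name)
{
    sketch_opnd_t op = { reg_name, 0, SKETCH_SZ_FULL };
    return op;
}

/* Element Vector operand: the flag bit tells readers how to interpret size. */
static sketch_opnd_t
sketch_create_reg_element_vector(const char *reg_name, sketch_size_t element_size)
{
    sketch_opnd_t op = { reg_name, SKETCH_OPND_IS_VECTOR, element_size };
    return op;
}

static bool
sketch_is_element_vector_reg(sketch_opnd_t op)
{
    return (op.flags & SKETCH_OPND_IS_VECTOR) != 0;
}

/* Appends ".b", ".h", ".s" or ".d" only for Element Vector operands. */
static int
sketch_print_reg(char *buf, size_t buflen, sketch_opnd_t op)
{
    static const char *const suffix[] = { ".b", ".h", ".s", ".d" };
    if (sketch_is_element_vector_reg(op) && op.size <= SKETCH_SZ_D)
        return snprintf(buf, buflen, "%s%s", op.reg_name, suffix[op.size]);
    return snprintf(buf, buflen, "%s", op.reg_name);
}

/* Small self-check showing the intended behaviour of the model. */
static void
sketch_check_printing(void)
{
    char buf[16];
    sketch_opnd_t v = sketch_create_reg_element_vector("z0", SKETCH_SZ_B);
    assert(sketch_print_reg(buf, sizeof(buf), v) == 4); /* length of "z0.b" */
    assert(sketch_print_reg(buf, sizeof(buf), sketch_create_reg("z0")) == 2);
}
```

In this model a single size field serves two meanings, selected by the flag bit, which is the property the later discussion in this thread questions.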

AssadHashmi · 2022-10-11T11:07:29Z

Hi @derekbruening, I'm happy with this patch to improve the vector<->element binding for AArch64 SVE instructions. Let us know if you have any objections. We would like this to be the format from SVE onwards. Retro-fitting to v8.0/1/2 SIMD (NEON) may happen later.

derekbruening · 2022-10-11T19:16:26Z

Hmm, I don't think we've discussed element sizes before: the #3044 discussions were about the mcontext fields.

IIRC, previously the element size was not in the IR at all (for any arch). The size field could be smaller than the register capacity but that meant a sub-reg operand where other bits were left unchanged. Tools that needed the element size (e.g., Dr. Memory) dispatched on opcode.

For the proposal in this PR, my question is: what about an operand that needs both an element size and a sub-register size? IIRC some of the x86 vector operations leave some bits unchanged, and I think the x86 decoder marks those as sub-register. (I assume these are not just leaving exactly half unchanged, as there we could use the smaller vector register name? Or maybe we go with the name in the ISA convention because other operands are on that full size even when it's half... I might have to refresh my memory.) Maybe this doesn't happen for SVE? But if we wanted to adopt this for other ISAs it could be an issue there. I'm not sure what the solution is, though, other than finding other bits to store the element size.

Maybe it is worthwhile to take a step back and see whether the sub-reg size is always useful: is it always the bottom bits that are used and the top that are unused? If not, then tools have to dispatch on opcode anyway. Are there identical opcodes that differ only by sub-reg size, so that a dispatch is not possible?

derekbruening · 2022-10-11T22:44:58Z

Looking for more opinions: @abhinav92003

joshua-warburton · 2022-10-12T14:20:25Z

For context, I do not believe partial registers are used anywhere in aarch64.

derekbruening · 2022-10-12T19:24:37Z

An x86 example:
AMD's VFMADDSS instruction takes the lower 32 bits of several source XMM registers and performs a fused multiply-add on them. In the destination the top bits are zeroed, but for the sources the top bits are unchanged.
So DR says the sources are XMM registers with size OPSZ_4_of_16.

derekbruening · 2022-10-13T18:30:58Z

If we don't want to increase the size of the passed-by-value opnd_t, can we fit the vector element size in currently-unused-for-register-types bits? The vector element size should be just one of a handful of possibilities. Then we can support both element size and total size. I guess we'd need new accessors to get the element size.

joshua-warburton · 2022-10-14T12:00:08Z

There doesn't appear to be anything that the element size could be union'd with to fulfil those requirements.

However, there is likely some padding space after the first union: kind (8) + size (8) + union of shorts (16) leaves another 32 bits before the 64-bit-aligned second union. Putting an element size there wouldn't increase the overall size of the struct.
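The padding argument can be checked with a small standalone layout experiment. The structs below are hypothetical and only mirror the field widths named in the comment (8 + 8 + 16 bits, then a 64-bit-aligned member); they are not the real opnd_t, and the 8-byte alignment is forced with `_Alignas` to make that assumption explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; field names are NOT the real opnd_t. */
typedef struct {
    uint8_t kind;
    uint8_t size;
    union {
        uint16_t a;
        uint16_t b;
    } shorts; /* stands in for the "union of shorts" */
    /* 4 bytes of compiler-inserted padding sit here */
    _Alignas(8) uint64_t value; /* stands in for the 64-bit aligned second union */
} sketch_before_t;

typedef struct {
    uint8_t kind;
    uint8_t size;
    union {
        uint16_t a;
        uint16_t b;
    } shorts;
    uint8_t element_size; /* reuses one byte of the former padding */
    _Alignas(8) uint64_t value;
} sketch_after_t;

/* Number of padding bytes between the first union and the aligned member. */
static size_t
sketch_padding_before(void)
{
    return offsetof(sketch_before_t, value) -
        (offsetof(sketch_before_t, shorts) + sizeof(uint16_t));
}

/* Verifies that the extra field lands in padding and the size is unchanged. */
static void
sketch_check_layout(void)
{
    assert(offsetof(sketch_before_t, value) == 8);
    assert(offsetof(sketch_after_t, element_size) == 4);
    assert(offsetof(sketch_after_t, value) == 8);
    assert(sizeof(sketch_before_t) == sizeof(sketch_after_t));
}
```

Under these assumptions the 32 bits mentioned above show up as 4 bytes of padding, and adding a one-byte field there leaves `sizeof` unchanged.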

derekbruening · 2022-10-17T02:36:32Z

For REG_kind for 64-bit pointers, opnd_t.value is only using 16 of its 32 bits (for reg_id_t), so we could add a size for REG_kind there?
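A minimal sketch of that idea: pair the 16-bit register id with an element size inside the existing 32-bit slot, so an operand can carry a total (or sub-register) size and an element size at the same time. All names here, including `reg_and_element_size`, are illustrative assumptions rather than the definitions that were eventually merged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model; NOT the real opnd_t layout or accessor names. */
typedef uint16_t sketch_reg_id_t;

typedef struct {
    uint8_t kind;
    uint8_t size; /* total or sub-register size, independent of element size */
    uint16_t flags;
    union {
        uint32_t other; /* stand-in for the other users of this 32-bit slot */
        struct {
            sketch_reg_id_t reg;  /* uses only 16 of the 32 bits */
            uint8_t element_size; /* fits in the otherwise unused half */
        } reg_and_element_size;
    } value;
} sketch_opnd_t;

static sketch_opnd_t
sketch_create_reg_element_vector(sketch_reg_id_t reg, uint8_t size, uint8_t element_size)
{
    sketch_opnd_t op = { 0 };
    op.size = size;
    op.value.reg_and_element_size.reg = reg;
    op.value.reg_and_element_size.element_size = element_size;
    return op;
}

/* New accessor for the element size; the existing size field is untouched. */
static uint8_t
sketch_get_vector_element_size(sketch_opnd_t op)
{
    return op.value.reg_and_element_size.element_size;
}

static sketch_reg_id_t
sketch_get_reg(sketch_opnd_t op)
{
    return op.value.reg_and_element_size.reg;
}

/* The slot stays 32 bits wide, so the pass-by-value struct does not grow. */
static void
sketch_check_slot_size(void)
{
    assert(sizeof(((sketch_opnd_t *)0)->value) == sizeof(uint32_t));
    assert(sizeof(sketch_opnd_t) == 8);
}
```

Because the size field and the element size live in separate storage in this model, the sub-register case raised earlier (for example OPSZ_4_of_16 on x86) would not collide with an element-size annotation.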

joshua-warburton · 2022-10-17T10:47:42Z

Implemented the above in a new patch.


joshua-warburton · 2022-10-17T16:20:50Z

run arm tests

derekbruening · 2022-10-18T05:58:04Z

Turns out there is an issue filed on adding element widths to x86 SIMD operands: #5638

joshua-warburton · 2022-10-18T10:01:18Z

I believe this is now ready for review, with the suggested change to the storage location @derekbruening @AssadHashmi

AssadHashmi

Looks good. Please add 2 tests to verify the edge cases where an instruction has at least one operand which is a:

Full SVE vector, i.e. no element size qualification.
SIMD/NEON vector (there are SVE instructions which access the lower 128 bits as SIMD/NEON vectors).

Best to avoid memory instruction tests for the above, as they will unnecessarily complicate the patch; any simple arithmetic/logic instructions should be fine.

core/ir/disassemble_shared.c

core/ir/instr_inline_api.h

core/ir/opnd_api.h

core/ir/opnd_shared.c

core/ir/aarch64/instr_create_api.h

core/ir/opnd_api.h

AssadHashmi · 2022-10-19T09:59:22Z

While reviewing the disassembly changes I noticed an error in reg_names[] for SVE register strings. There's a q3 which should be z3:

dynamorio/core/ir/aarch64/encode.c

Line 112 in 1e82e56

"z0", "z1", "z2", "q3", "z4", "z5", "z6", "z7", "z8", "z9",

Please put the fix in this patch, thanks.
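A table typo like this one can be caught mechanically. The helper below is a hypothetical sanity check, not part of DynamoRIO: it assumes a run of names that should each equal a prefix letter followed by the entry's index, and reports the first entry that does not.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the index of the first entry that is not "<prefix><index>",
 * or -1 if every entry matches. Hypothetical helper for illustration. */
static int
sketch_first_bad_name(const char *const names[], int count, char prefix)
{
    char expected[16];
    for (int i = 0; i < count; i++) {
        snprintf(expected, sizeof(expected), "%c%d", prefix, i);
        if (strcmp(names[i], expected) != 0)
            return i;
    }
    return -1;
}

/* Self-check using the first entries of the line quoted above. */
static void
sketch_check_z_names(void)
{
    const char *const quoted[] = { "z0", "z1", "z2", "q3", "z4" };
    const char *const fixed[] = { "z0", "z1", "z2", "z3", "z4" };
    assert(sketch_first_bad_name(quoted, 5, 'z') == 3);
    assert(sketch_first_bad_name(fixed, 5, 'z') == -1);
}
```

Run against the quoted line, the check flags index 3, which is the `q3` entry.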


joshua-warburton · 2022-10-20T16:06:37Z

I believe all of the resolvable review comments are now resolved.


derekbruening · 2022-10-22T14:35:52Z

> Also added are the utility functions opnd_is_element_vector_reg, opnd_create_reg_element_vector and opnd_get_vector_element_size.

Looks like these new functions were not added to the release notes for new features.

joshua-warburton requested review from derekbruening and AssadHashmi October 11, 2022 10:27

joshua-warburton added 6 commits October 17, 2022 14:10

clang format

5aa7621

Change-Id: I00c846594c0111c9c67741effd62f25c1d21e725

fix x86 assert

43f2b5d

Change-Id: Ie0453e0a5a2e89a1b3cc7e270934f572f4c07e54

more x86 asert fixes

7e70941

Change-Id: Ie3bec227e9478a5c1ce9eff6a64db9628697668d

Store size in struct with the reg_id

ccb4cef

Change-Id: I017a8716497e33742f092d8b2d55795cd96896fb

clang format and element_size type

678a2c3

Change-Id: Ibb3a8845a286dfb55f1499af06261292ad03891c

joshua-warburton force-pushed the i3044-add-element-vector branch from 3bdfd18 to 678a2c3 Compare October 17, 2022 13:11

joshua-warburton added 2 commits October 17, 2022 14:26

reapply the bit size fix for p10_low

ee086d8

Change-Id: I76609940bb34de4c43e821c28e453b0e75936bb2

add missing element vector check

332d3f8

Change-Id: I9c6f3a710db1cb29324589d2abefdb0c61879936

joshua-warburton force-pushed the i3044-add-element-vector branch from 78e0513 to 332d3f8 Compare October 17, 2022 15:27

init element size for partial regs

3cd19d0

Change-Id: I7a921aa497e9c53146faff8f7d58cc79159488b1

derekbruening mentioned this pull request Oct 18, 2022

API for getting lane width of x86 SIMD operands #5638

Open

Merge branch 'master' into i3044-add-element-vector

2cec1b1

AssadHashmi requested changes Oct 18, 2022


derekbruening approved these changes Oct 18, 2022


derekbruening reviewed Oct 18, 2022


joshua-warburton added 5 commits October 19, 2022 16:47

address review comments

d735c9d

Change-Id: I20a70b53f4c50f4b4341fdc9fca1ba2a2131ee8a

Merge branch 'master' into i3044-add-element-vector

1b44b6d

clang format

14a11de

Change-Id: I66820e4994492784c947b8adf18ba1516c01b122

Merge branch 'master' into i3044-add-element-vector

3908f94

Final review comments addressed

10c4fcd

Change-Id: I5724b909bb13a5b17b213f41fa1e6b6921ffae76

joshua-warburton added 3 commits October 20, 2022 17:13

Fix x86 build error

097d130

Change-Id: I03ff7febdc0ab6477d0e1a14310ca54a8ce1cb52

Fix x86 compiler warnings

8bf461b

Change-Id: Ia536d9a3fc5115915dd17d78c64f9e644750ae47

Clang format

30dd83b

Change-Id: I0f6be678faf4f0d9f4d7f3c682e636875cbf3635

AssadHashmi approved these changes Oct 21, 2022


joshua-warburton merged commit 13ff46c into master Oct 21, 2022

joshua-warburton deleted the i3044-add-element-vector branch October 21, 2022 10:32

This was referenced Feb 13, 2023

Add encoder/decoder support for Arm's Scalable Vector Extension #3044

Open

i#3044 AArch64 SVE codec: Add ADR instructions #5866

Merged
