Z : Enable Vector instructions for handling float data types #7130

sarwat12 · 2023-10-03T16:17:56Z

This PR addresses the use of vector instructions to handle short format in the select evaluator.
Previously, the use of vector instructions for short format in the select evaluator was disabled, even though it is supported on z14 and newer platforms. The issue was caused by not correctly converting the condition code from GPR to FPR for short format.

Changes for enabling vector instructions for short format:

- Use of LLGFR instruction for long format for zero-extending a 32 bit conditionReg to 64 bits
- Use of separate SLLG instruction for short format to preserve the float representation of the first 32 bits as it is later moved into FPR
- Addition of mask values in the VFCE instruction to get the element size mask for floats and doubles respectively

Closes: #5002

Signed-off-by: Sarwat Shaheen [email protected]

github-actions

Thank you for supporting the project, and congratulations on your first contribution! A project committer will shortly review your contribution. In the meantime, if you haven't had a chance, please skim over the contribution guidelines which all pull requests must adhere to. If the ECA pull request check fails, have a look at the instructions for signing the ECA in the legal considerations section.

If you run into any problems our community will be happy to assist you in any way we can. There are a number of recommended ways to interact with the community. We encourage you to ask questions, or drop by to say hello.

r30shah · 2023-10-11T15:53:09Z

compiler/z/codegen/ControlFlowEvaluator.cpp

@@ -2680,7 +2680,7 @@ OMR::Z::TreeEvaluator::dselectEvaluator(TR::Node *node, TR::CodeGenerator *cg)
      generateRRInstruction(cg, TR::InstOpCode::LDGR, node, tempReg, conditionReg);
      // generate compare with zero
      generateVRIaInstruction(cg, TR::InstOpCode::VGBM, node, vzeroReg, 0, 0);
-      generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 1, 0, 3);
+      generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 0, 0, 2);


This is incorrect. You are forcing the VFCE instruction to use short format (Float), so it will generate functionally incorrect code for the Double type.

You can use getVectorElementSizeMask(TR::Node *node) to get the m4 mask.

Also, given that the operation should only take place on the first element, I wonder if we should set m5 to 0x8, which would restrict the operation to the 0th element, which holds the FPR value loaded by LDGR.

r30shah · 2023-10-25T17:48:47Z

compiler/z/codegen/ControlFlowEvaluator.cpp

+      if (node->getOpCode().isDouble())
+       {
+        generateRRInstruction(cg, TR::InstOpCode::LLGFR, node, conditionReg, conditionReg);
+       }else  


Tabs/spaces need to be fixed.

r30shah · 2023-10-25T17:50:37Z

compiler/z/codegen/ControlFlowEvaluator.cpp

+       {
+        generateRRInstruction(cg, TR::InstOpCode::LLGFR, node, conditionReg, conditionReg);
+       }else  
+       {


Please add a comment explaining why we need to left shift the condition node for float. This will help others looking into the code.

When the node opcode is a float select and the condition child is 32 bits wide, an SLLG instruction is required to shift the 32 least significant bits of the 64-bit conditionReg GPR to the left.
This preserves the floating-point representation of the conditionNode when it is later moved into an FPR using LDGR, so that further floating-point operations can be performed on it.
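
The effect of the shift can be modelled on plain 64-bit integers. The following is a minimal sketch, not OMR code: the helper names are hypothetical, and it assumes that LDGR copies all 64 bits of the GPR unchanged and that a short-format operand is read from the leftmost 32 bits of the FPR.

```cpp
#include <cstdint>

// Models LLGFR: keep the low 32 bits and zero the high 32 bits.
constexpr std::uint64_t llgfr(std::uint64_t gpr)
{
    return gpr & 0xFFFFFFFFull;
}

// Models SLLG with a shift amount of 32: the low word moves into the high word.
constexpr std::uint64_t sllg32(std::uint64_t gpr)
{
    return gpr << 32;
}

// Assumption: a short-format operation reads the leftmost 32 bits of the FPR.
constexpr std::uint32_t shortFormatView(std::uint64_t fpr)
{
    return static_cast<std::uint32_t>(fpr >> 32);
}

// A long-format operation reads all 64 bits of the FPR.
constexpr std::uint64_t longFormatView(std::uint64_t fpr)
{
    return fpr;
}
```

Under these assumptions, a non-zero 32-bit condition that is only zero-extended looks like zero to a short-format compare, because its bits sit in the half of the register that the compare does not read. After the shift, the same bits are visible to the short-format compare.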

r30shah · 2023-10-25T17:51:05Z

compiler/z/codegen/ControlFlowEvaluator.cpp

      // convert to floating point
      generateRRInstruction(cg, TR::InstOpCode::LDGR, node, tempReg, conditionReg);
      // generate compare with zero
      generateVRIaInstruction(cg, TR::InstOpCode::VGBM, node, vzeroReg, 0, 0);
-      generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 1, 0, 3);
+      generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 0, 0x8, getVectorElementSizeMask(node->getSize()));


Please comment the purpose behind mask values here.

For the VFCE instruction, the following mask values are used to generate vector instructions for both doubles and floats.

M4 - Floating-point-format control = getVectorElementSizeMask(node->getSize()); returns 2 for short-format floats and 3 for long-format doubles, depending on the node.

M5 - Single-Element-Control = 0x8, which sets bit 0 to one, so the operation takes place only on the zero-indexed element of the vector.

M6 - Condition Code Set = 0; the condition code is not set and remains unchanged.
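
The mask selection described above can be sketched in isolation. This is an illustrative stand-in, not the OMR implementation: `VfceMasks` and `vfceMasksForSelect` are hypothetical names, and only the two element sizes discussed in this thread (4 and 8 bytes) are handled.

```cpp
#include <cstdint>

// The three mask fields passed to VFCE, as described in the comment above.
struct VfceMasks
{
    std::uint8_t m4; // floating-point format: 2 = short, 3 = long
    std::uint8_t m5; // 0x8 = single-element control
    std::uint8_t m6; // 0 = condition code left unchanged
};

// Hypothetical helper: fills in the masks for a float (4-byte) or
// double (8-byte) select. Returns false for any other size.
inline bool vfceMasksForSelect(std::uint32_t elementSizeInBytes, VfceMasks &masks)
{
    switch (elementSizeInBytes)
    {
        case 4: masks.m4 = 2; break;
        case 8: masks.m4 = 3; break;
        default: return false;
    }
    masks.m5 = 0x8;
    masks.m6 = 0;
    return true;
}
```

Deriving M4 from the node size, rather than hardcoding it, is what lets the same VFCE call serve both the float and the double select.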

Binary encoding for Double:

    0x2aa2ea3f8a0] b9 16 00 00        LLGFR GPR0,GPR0
    0x2aa2ea3f950] b3 c1 00 20        LDGR FPR2,GPR0
    0x2aa2ea3fa00] e7 00 00 00 08 44  VGBM VRF16,0x0
    0x2aa2ea3fad0] e7 02 00 08 3a e8  WFCEDB VRF16,VRF2,VRF16

Binary encoding for Float:

    0x2aa3be9ccf0] eb 00 00 20 00 0d  SLLG GPR0,32
    0x2aa3be9cdb0] b3 c1 00 20        LDGR FPR2,GPR0
    0x2aa3be9ce60] e7 00 00 00 08 44  VGBM VRF16,0x0
    0x2aa3be9cf30] e7 02 00 08 2a e8  WFCEDB VRF16,VRF2,VRF16
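
As a cross-check, the six bytes of the compare instruction in the two listings above can be decoded by hand. The sketch below assumes the VRR-c field layout as I understand it (opcode, V1, V2, V3, an unused nibble, M6, M5, M4, RXB, opcode); it should be verified against the z/Architecture Principles of Operation before being relied on. `VrrcFields` and `decodeVrrc` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Decoded operand and mask fields of a 6-byte VRR-c instruction.
struct VrrcFields
{
    unsigned v1, v2, v3; // vector register numbers, 0..31
    unsigned m6, m5, m4; // mask fields
};

// Assumed layout: byte 1 = V1|V2, byte 2 = V3|unused,
// byte 3 = M6|M5, byte 4 = M4|RXB. Each RXB bit supplies the
// high-order bit of one register number (V1, V2, V3, V4 in order).
inline VrrcFields decodeVrrc(const std::array<std::uint8_t, 6> &b)
{
    const unsigned rxb = b[4] & 0xFu;
    VrrcFields f{};
    f.v1 = (b[1] >> 4)   | ((rxb & 0x8u) ? 16u : 0u);
    f.v2 = (b[1] & 0xFu) | ((rxb & 0x4u) ? 16u : 0u);
    f.v3 = (b[2] >> 4)   | ((rxb & 0x2u) ? 16u : 0u);
    f.m6 = b[3] >> 4;
    f.m5 = b[3] & 0xFu;
    f.m4 = b[4] >> 4;
    return f;
}
```

With this layout, the double listing's `e7 02 00 08 3a e8` decodes to registers 16, 2 and 16 with M4 = 3, M5 = 8 and M6 = 0, and the float listing's `e7 02 00 08 2a e8` differs only in M4 = 2. That matches the mask values described in the comment.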

Thanks, I meant add comments in the code.

r30shah

Code changes look good to me. @sarwat12 Please add comments in the code and also squash the commits.

r30shah

Minor nitpick; overall LGTM. Once you make the final change, I will give it a quick look and launch tests.

r30shah · 2023-11-02T14:23:20Z

compiler/z/codegen/ControlFlowEvaluator.cpp

@@ -2669,18 +2669,30 @@ OMR::Z::TreeEvaluator::dselectEvaluator(TR::Node *node, TR::CodeGenerator *cg)
   TR::Register *resultReg = cg->gprClobberEvaluate(trueValueNode);
   TR::Register *conditionReg = cg->evaluate(conditionNode);
   TR::Register *falseValReg = cg->evaluate(falseValueNode);
-   if (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble())
+   if ((cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble()) || (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z14) && node->getOpCode().isFloat()))


Can you put the two conditions on separate lines, so they are easier to read?

if ((cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble()) ||
    (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z14) && node->getOpCode().isFloat()))
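
The guard can also be read as a small standalone predicate. The sketch below is only an illustration of the logic in the condition above; `ZLevel`, `SelectType` and `canUseVectorSelect` are hypothetical names and are not part of OMR.

```cpp
// Hypothetical processor levels, ordered so that newer hardware compares greater.
enum class ZLevel { Z12 = 12, Z13 = 13, Z14 = 14, Z15 = 15 };

// Hypothetical classification of the select node's data type.
enum class SelectType { Float, Double, Other };

// True when the vector path may be taken: long format needs z13 or newer,
// short format needs z14 or newer.
constexpr bool canUseVectorSelect(ZLevel cpu, SelectType type)
{
    return (cpu >= ZLevel::Z13 && type == SelectType::Double) ||
           (cpu >= ZLevel::Z14 && type == SelectType::Float);
}
```

Keeping one clause per line makes the different minimum hardware level for each data type easy to see at a glance.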

r30shah

@sarwat12 As your PR closes #5002, can you link the two? Please check out https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md#commit-guidelines for the commit guidelines.

r30shah · 2023-11-06T16:03:51Z

@sarwat12 As explained in https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md#commit-guidelines, this belongs in the commit message, not in a comment on the PR.

Your PR does the following, so it should be clearly mentioned in the commit header, and the body should contain a detailed description.

Header

Body
...

Closes: https://github.com/eclipse/omr/issues/5002

Signed-off-by: ...

Also, please put a clear commit description in the header, and in the commit body describe what the issue was and how it was fixed.

r30shah · 2023-11-06T16:53:24Z

Enable vector instructions for floats & doubles in the select evaluator

We were already taking advantage of vector instructions for dselect on platforms where long format is supported in vector instructions. Your PR addresses the issue seen when using them for short format (in the fselect evaluator) on z14 and newer platforms (where short-format support for certain vector operations was added).

r30shah · 2023-11-06T23:07:07Z

@sarwat12 Can you rebase your branch? Currently it shows 16 commits in this PR, which is incorrect.

r30shah · 2023-11-07T20:23:59Z

Jenkins build zos,zlinux

This commit addresses the use of vector instructions to handle short format in the **select** evaluator. Previously, the use of vector instructions for short format in the select evaluator was disabled, even though on z14 and newer platforms, it is supported. The issue was caused by not correctly converting the condition code from GPR to FPR for short format.

Changes for enabling vector instructions for short format:

- Use of LLGFR instruction for long format for zero-extending a 32 bit conditionReg to 64 bits
- Use of separate SLLG instruction for short format floats to preserve the float representation of the first 32 bits as it is later moved into FPR
- Addition of mask values in the VFCE instruction to get the element size mask for floats and doubles respectively

Closes: eclipse#5002

Signed-off-by: Sarwat Shaheen [email protected]

r30shah · 2023-11-23T20:23:04Z

Jenkins build zos,zlinux

r30shah · 2023-11-24T14:27:15Z

@sarwat12 Can you post the build number of your internal jenkins build testing this change ?

sarwat12 · 2023-11-24T16:07:52Z

Build #19439 for testing fselect branch.

r30shah · 2023-11-29T09:58:50Z

@sarwat12 Looking at the build, I see JDK11 and JDK8 pass, but JDK17 failed due to some unrelated issues. Can you launch another build with JDK17 only?

sarwat12 · 2023-11-29T15:19:23Z

Build #19491, with JDK17 only.

r30shah · 2023-11-30T12:28:41Z

Thanks @sarwat12 for doing the necessary sanity tests internally. I have checked the builds from #7130 (comment) and #7130 (comment). Apart from a failure in cmdLineTester_criu_jitserverAcrossCheckpoint_0, which does not look related to the changes in this PR, everything looks OK to me.

@dsouzai The failure in the criu_jitserver test looks like the following:

[ERR] Assertion failed at /home/jenkins/workspace/Build_JDK17_s390x_linux_Personal/openj9/runtime/compiler/env/JITServerPersistentCHTable.cpp:172: classInfo
 [ERR] 	subclass info cannot be null: ensure subclasses are loaded before superclass

Failing Test

Do you know if this is known issue or something new?

dsouzai · 2023-11-30T14:50:41Z

> Do you know if this is known issue or something new?

I think it's this issue eclipse-openj9/openj9#17474

dsouzai

LGTM, Approving based on @r30shah's approval.

dsouzai · 2023-11-30T15:19:37Z

Merging since tests have passed.

sarwat12 requested a review from fjeremic as a code owner October 3, 2023 16:17

github-actions bot added arch:z comp:compiler labels Oct 3, 2023

github-actions bot reviewed Oct 3, 2023

github-actions bot added the first contribution label Oct 3, 2023

r30shah suggested changes Oct 11, 2023

sarwat12 force-pushed the fselect branch from bb02201 to 207e70a Compare October 25, 2023 17:34

r30shah suggested changes Oct 25, 2023

sarwat12 requested review from charliegracie, youngar, babsingh, mstoodle, Leonardo2718, aviansie-ben, 0xdaryl, knn-k and vijaysun-omr as code owners October 25, 2023 18:17

sarwat12 force-pushed the fselect branch 2 times, most recently from fa6179f to edaa20d Compare October 26, 2023 13:46

r30shah approved these changes Oct 26, 2023

sarwat12 force-pushed the fselect branch from edaa20d to fb0af92 Compare October 26, 2023 16:17

r30shah reviewed Nov 2, 2023

sarwat12 force-pushed the fselect branch from fb0af92 to a8ea9c9 Compare November 6, 2023 14:15

r30shah approved these changes Nov 6, 2023

sarwat12 force-pushed the fselect branch from a8ea9c9 to ac6f559 Compare November 6, 2023 16:46

sarwat12 force-pushed the fselect branch 3 times, most recently from 21fd520 to 50340ce Compare November 6, 2023 19:37

sarwat12 force-pushed the fselect branch from 50340ce to 8977144 Compare November 6, 2023 19:55

sarwat12 force-pushed the fselect branch from 8977144 to 3d6f1fe Compare November 7, 2023 14:58

sarwat12 force-pushed the fselect branch from 3d6f1fe to 8bacfaf Compare November 23, 2023 19:15

dsouzai self-assigned this Nov 30, 2023

dsouzai approved these changes Nov 30, 2023

dsouzai merged commit 2d7eb9a into eclipse:master Nov 30, 2023
6 of 8 checks passed
