-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Z : Enable Vector instructions for handling float data types #7130
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for supporting the project, and congratulations on your first contribution! A project committer will shortly review your contribution. In the mean time, if you haven't had a chance please skim over the contribution guidelines which all pull requests must adhere to. If the ECA pull request check fails, have a look at the instructions for signing the ECA in the legal considerations section.
If you run into any problems our community will be happy to assist you in any way we can. There are a number of recommended ways to interact with the community. We encourage you to ask questions, or drop by to say hello.
@@ -2680,7 +2680,7 @@ OMR::Z::TreeEvaluator::dselectEvaluator(TR::Node *node, TR::CodeGenerator *cg) | |||
generateRRInstruction(cg, TR::InstOpCode::LDGR, node, tempReg, conditionReg); | |||
// generate compare with zero | |||
generateVRIaInstruction(cg, TR::InstOpCode::VGBM, node, vzeroReg, 0, 0); | |||
generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 1, 0, 3); | |||
generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 0, 0, 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect. You are basically forcing the VFCE instruction with short-format (Float) so it will generate functionally incorrect code for Double type.
You can use getVectorElementSizeMask(TR::Node *node) to get m4
mask.
Also given that operation should only take place for first element, I wonder if we should set in the m5
to 0x8 which would control the operation to happen only on 0th element , which will be FPR value loaded in LDGR
.
if (node->getOpCode().isDouble()) | ||
{ | ||
generateRRInstruction(cg, TR::InstOpCode::LLGFR, node, conditionReg, conditionReg); | ||
}else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tabs/ Space needed to be fixed
{ | ||
generateRRInstruction(cg, TR::InstOpCode::LLGFR, node, conditionReg, conditionReg); | ||
}else | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment why we needed to left shit the condition node for float. This will help others looking into the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the case when the node opcode is a Float Select, and the condition child is 32 bits, an SLLG instruction would be required to shift to the left the 32 least significant bits within the 64 bit conditionReg GPR.
This would preserve the floating point representation of the conditionNode, as it is later moved into an FPR using LDGR, which would enable further floating-point operations to be performed.
// convert to floating point | ||
generateRRInstruction(cg, TR::InstOpCode::LDGR, node, tempReg, conditionReg); | ||
// generate compare with zero | ||
generateVRIaInstruction(cg, TR::InstOpCode::VGBM, node, vzeroReg, 0, 0); | ||
generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 1, 0, 3); | ||
generateVRRcInstruction(cg, TR::InstOpCode::VFCE, node, vectorSelReg, tempReg, vzeroReg, 0, 0x8, getVectorElementSizeMask(node->getSize())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment the purpose behind mask values here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the VFCE instruction, the following mask values have been used to generate vector instructions for both doubles and floats.
M4 - Floating-point-format control = getVectorElementSizeMask(node->getSize()); returns 2 for short format floats and 3 for long format doubles accordingly, depending on the node.
M5 - Single-Element-Control = 0x8, which sets the bit 0 to one, which would control the operation to take place only on the zero-indexed element in the vector.
M6 - Condition Code Set = 0, the Condition Code is not set and remains unchanged.
Binary Encoding for Double:
0x2aa2ea3f8a0] b9 16 00 00 LLGFR GPR0,GPR0
0x2aa2ea3f950] b3 c1 00 20 LDGR FPR2,GPR0
0x2aa2ea3fa00] e7 00 00 00 08 44 VGBM VRF16,0x0
0x2aa2ea3fad0] e7 02 00 08 3a e8 WFCEDB VRF16,VRF2,VRF16
Binary encoding for Float:
0x2aa3be9ccf0] eb 00 00 20 00 0d SLLG GPR0,32
0x2aa3be9cdb0] b3 c1 00 20 LDGR FPR2,GPR0
0x2aa3be9ce60] e7 00 00 00 08 44 VGBM VRF16,0x0
0x2aa3be9cf30] e7 02 00 08 2a e8 WFCEDB VRF16,VRF2,VRF16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I meant add comments in the code.
fa6179f
to
edaa20d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes looks good to me, @sarwat12 Please add comments in the code and also squash the commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nitpick, Overall LGTM, once you make final change, will give quick look and launch tests
@@ -2669,18 +2669,30 @@ OMR::Z::TreeEvaluator::dselectEvaluator(TR::Node *node, TR::CodeGenerator *cg) | |||
TR::Register *resultReg = cg->gprClobberEvaluate(trueValueNode); | |||
TR::Register *conditionReg = cg->evaluate(conditionNode); | |||
TR::Register *falseValReg = cg->evaluate(falseValueNode); | |||
if (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble()) | |||
if ((cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble()) || (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z14) && node->getOpCode().isFloat())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put the two condition in seperate line, so it is easy to read ?
if ((cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z13) && node->getOpCode().isDouble())
|| (cg->comp()->target().cpu.isAtLeast(OMR_PROCESSOR_S390_Z14) && node->getOpCode().isFloat()))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sarwat12 As your PR closes #5002, can you link two. Please checkout https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md#commit-guidelines for the commit guidelines.
@sarwat12 As explained in https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md#commit-guidelines, it is a commit message, not comment on the PR. Your PR does the following, so it should be clearly mentioned in the commit header and body should contain detailed description.
Also please have clear commit description in the header and have details in the commit body describing what was the issue and how it was fixed. |
We were taking advantage of Vector Instructions for dselect already on the platform where long format in vector instructions is supported. Your PR addresses the issue seen with using it for short format (in fselect evaluator) on z14 and newer platform (Where short-format support for certain vector operation was added). |
21fd520
to
50340ce
Compare
@sarwat12 Can you rebase your branch ? Currently it shows 16 commits in this PR which is incorrect |
Jenkins build zos,zlinux |
This commit addresses the use of vector instructions to handle short format in the **select** evaluator. Previously, the use of vector instructions for short format in the select evaluator was disabled, even though on z14 and newer platforms, it is supported. The issue was caused by not correctly converting the condition code from GPR to FPR for short format. Changes for enabling vector instructions for short format: - Use of LLGFR instruction for long format for zero-extending a 32 bit conditionReg to 64 bits - Use of separate SLLG instruction for short format floats to preserve the float representation of the first 32 bits as it is later moved into FPR - Addition of mask values in the VFCE instruction to get the element size mask for floats and doubles respectively Closes: eclipse#5002 Signed-off-by: Sarwat Shaheen [email protected]
Jenkins build zos,zlinux |
@sarwat12 Can you post the build number of your internal jenkins build testing this change ? |
Build #19439 for testing fselect branch. |
@sarwat12 Looking at the build, I see JDK11 and JDK8 passes. But due to some unrelated issues JDK17 failed. Can you launch another build with JDK17 only ? |
Build #19491, with JDK17 only. |
Thanks @sarwat12 for doing the necessary sanity tests internally. I have checked out the builds from #7130 (comment) and #7130 (comment) and apart from a failure in @dsouzai failure in the criu_jitserver test looks like following,
Do you know if this is known issue or something new? |
I think it's this issue eclipse-openj9/openj9#17474 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Approving based on @r30shah's approval.
Merging since tests have passed. |
This PR addresses the use of vector instructions to handle short format in the select evaluator.
Previously, the use of vector instructions for short format in the select evaluator was disabled, even though on z14 and newer platforms it is supported. The issue was caused by not correctly converting the condition code from GPR to FPR for short format. Changes for enabling vector instructions for short format:
Closes: #5002
Signed-off-by: Sarwat Shaheen [email protected]