-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update instruction table to capture the correct state of EFlags #53806
Conversation
@dotnet/jit-contrib, @tannergooding |
src/coreclr/jit/instrsxarch.h
Outdated
|
||
// id nm um mr mi rm flags | ||
|
||
// Note that emitter has only partial support for BT. It can only emit the reg,reg form | ||
// and the registers need to be reversed to get the correct encoding. | ||
INST3(bt, "bt", IUM_RD, 0x0F00A3, BAD_CODE, 0x0F00A3, INS_FLAGS_WritesAllFlags) // PF = ?, AF = ?, ZF = ?, SF = ?, OF = ? | ||
INST3(bt, "bt", IUM_RD, 0x0F00A3, BAD_CODE, 0x0F00A3, Undefined_OF | Undefined_SF | Undefined_AF | Undefined_PF | Writes_CF ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is interesting and looks to possibly be an AMD vs Intel difference.
Intel specifies ZF is unaffected while AMD specifies it is undefined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, AMD had a different implementation of this instruction (and all the other BTx instruction) decades ago prior to/around the Athlon64. Since then the documentation declares these flags as being Undefined. All recent CPUs behave like Intel's. There were some issues with the gcc optimizer that assumed these flags didn't change, causing branching errors. That's why I remember.
On a side note, the BTx instructions have a nasty behavior when applied on memory and the index-register (or immediate value) is larger than the size of the destination. Then the destination address is taken as an array, and you can get out of bounds unaware. bts qword [rax], 0xC0 behaves like bts qword [rax+24], 0 -- I just mention it for safety reasons... I'm sure the JIT team is aware of it. Or as the code comment suggests, isn't using it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to keep it as Undefined_ZF
in that case, to be conservative. No optimization will take benefit from flags that are marked as Undefined
for a given instruction.
@BruceForstall or @AndyAyersMS - Can you take a look. This fixes a blocking outerloop issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question
The table looks fine, but I didn't review it thoroughly
Hello @kunalspathak! Because this pull request has the p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (
|
In #53214, I included the information of which EFlags gets updated per instruction. But for few instructions like
imul
anddiv
, the flags were wrong. Fix those entries. I also went through each instruction in Intel's manual "Appendix A" "EFlags Cross-Reference" . I also made sure that I capture all possible state of EFlags per instruction e.g.undefined
orrestore
so in future we can mitigate such issue.A request to reviewer to please cross check from the Intel manual to make sure if I didn't miss anything.
No asmdiffs in libraries/benchmarks. Only asmdiff in coreclr_tests.pmi is the test case that was failing.
Fixes: #53781