Random effect of Intel JCC Errata on micro optimizations #2405
Comments
We talked about this at our weekly maintainer meeting - there don't seem to be any great answers here. Someone needs to investigate:
We realize that asking for such investigation, before deciding whether to accept such changes, isn't great. 😹 😿 If it looks like there isn't a reasonably low-complexity way to get good optimization benefits without harming certain processors, then the path forward is probably just not to attempt such optimizations (i.e. nobody will complain if we just ship the classic loop and don't try to do magical things).
Exactly how to detect CPUs that would benefit from vs. would be harmed by
There's also connection between this issue and
TL;DR: I think this issue should be closed without any further action.
I experimented with this option and the minmax benchmark after #4401. I added
I know I have an affected CPU, but the option still makes the benchmark show slightly worse figures. My results
It looks like the option could be tuned for the best results via officially unsupported variations, but that looks too time-consuming for a questionable gain. I can't explain why this option isn't helpful. From the disassembly, I can see that it works as intended. Maybe the impact of the erratum is smaller than the impact of the mitigation, or there was another microcode update that improved the situation.
Sounds good to me. Thanks for looking into this.
I encountered an issue in optimizing `vector_algorithm.cpp`. Suddenly, with otherwise good optimizations, I run into a bad branching pattern that is affected by the Intel JCC erratum. The impact is large enough to reverse the effect of the optimization. And just random NOPs can make the optimization great again, as the issue is triggered by a bad alignment of branching instructions.
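To make the alignment condition concrete, here is a minimal sketch. It assumes the commonly described form of the erratum mitigation, where a jump instruction is penalized when it crosses or ends on a 32-byte boundary. The names `jcc_boundary` and `jump_hits_jcc_boundary` are made up for illustration and are not part of the STL or the compiler.

```cpp
#include <cstdint>

// Assumption: the penalty applies to jumps that cross or end on a
// 32-byte boundary. Both names below are hypothetical.
constexpr std::uint64_t jcc_boundary = 32;

// start:  address of the first byte of the jump instruction
// length: size of the instruction in bytes (must be > 0)
// Returns true when the instruction crosses a 32-byte boundary or its last
// byte is the last byte of a 32-byte block. In both cases the address one
// past the end falls into a later block than the start address.
constexpr bool jump_hits_jcc_boundary(std::uint64_t start, std::uint64_t length) {
    return start / jcc_boundary != (start + length) / jcc_boundary;
}
```

For example, a 2-byte jump at offset `0x1E` ends exactly on the boundary and is flagged, while the same jump moved to `0x20` by two bytes of padding is not. This is consistent with a few NOPs changing the benchmark results.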
The compiler has the flag `/QIntel-jcc-erratum`. If I add it to `CMakeLists.txt`, the optimizations start behaving predictably.
I'm worried that this flag may impact other CPUs though, specifically older AMD CPUs, or Atom CPUs.
@fsb4000 confirmed that enabling `/QIntel-jcc-erratum` makes perf terrible on AMD FX 8300.
I can try to split `vector_algorithm.cpp` into two translation units, one compiled with `/QIntel-jcc-erratum` and the other without, so that each function would branch into the `/QIntel-jcc-erratum` translation unit when running on an affected CPU. This seems to be the best option, but it slightly impacts binary size and introduces a lot of complexity.
What can we do here?
See the table in #2386 for an example of the results. Note that enabling `/QIntel-jcc-erratum` both makes the results better and avoids unpredictable variations!