-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BITALG and full VPOPCNTDQ instruction support #234
Conversation
Codecov Report
@@ Coverage Diff @@
## master #234 +/- ##
=======================================
Coverage 75.92% 75.92%
=======================================
Files 65 65
Lines 20694 20719 +25
=======================================
+ Hits 15711 15731 +20
- Misses 4901 4906 +5
Partials 82 82
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Supporting extra instructions not included in the Opcodes database is currently a challenge. Short of migrating to an entirely different source (such as #23), the options are either to patch the XML data file or to append additional instructions at the loading phase. An example of patching the XML was shown in the as-yet unlanded PR #234. This shows the XML patching approach is unwieldy and requires more information than we actually need (for example instruction form encodings). In #335 we discussed the alternative of adding extra instructions during loading. This has the advantage of using avo's simpler internal data structure. This PR prepares for using that approach by adding an `internal/opcodesextra` package, intended to contain manually curated lists of extra instructions to add to the instruction database during loading. At the moment, the only instruction added here is the `MOVLQZX` instruction that's already handled this way. Updates #335 #234 #23
@mmcloughlin This PR is also now fully merged with upstream and ready to go. I took inspiration from the GFNI PR #344 and backed-out the editing of There was only one wrinkle with this approach, but in the end it seems fine: because |
Adds the VPOPCNTDQ instruction set, providing packed population count for double and quadword integers. These are added via the `opcodesextra` mechanism #345, since they're missing from the opcodes database. In this case the 512-bit non-AVX512VL forms are added here as well as the opcodes database, but they're deduplicated later. Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512 form expansion. Co-authored-by: Vaughn Iverson <[email protected]>
Adds the AVX-512 Bit Algorithms instruction set. These new instructions are added via the `opcodesextra` mechanism #345, since they're missing from the opcodes database. Contributed by @vsivsi. Extracted from #234 with simplifications for AVX-512 form expansion. Co-authored-by: Vaughn Iverson <[email protected]>
See discussion here: #199 (comment)