AVX512: Fold some bitwise operations to vpternlogq #84534
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
Issue Details
E.g.
bool Test(string s) => s == "https://pkgs.dev.azure.com/dnc";
Currently emits:
; Method Prog:Test(System.String):bool:this
C5F877 vzeroupper
4885D2 test rdx, rdx
7431 je SHORT G_M52811_IG05
837A081E cmp dword ptr [rdx+08H], 30
752B jne SHORT G_M52811_IG05
C5FC10420C vmovups ymm0, ymmword ptr[rdx+0CH]
C5FDEF0535000000 vpxor ymm0, ymm0, ymmword ptr[reloc @RWD00]
C5FC104A28 vmovups ymm1, ymmword ptr[rdx+28H]
C5F5EF0D48000000 vpxor ymm1, ymm1, ymmword ptr[reloc @RWD32]
C5FDEBC1 vpor ymm0, ymm0, ymm1
C4E27D17C0 vptest ymm0, ymm0
0F94C0 sete al
0FB6C0 movzx rax, al
EB02 jmp SHORT G_M52811_IG06
G_M52811_IG05: ;; offset=0039H
33C0 xor eax, eax
G_M52811_IG06: ;; offset=003BH
C5F877 vzeroupper
C3 ret
RWD00 dq 0070007400740068h, 002F002F003A0073h, 00730067006B0070h, 007600650064002Eh
RWD32 dq 0061002E00760065h, 006500720075007Ah, 006D006F0063002Eh, 0063006E0064002Fh
; Total bytes of code: 63
where for
C5F5EF0D48000000 vpxor ymm1, ymm1, ymmword ptr[reloc @RWD32]
C5FDEBC1 vpor ymm0, ymm0, ymm1
we could emit:
C5F5EF0D48000000 vpternlogq ymm0, ymm1, ymmword ptr [reloc @RWD32], 246
on AVX512 CPU.
Reference: https://godbolt.org/z/Tx53eKxf9
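To see where the immediate 246 comes from, here is a minimal scalar sketch in C that models one 64-bit lane of vpternlogq and derives the immediate for `a | (b ^ c)`. It assumes the usual operand roles (destination as the first input, second register as the second, memory operand as the third). The helper names `ternlog64`, `imm8_for_or_xor` and `check_or_xor_fold` are made up for illustration and are not part of the JIT.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 64-bit lane of vpternlogq: at every bit position the
   three input bits form an index (a is the high bit, c the low bit) that
   selects one bit of the 8-bit immediate. */
static uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
    uint64_t result = 0;
    for (int bit = 0; bit < 64; bit++) {
        unsigned idx = (unsigned)((((a >> bit) & 1) << 2) |
                                  (((b >> bit) & 1) << 1) |
                                  ((c >> bit) & 1));
        result |= (uint64_t)((imm8 >> idx) & 1) << bit;
    }
    return result;
}

/* Build the immediate for a | (b ^ c) by evaluating the expression on all
   eight rows of the truth table. */
static uint8_t imm8_for_or_xor(void)
{
    uint8_t imm8 = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        unsigned a = (idx >> 2) & 1, b = (idx >> 1) & 1, c = idx & 1;
        imm8 |= (uint8_t)((a | (b ^ c)) << idx);
    }
    return imm8;
}

/* Hypothetical self-check: the single ternary op must equal the
   vpxor + vpor pair it replaces. */
static void check_or_xor_fold(uint64_t a, uint64_t b, uint64_t c)
{
    assert(ternlog64(a, b, c, imm8_for_or_xor()) == (a | (b ^ c)));
}
```

In the listing above, `a` corresponds to ymm0 (which already holds the result of the first vpxor against RWD00), `b` to ymm1 and `c` to the RWD32 constant, so only the second vpxor and the vpor are folded.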
@dotnet/avx512-contrib
The entire list of operations can be found in the Intel software developer's manual: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html (Section 5.1)
Hi @EgorBo, @anthonycanino and I will be working on this issue; we hope to have a draft PR shortly.
Closed by #91227
The same applies to other bitwise patterns that can benefit from this on AVX512 CPUs.
Reference: https://godbolt.org/z/Tx53eKxf9
llvm-mca diff: https://www.diffchecker.com/UxW51oqr/
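For other bitwise patterns, the immediate can be worked out with a common shortcut: evaluate the expression once on the constants 0xF0, 0xCC and 0xAA, which encode the truth-table columns of the first, second and third operand. The sketch below is illustrative only; the names (`TL_A`, `imm_xor3`, and so on) are hypothetical, and the patterns shown are examples rather than a list of what the JIT folds.

```c
#include <stdint.h>

/* Truth-table columns: bit i of each constant is the value that operand
   takes in row i, where i = (a << 2) | (b << 1) | c. */
enum { TL_A = 0xF0, TL_B = 0xCC, TL_C = 0xAA };

/* a | (b ^ c): the vpxor + vpor pair from this issue. */
static uint8_t imm_or_xor(void)  { return (uint8_t)(TL_A | (TL_B ^ TL_C)); }

/* (a & b) | c */
static uint8_t imm_and_or(void)  { return (uint8_t)((TL_A & TL_B) | TL_C); }

/* a ^ b ^ c */
static uint8_t imm_xor3(void)    { return (uint8_t)(TL_A ^ TL_B ^ TL_C); }

/* (a & b) | (~a & c): bitwise select of b or c, controlled by a. */
static uint8_t imm_select(void)  { return (uint8_t)((TL_A & TL_B) | (~TL_A & TL_C)); }
```

Evaluating the expression on these constants fills in all eight truth-table rows at once, so any three-input bitwise expression maps to a single immediate.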