Finish Avx512 specific lightup for Vector128/256/512<T> #85207

tannergooding · 2023-04-23T00:10:33Z

With #80814, we achieved functional parity of Vector512<T> with Vector128<T> and Vector256<T>. However, there are some new instructions available on Avx512-capable hardware that allow additional hardware acceleration opportunities for all three types.

This includes:

ConvertToDouble() - vcvtqq2pd & vcvtuqq2pd
ConvertToInt64() - vcvtpd2qq
ConvertToUInt32() - vcvtps2udq
ConvertToUInt64() - vcvtpd2uqq
ConditionalSelect() - vpternlog
Shuffle() - vpermi2*, vpermt2*, etc
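
For readers unfamiliar with these APIs, here is a minimal sketch of two of them at the Vector128 level. It only shows the cross-platform semantics; the class and method names (`LightupDemo`, `SelectByHand`) are made up for illustration, and whether the JIT actually emits vpternlog or vcvtqq2pd for these calls depends on the runtime version and the hardware. ConditionalSelect is a bitwise select over three inputs, which is the kind of three-operand boolean function vpternlog can evaluate in one instruction.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical demo class; not part of the runtime.
public static class LightupDemo
{
    // Bitwise select written out by hand: take bits from `left` where the
    // mask bit is 1 and from `right` where the mask bit is 0.
    public static Vector128<int> SelectByHand(
        Vector128<int> mask, Vector128<int> left, Vector128<int> right)
        => (left & mask) | (right & ~mask);

    public static void Main()
    {
        Vector128<int> mask = Vector128.Create(-1, 0, -1, 0);
        Vector128<int> left = Vector128.Create(1, 2, 3, 4);
        Vector128<int> right = Vector128.Create(10, 20, 30, 40);

        // The API call and the hand-written expression select the same bits.
        Vector128<int> viaApi = Vector128.ConditionalSelect(mask, left, right);
        Vector128<int> viaHand = SelectByHand(mask, left, right);
        Console.WriteLine(viaApi == viaHand);

        // 64-bit integer to double conversion, one of the conversions listed above.
        Vector128<double> converted = Vector128.ConvertToDouble(Vector128.Create(1L, -2L));
        Console.WriteLine(converted == Vector128.Create(1.0, -2.0));
    }
}
```

The same calls exist on Vector256 and Vector512, which is why the lightup applies to all three types.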

We should also ensure that all APIs are accelerated as intrinsics, where applicable. In particular, the following are still managed fallbacks (though still accelerated):

Vector512.Dot()
Vector512.Sum()

There may be others as well, so a general audit to validate would be good.

ghost · 2023-04-23T00:10:42Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	tannergooding
Assignees:	-
Labels:	`area-CodeGen-coreclr`, `arch-avx512`
Milestone:	8.0.0

DeepakRajendrakumaran · 2023-04-24T18:01:57Z

@tannergooding Sum is a weird one. I don't believe any single AVX-512 instruction exists for it.

Documentation:
https://www.intel.com/content/www/us/en/docs/cpp-compiler/developer-guide-reference/2021-8/intrinsics-for-integer-reduction-operations.html

Clang implementation:
https://godbolt.org/z/hW1Thxhe3
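
Since there is no single instruction, a horizontal sum is usually built from several steps. The following is a rough sketch of one common shape, repeatedly adding the upper half to the lower half; it is an assumption for illustration, not the runtime's actual implementation or the exact sequence Clang emits. `SumDemo` and `SumByHalving` are hypothetical names. It uses int so that the order of additions cannot change the result, which is not true for floating-point types.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical demo class; not part of the runtime.
public static class SumDemo
{
    // Reduce a 512-bit vector by folding the upper half onto the lower half.
    public static int SumByHalving(Vector512<int> v)
    {
        Vector256<int> v256 = v.GetLower() + v.GetUpper();       // 16 lanes -> 8 lanes
        Vector128<int> v128 = v256.GetLower() + v256.GetUpper(); // 8 lanes -> 4 lanes
        return Vector128.Sum(v128);                              // 4 lanes -> scalar
    }

    public static void Main()
    {
        Vector512<int> v = Vector512.Create(
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);

        // Both paths should agree; this also runs on hardware without
        // AVX-512, where the Vector512 APIs use a fallback path.
        Console.WriteLine(SumByHalving(v) == Vector512.Sum(v));
    }
}
```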

JulieLeeMSFT · 2024-04-18T21:55:07Z

Most of this is being handled in #100993.

tannergooding · 2024-08-01T00:29:06Z

Vector512.Dot is the only one left here and can be done in .NET 10. The current implementation is functionally correct and generates the expected codegen; it just isn't handled directly as an intrinsic in the JIT.
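
To illustrate what "functionally correct" means here, a dot product can be expressed as an element-wise multiply followed by a horizontal sum. The sketch below checks that formulation against Vector512.Dot for integer inputs. It is only an assumed equivalent formulation, not a copy of the runtime's managed implementation, and `DotDemo` and `DotViaSum` are hypothetical names.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical demo class; not part of the runtime.
public static class DotDemo
{
    // Dot product written as multiply-then-sum.
    public static int DotViaSum(Vector512<int> a, Vector512<int> b)
        => Vector512.Sum(a * b);

    public static void Main()
    {
        Vector512<int> a = Vector512.Create(
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
        Vector512<int> b = Vector512.Create(2); // every lane is 2

        // For integers the two formulations agree exactly.
        Console.WriteLine(Vector512.Dot(a, b) == DotViaSum(a, b));
    }
}
```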

tannergooding added the area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) and avx512 (Related to the AVX-512 architecture) labels Apr 23, 2023

tannergooding added this to the 8.0.0 milestone Apr 23, 2023

BruceForstall mentioned this issue Apr 24, 2023

Implement AVX-512 support #77034

Closed

56 tasks

JulieLeeMSFT assigned tannergooding May 18, 2023

BruceForstall mentioned this issue Jun 22, 2023

Accelerating Vector512.Sum() #87851

Closed

BruceForstall modified the milestones: 8.0.0, 9.0.0 Jul 19, 2023

martinothamar mentioned this issue Jul 21, 2023

Use AVX512 intrinsics for number conversion when possible #89330

Closed

BruceForstall mentioned this issue Oct 9, 2023

Intel architecture improvements for .NET 9 #93196

Closed

33 tasks

tannergooding modified the milestones: 9.0.0, 10.0.0 Aug 1, 2024

BruceForstall mentioned this issue Oct 15, 2024

Intel architecture improvements for .NET 10 #108869

Open

24 tasks
