[API Proposal]: VectorNNN.Create with broadcasting #92299

Closed
EgorBo opened this issue Sep 19, 2023 · 4 comments · Fixed by #103462
Labels
api-approved: API was approved in API review, it can be implemented
area-System.Runtime.Intrinsics
in-pr: There is an active PR which will close this issue when it is merged
Milestone

Comments

@EgorBo
Member

EgorBo commented Sep 19, 2023

Background and motivation

Constant vectors quite often repeat the same values in every 128-bit lane, which makes them especially verbose with AVX-512. Consider these: https://github.com/dotnet/runtime/blob/d29d4d04d20252c283b76fa50f04b6ebf5dc9d91/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Base64Decoder.cs#L665-L681. It'd be nice to have a Create API that automatically broadcasts lanes to keep the code smaller (and the data section's size too, but that is irrelevant here).

API Proposal

namespace System.Runtime.Intrinsics
{
    public static partial class Vector128
    {
        public static Vector128<T> Create<T>(Vector64<T> value) where T : struct;
    }

    public static partial class Vector256
    {
        public static Vector256<T> Create<T>(Vector128<T> value) where T : struct;
    }

    public static partial class Vector512
    {
        public static Vector512<T> Create<T>(Vector128<T> value) where T : struct;
        public static Vector512<T> Create<T>(Vector256<T> value) where T : struct;
    }
}

API Usage

Vector512<sbyte> lutShift = Vector512.Create(
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0);

// becomes:

Vector512<sbyte> lutShift = Vector512.Create(
    Vector128.Create(0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0));
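On a runtime that predates these overloads, the same broadcast can be approximated by composing the existing lower/upper Create overloads. The following is a minimal sketch, not the proposed implementation: the BroadcastSketch class and the Broadcast128To512 helper are hypothetical names introduced for illustration, and the code assumes .NET 8 or later, where Vector512 is available.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class BroadcastSketch
{
    // Hypothetical helper: repeats one 128-bit lane four times by composing
    // the existing (lower, upper) Create overloads. It only emulates the
    // proposed Vector512.Create(Vector128<T>); it is not the real implementation.
    public static Vector512<sbyte> Broadcast128To512(Vector128<sbyte> lane)
    {
        Vector256<sbyte> half = Vector256.Create(lane, lane);
        return Vector512.Create(half, half);
    }

    public static void Main()
    {
        // The single 128-bit lane from the Base64 lookup table in the issue.
        Vector128<sbyte> lane = Vector128.Create(
            0, 16, 19, 4, -65, -65, -71, -71, 0, 0, 0, 0, 0, 0, 0, 0);

        Vector512<sbyte> lutShift = Broadcast128To512(lane);

        // Every element of the wide vector should equal the matching
        // element of the source lane, repeated every 16 elements.
        bool allLanesMatch = true;
        for (int i = 0; i < Vector512<sbyte>.Count; i++)
        {
            allLanesMatch &= lutShift.GetElement(i) == lane.GetElement(i % Vector128<sbyte>.Count);
        }

        Console.WriteLine($"All four lanes match: {allLanesMatch}");
    }
}
```

The helper is fixed to sbyte so that it relies only on the long-standing non-generic overloads; the proposed API is generic over T.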

Alternative Designs

No strong opinion on Vector64

Risks

No response

@EgorBo EgorBo added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Sep 19, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Sep 19, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 19, 2023
@EgorBo
Member Author

EgorBo commented Sep 19, 2023

cc @tannergooding @dotnet/avx512-contrib

@MihaZupan MihaZupan added area-System.Runtime.Intrinsics and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Sep 19, 2023
@ghost

ghost commented Sep 19, 2023

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Issue Details

API Proposal

namespace System.Runtime.Intrinsics
{
    public static partial class Vector256
    {
        public static Vector256<T> Create<T>(Vector128<T> valueToBroadcast) where T : struct;
    }

    public static partial class Vector512
    {
        public static Vector512<T> Create<T>(Vector128<T> valueToBroadcast) where T : struct;
        public static Vector512<T> Create<T>(Vector256<T> valueToBroadcast) where T : struct;
    }
}

Author: EgorBo
Assignees: -
Labels: api-suggestion, area-System.Runtime.Intrinsics, untriaged

Milestone: -

@tannergooding tannergooding added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Sep 19, 2023
@tannergooding tannergooding added this to the 9.0.0 milestone Sep 19, 2023
@tannergooding
Member

We should include Vector64 for consistency and to ensure it works well on Arm64 where it is supported.

The parameter name should likely just be value for consistency as well. The summary/parameter description helps clarify it is being broadcast as part of the create operation.

@bartonjs
Member

bartonjs commented Sep 21, 2023

Video

  • We went ahead and squared the circle, adding the Vector64 overloads.
namespace System.Runtime.Intrinsics
{
    public static partial class Vector128
    {
        public static Vector128<T> Create<T>(Vector64<T> value) where T : struct;
    }

    public static partial class Vector256
    {
        public static Vector256<T> Create<T>(Vector64<T> value) where T : struct;
        public static Vector256<T> Create<T>(Vector128<T> value) where T : struct;
    }

    public static partial class Vector512
    {
        public static Vector512<T> Create<T>(Vector64<T> value) where T : struct;
        public static Vector512<T> Create<T>(Vector128<T> value) where T : struct;
        public static Vector512<T> Create<T>(Vector256<T> value) where T : struct;
    }
}
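With the Vector64 overloads included, any narrower vector can seed any wider one. As a rough illustration of what a Vector64 broadcast produces, the sketch below emulates it with the existing lower/upper Create overloads. Vector64BroadcastSketch, Broadcast64To128, and Broadcast64To256 are hypothetical names, and the choice of float is an assumption made only to keep the example short.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class Vector64BroadcastSketch
{
    // Hypothetical helper: emulates Vector128.Create(Vector64<T>) for float
    // by placing the same 64-bit value in both halves.
    public static Vector128<float> Broadcast64To128(Vector64<float> value)
        => Vector128.Create(value, value);

    // Hypothetical helper: emulates Vector256.Create(Vector64<T>) for float
    // by broadcasting to 128 bits first, then duplicating that result.
    public static Vector256<float> Broadcast64To256(Vector64<float> value)
    {
        Vector128<float> half = Broadcast64To128(value);
        return Vector256.Create(half, half);
    }

    public static void Main()
    {
        Vector64<float> pair = Vector64.Create(1.5f, -2.0f);
        Vector256<float> wide = Broadcast64To256(pair);

        // The two source elements repeat across all eight slots.
        for (int i = 0; i < Vector256<float>.Count; i++)
        {
            Console.WriteLine($"wide[{i}] = {wide.GetElement(i)}");
        }
    }
}
```

On a runtime that ships the approved overloads, these helpers would presumably be unnecessary, since a single Create call covers each case.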

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Sep 21, 2023
@EgorBo EgorBo self-assigned this Sep 21, 2023
@EgorBo EgorBo removed their assignment Oct 21, 2023
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Jun 14, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 16, 2024