Fix Metal accuracy problem caused by <dtype>3 vectors usage #7830
Merged
Conversation
@tqchen do you have any ideas on tests we could add to exercise this change, or would it be OK to merge as is?
We can merge this as is, as long as there is a separate review, say from @echuraev.
echuraev approved these changes on Apr 13, 2021
LGTM.
On the example of the float3 datatype: using float3 to load data concurrently into a dense array shared between all threads in a Metal threadgroup can lead to a data race between threads. The float3 datatype has a size and alignment of 16 bytes, while the kernel assumes it copies 12 bytes at arbitrary, unaligned locations. Using packed_float3 solves the issue.
dlexplorer force-pushed the amalyshe/fix_metal_accuracy branch from 491d64d to b483dda on April 13, 2021 05:53
tqchen approved these changes on Apr 13, 2021
Thanks @elvin-nnov @echuraev!
echuraev pushed a commit to Deelvin/tvm that referenced this pull request on Apr 23, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request on May 6, 2021
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request on May 11, 2021
On the example of the float3 datatype:
Using the float3 data type to load data concurrently into a dense array shared
between all threads in a Metal threadgroup can lead to a data race between threads.
The float3 datatype has a size and alignment of 16 bytes, while the kernel assumes it
copies 12 bytes at arbitrary, unaligned locations.
Using packed_float3 solves the issue.