Fix Metal accuracy problem caused by <dtype>3 vectors usage #7830

Merged
merged 1 commit into from
Apr 13, 2021

Conversation

elvin-n
Contributor

@elvin-n elvin-n commented Apr 12, 2021

Taking the float3 data type as an example:
Using float3 to load data concurrently into a dense array shared between all
threads in a Metal threadgroup can lead to a data race between threads.
float3 has a size and alignment of 16 bytes, while the kernel assumes that it
copies 12 bytes to arbitrary, unaligned locations.
Using the packed_float3 data type solves the issue.
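
To illustrate the layout mismatch described above, here is a minimal host-side C++ sketch, not the actual TVM or Metal code. `Float3` and `PackedFloat3` are hypothetical stand-ins that mimic the size and alignment of Metal's `float3` and `packed_float3`. The sketch assumes that a `float3` store may write all 16 bytes, including the padding lane, which is made explicit here so the result is deterministic. The two stores run sequentially, so this only models what one thread's write can do to its neighbour's slot in a buffer with a 12-byte stride.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Metal's float3: 16 bytes, 16-byte aligned.
// The fourth lane models the padding that a full-width store may overwrite.
struct alignas(16) Float3 { float x, y, z, pad_; };

// Hypothetical stand-in for Metal's packed_float3: 12 bytes, 4-byte aligned.
struct PackedFloat3 { float x, y, z; };

static_assert(sizeof(Float3) == 16 && alignof(Float3) == 16, "padded layout");
static_assert(sizeof(PackedFloat3) == 12 && alignof(PackedFloat3) == 4, "packed layout");

// Write {v, v, v} into slot `idx` of a dense buffer that uses a 12-byte stride,
// copying sizeof(Vec3) bytes, the way a store of that vector type would.
template <typename Vec3>
void store_vec3(std::vector<float>& buf, std::size_t idx, float v) {
    Vec3 value{v, v, v};  // any remaining lane (the padding) is zero-initialized
    const std::size_t offset = idx * 3 * sizeof(float);
    assert(offset + sizeof(Vec3) <= buf.size() * sizeof(float));
    std::memcpy(reinterpret_cast<char*>(buf.data()) + offset, &value, sizeof(Vec3));
}

// Model two threads filling adjacent slots: slot 1 is written first, then slot 0.
template <typename Vec3>
std::vector<float> fill_two_elements() {
    std::vector<float> buf(8, -1.0f);  // 2 slots * 3 floats, plus slack for the wide store
    store_vec3<Vec3>(buf, 1, 2.0f);
    store_vec3<Vec3>(buf, 0, 1.0f);
    return buf;
}
```

With `fill_two_elements<Float3>()`, the later write to slot 0 spills its padding lane into `buf[3]`, the first component of slot 1, so that value is lost. With `fill_two_elements<PackedFloat3>()`, each store touches exactly 12 bytes and `buf[3]` keeps the value 2.0. On a GPU the order of the two stores is not fixed, which is why the corruption shows up as a race rather than a deterministic error.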

Contributor

jwfromm commented Apr 12, 2021

@tqchen do you have any ideas on tests we could add to exercise this change or would it be ok to merge as is?

Member

tqchen commented Apr 12, 2021

We can merge this as is, as long as there is a separate review, say from @echuraev

Contributor

@echuraev echuraev left a comment


LGTM.

@tqchen tqchen merged commit eb00a03 into apache:main Apr 13, 2021
Member

tqchen commented Apr 13, 2021

Thanks @elvin-nnov @echuraev !

echuraev pushed a commit to Deelvin/tvm that referenced this pull request Apr 23, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021
@dlexplorer dlexplorer deleted the amalyshe/fix_metal_accuracy branch June 3, 2021 09:04