Commit
Fix Metal accuracy problem caused by <dtype>3 vectors usage
Taking float3 as an example: using the float3 data type to concurrently load data into a dense array shared between all threads in a Metal threadgroup can lead to a data race between threads. float3 has a size and alignment of 16 bytes, while the kernel assumes that it copies 12 bytes to arbitrary, unaligned locations. Using the packed_float3 data type solves the issue.
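The overlap can be illustrated with a minimal host-side sketch in plain C++. This is not the kernel from this commit: Float3, PackedFloat3, store and simulate are hypothetical stand-ins that only model the sizes of the Metal types (16 bytes for float3, 12 bytes for packed_float3), and the two "threads" run one after the other so that the result is deterministic.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for Metal's float3: 16 bytes, 16-byte aligned.
// The fourth lane models the padding that a full-width store also writes.
struct alignas(16) Float3 { float x, y, z, pad; };

// Hypothetical stand-in for Metal's packed_float3: 12 bytes, 4-byte aligned.
struct PackedFloat3 { float x, y, z; };

// Store element i into a dense array with a stride of 3 floats (12 bytes).
// The number of bytes written is sizeof(Vec): 16 for Float3, 12 for PackedFloat3.
template <typename Vec>
void store(float* shared, std::size_t i, const Vec& v) {
    std::memcpy(shared + 3 * i, &v, sizeof(Vec));
}

// Two "threads" fill neighbouring elements of the shared array.
// Thread 1 writes first, then thread 0; on a GPU this order is not defined.
template <typename Vec>
std::array<float, 8> simulate() {
    std::array<float, 8> shared{};                  // 2 elements plus slack
    store(shared.data(), 1, Vec{4.0f, 5.0f, 6.0f}); // "thread 1"
    store(shared.data(), 0, Vec{1.0f, 2.0f, 3.0f}); // "thread 0"
    return shared;
}
```

With Float3, the store from "thread 0" covers 16 bytes and overwrites the x component that "thread 1" had already written at index 3. With PackedFloat3, each store covers exactly 12 bytes and the neighbouring element stays intact. On a real GPU the order of the two stores is not fixed, which is why the corruption would show up as an intermittent accuracy problem.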