Releases: project-asgard/kronmult
v2.0
updated with hackathon improvements
-- fewer __syncthreads() calls
-- unrolled the k loop
-- decided to leave m/n blocking in
v1.4.6a
try 0-based indexing inside kgemm loops to reduce integer unit usage
v1.4.3
trim locals, remove unused variables
v1.4
remove blocking over m and n to reduce integer instructions; performance increases by 20% without blocking on the 6d -l 4 -d 4 case.
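
To illustrate what "blocking over m and n" means, here is a minimal CPU-side sketch in C++. It is not the kronmult kernel code: the function names (gemm_blocked, gemm_plain), the idx helper, and the block size are hypothetical and chosen only for illustration. The blocked variant carries extra loop bounds and min() computations per tile, which is the kind of integer work the note above refers to. Both variants compute the same result; the sketch makes no attempt to reproduce the reported 20% figure.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Column-major index of entry (i, j) in a matrix with leading dimension ld.
// Hypothetical helper, not taken from kronmult.
inline int idx(int i, int j, int ld) { return i + j * ld; }

// Hypothetical blocked variant: C(m x n) += A(m x k) * B(k x n),
// tiled over m and n. Each tile needs extra integer work for its bounds.
void gemm_blocked(int m, int n, int k, const std::vector<double>& A,
                  const std::vector<double>& B, std::vector<double>& C,
                  int block = 4) {
  assert(block > 0);
  for (int jb = 0; jb < n; jb += block) {
    const int jend = std::min(jb + block, n);  // tile bound in n
    for (int ib = 0; ib < m; ib += block) {
      const int iend = std::min(ib + block, m);  // tile bound in m
      for (int j = jb; j < jend; ++j) {
        for (int i = ib; i < iend; ++i) {
          double acc = 0.0;
          for (int p = 0; p < k; ++p) {
            acc += A[idx(i, p, m)] * B[idx(p, j, k)];
          }
          C[idx(i, j, m)] += acc;
        }
      }
    }
  }
}

// Hypothetical unblocked variant: same arithmetic, no tile bookkeeping.
void gemm_plain(int m, int n, int k, const std::vector<double>& A,
                const std::vector<double>& B, std::vector<double>& C) {
  assert(m >= 0 && n >= 0 && k >= 0);
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < m; ++i) {
      double acc = 0.0;
      for (int p = 0; p < k; ++p) {
        acc += A[idx(i, p, m)] * B[idx(p, j, k)];
      }
      C[idx(i, j, m)] += acc;
    }
  }
}
```

Whether dropping the tiling helps depends on the matrix sizes and the hardware; the release notes show the choice being revisited, with m/n blocking left in for v2.0.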