Type issue in gkmx() #3

ChenhanYu · 2016-11-03T22:06:32Z

/frame/gkmx.hpp

Currently, gkmx<...,TA,TB,TC,TV> has the following issues:

TV cannot be different from TC.
Type rules in macro kernels may be wrong.

By definition, gkmx only needs to be passed an m-by-n matrix C of type TC, but when k > KC the temporary rank-KC update must be stored as an m-by-NC matrix of type TV. It is very unpleasant to allocate this temporary buffer, but so far I have not found a way to avoid it. GKRM will have the same issue later.
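To make the buffer problem concrete, here is a minimal sketch, not the code in /frame/gkmx.hpp. It assumes the k dimension is processed in panels of width KC and that a final step converts the accumulated TV value to TC only after the last panel. The names `blocked_update` and `finalize` are hypothetical, and for simplicity the buffer is m-by-n rather than m-by-NC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the actual gkmx implementation.
// A is m-by-k, B is k-by-n, C is m-by-n, all row-major.
// `finalize` converts a fully accumulated TV value into the stored TC value.
template <typename TA, typename TB, typename TC, typename TV, typename Finalize>
void blocked_update(std::size_t m, std::size_t n, std::size_t k, std::size_t KC,
                    const std::vector<TA>& A, const std::vector<TB>& B,
                    std::vector<TC>& C, Finalize finalize) {
  assert(KC > 0);
  assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);

  // Temporary buffer of type TV: partial results must survive between
  // KC panels, and C (type TC) may not be able to hold them.
  std::vector<TV> V(m * n, TV(0));

  for (std::size_t pc = 0; pc < k; pc += KC) {
    // Width of the current panel (the last one may be narrower than KC).
    const std::size_t pb = (k - pc < KC) ? (k - pc) : KC;
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        TV acc = V[i * n + j];
        for (std::size_t p = pc; p < pc + pb; ++p) {
          acc += static_cast<TV>(A[i * k + p]) * static_cast<TV>(B[p * n + j]);
        }
        V[i * n + j] = acc;  // rank-KC update stored as TV, not TC
      }
    }
  }

  // Only now is it safe to convert to the output type.
  for (std::size_t i = 0; i < m * n; ++i) C[i] = finalize(V[i]);
}
```

With TC = int and TV = double, for example, writing each panel's partial sum straight into C would truncate it before the next panel is added, which is why the sketch keeps a separate TV buffer.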

Maybe we can increase KC so that k is never larger than KC whenever TC != TV is detected.
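One possible shape for that workaround, as a rough sketch only: `choose_kc` and `default_kc` are hypothetical names, not anything in gkmx.hpp. The idea is to keep the tuned KC when the types match and to widen it to k otherwise, so the whole reduction fits in a single panel.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper, not part of gkmx.hpp.
// Returns the panel width to use along the k dimension.
template <typename TC, typename TV>
std::size_t choose_kc(std::size_t k, std::size_t default_kc) {
  assert(default_kc > 0);
  // When TC and TV differ, one panel must cover all of k so that no
  // intermediate rank-KC update ever has to be stored.
  if (!std::is_same<TC, TV>::value && k > default_kc) return k;
  return default_kc;
}
```

The likely cost is that a KC this large no longer matches the cache-driven blocking the default was tuned for, so this trades performance for avoiding the temporary buffer.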

Note that gkmx_gpu.hpp does not have this problem. The GEMM algorithm on the GPU does not store the rank-KC update back to global memory. The L1 cache on the GPU can be manually controlled; thus, storing the update back is unnecessary.

ChenhanYu added the bug label Nov 3, 2016

ChenhanYu self-assigned this Nov 3, 2016
