kolbusa | 3 years ago
> Let's simplify the problem and implicitly transpose the matrix multiplication. Both A and B (our inputs) will have K (our reduction dimension) as the leading dimension. This doesn't really matter much in practice, but it simplifies our code a lot.
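The code is:
C[n * 16 + m] += A[k * 16 + m] * B[k * 16 + n];
Which means that *m* is actually the contiguous (stride-1) dimension of A, with *k* at stride 16, and for B it is *n*, again with *k* at stride 16. So K is the strided dimension of both inputs, not the contiguous one.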
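To make the strides concrete, here is a minimal sketch of my own (not the article's code) built around the quoted line. The names `DIM`, `offset_km` and `matmul_16` are hypothetical, the 16x16 size is assumed from the constant 16 in the indexing, and the loop order is just one plausible choice.

```c
#include <stddef.h>

enum { DIM = 16 };  /* assumed tile size, taken from the 16 in the quoted line */

/* Flat offset of element (k, inner) in A or B as indexed in the quoted line:
   k moves in steps of DIM elements, the inner index (m or n) in steps of 1. */
static size_t offset_km(size_t k, size_t inner) {
    return k * DIM + inner;
}

/* Accumulates C[n][m] += A[k][m] * B[k][n] over all k, using exactly the
   indexing from the quoted line. The loop order here is an assumption. */
static void matmul_16(const float *A, const float *B, float *C) {
    for (size_t n = 0; n < DIM; ++n)
        for (size_t k = 0; k < DIM; ++k)
            for (size_t m = 0; m < DIM; ++m)
                C[n * DIM + m] += A[k * DIM + m] * B[k * DIM + n];
}
```

Stepping `m` (or `n`) by one moves one element in memory, while stepping `k` by one moves 16 elements, which is the point about which dimension is contiguous.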