
I've just (to my embarrassment) come across gemm3m, a BLAS-like extension of the matrix-matrix product subroutine gemm in Intel MKL. This subroutine (specifically, cgemm3m and zgemm3m) multiplies complex-valued matrices using fewer real matrix multiplications than the conventional approach: three real products instead of four, at the cost of a few extra matrix additions.
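For concreteness, here is a minimal pure-Python sketch of the textbook "3M" formulation, which I assume is what gemm3m is based on; it is not taken from the MKL sources. The helper names (`matmul`, `add`, `sub`, `gemm3m_sketch`, `gemm4m_reference`) are hypothetical and exist only for this illustration. It forms $C_1=A_1B_1-A_2B_2$ and $C_2=(A_1+A_2)(B_1+B_2)-A_1B_1-A_2B_2$, and compares the result with the conventional four-product formula.

```python
import random


def matmul(X, Y):
    """Plain triple-loop product of two real matrices stored as lists of rows."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][p] * Y[p][j] for p in range(inner)) for j in range(cols)]
            for i in range(rows)]


def add(X, Y):
    """Elementwise sum of two real matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Elementwise difference of two real matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def gemm3m_sketch(A1, A2, B1, B2):
    """Textbook 3M product (assumed formulation): 3 real products, 5 additions."""
    T1 = matmul(A1, B1)
    T2 = matmul(A2, B2)
    T3 = matmul(add(A1, A2), add(B1, B2))
    C1 = sub(T1, T2)           # real part
    C2 = sub(sub(T3, T1), T2)  # imaginary part, obtained by subtraction
    return C1, C2


def gemm4m_reference(A1, A2, B1, B2):
    """Conventional complex product: 4 real products, 2 additions."""
    C1 = sub(matmul(A1, B1), matmul(A2, B2))
    C2 = add(matmul(A1, B2), matmul(A2, B1))
    return C1, C2


def max_abs_diff(X, Y):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


if __name__ == "__main__":
    random.seed(0)
    n = 50

    def rand():
        return [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

    A1, A2, B1, B2 = rand(), rand(), rand(), rand()
    C1_3m, C2_3m = gemm3m_sketch(A1, A2, B1, B2)
    C1_4m, C2_4m = gemm4m_reference(A1, A2, B1, B2)
    print("max |C1_3m - C1_4m| =", max_abs_diff(C1_3m, C1_4m))
    print("max |C2_3m - C2_4m| =", max_abs_diff(C2_3m, C2_4m))
```

In this formulation the real part is computed exactly as in the conventional method, while the imaginary part is recovered by subtracting two products from a third, so any difference between the two approaches should show up in $C_2$ only.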

The gemm3m documentation claims that it

...reduces the time spent in matrix operations by 25%, resulting in significant savings in compute time for large matrices.

Looking at the error analysis provided in the Application Notes, I don't see anything "criminal":

$$ \hat{C}=\text{fl}(C_1+iC_2)=\text{fl}\big((A_1+iA_2)(B_1+iB_2)\big)=\hat{C}_1+i\hat{C}_2 $$

$$ ||\hat{C}_1-C_1||\leq 2(n+1)u||A||_\infty||B||_\infty+\mathcal O(u^2) $$

$$ ||\hat{C}_2-C_2||\leq 4(n+4)u||A||_\infty||B||_\infty+\mathcal O(u^2) $$

where $A,B,C\in\mathbb C^{n\times n}$ are complex matrices; $A_{1,2},B_{1,2},C_{1,2}\in\mathbb R^{n\times n}$ are their real and imaginary parts, respectively; $i=\sqrt{-1}$; and $\hat{C}\in\mathbb C^{n\times n}$ and $\hat{C}_{1,2}\in\mathbb R^{n\times n}$ are the results of floating-point operations on $A$ and $B$ according to the gemm3m matrix-matrix multiplication algorithm. Here $|u|<\epsilon_\text{mach}$, provided the floating-point arithmetic is IEEE-754 and no underflow or overflow occurs.

So, is there any catch in using zgemm3m instead of the regular zgemm? Are there situations where I should avoid zgemm3m?

Anton Menshov
