Here is my problem: We have $\mathbf{D} \in \mathbb{R}^{m \times n}$, $\mathbf{W} \in \mathbb{R}^{m \times q}$, and $\mathbf{X} \in \mathbb{R}^{q \times n}$. Furthermore, $\mathbf{D} = \mathbf{W}\mathbf{X}$ (a normal matrix-matrix multiply, NOT an element-wise multiplication).

I am trying to derive the derivative of $f(\mathbf{D})$ w.r.t. $\mathbf{W}$, and the derivative of $f(\mathbf{D})$ w.r.t. $\mathbf{X}$.

The class notes this is taken from seem to indicate that $$ \frac{\partial f}{\partial \mathbf{W}} = \frac{\partial f}{\partial \mathbf{D}} \mathbf{X}^{T} \text{ and that } \frac{\partial f}{\partial \mathbf{X}} = \mathbf{W}^{T} \frac{\partial f}{\partial \mathbf{D}}. $$
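For what it's worth, the two formulas can be checked numerically. The sketch below is only an illustration under assumptions that are not in my notes: it picks $f(\mathbf{D}) = \tfrac{1}{2}\sum_{ij} D_{ij}^2$ (so that the gradient of $f$ with respect to $\mathbf{D}$ is simply $\mathbf{D}$), uses small made-up sizes, and the helper `numerical_grad` is a hypothetical name. Note that the shapes only work out if the factor next to $\mathbf{W}^{T}$ is the $m \times n$ gradient with respect to $\mathbf{D}$.

```python
import numpy as np

# Assumed example sizes (not from the notes): D is m x n, W is m x q, X is q x n.
rng = np.random.default_rng(0)
m, q, n = 3, 4, 5
W = rng.standard_normal((m, q))
X = rng.standard_normal((q, n))


def f(D):
    # Assumed example function: f(D) = 0.5 * sum of squared entries, so df/dD = D.
    return 0.5 * np.sum(D ** 2)


def numerical_grad(func, A, eps=1e-6):
    # Central finite differences, one entry of A at a time (hypothetical helper).
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        A_plus, A_minus = A.copy(), A.copy()
        A_plus[idx] += eps
        A_minus[idx] -= eps
        G[idx] = (func(A_plus) - func(A_minus)) / (2 * eps)
    return G


D = W @ X
df_dD = D                 # gradient of the assumed f with respect to D, shape (m, n)
df_dW = df_dD @ X.T       # (m, n) @ (n, q) -> (m, q), same shape as W
df_dX = W.T @ df_dD       # (q, m) @ (m, n) -> (q, n), same shape as X

# Compare against finite differences of f(W X) taken directly in W and in X.
num_dW = numerical_grad(lambda A: f(A @ X), W)
num_dX = numerical_grad(lambda A: f(W @ A), X)

print(np.allclose(df_dW, num_dW, atol=1e-5))
print(np.allclose(df_dX, num_dX, atol=1e-5))
```

The shape comments are the main point: the transposes are the only arrangement in which the products are defined and give back something shaped like $\mathbf{W}$ and $\mathbf{X}$ respectively.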

I understand the chain rule, but I struggle to see

  1. why there is a transpose, and

  2. how come the incoming gradient is sometimes on the left (as in the first formula) and sometimes on the right (as in the second). Please try to explain as cleanly and simply as possible!
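To make both points concrete, here is a sketch of the entry-by-entry chain rule, assuming $f$ is scalar-valued and that each gradient is laid out with the same shape as the matrix it is taken with respect to (both are assumptions on my part). Writing the product out in components:

$$ D_{ij} = \sum_{k} W_{ik} X_{kj} \quad\Rightarrow\quad \frac{\partial D_{ij}}{\partial W_{ab}} = \begin{cases} X_{bj} & \text{if } i = a \\ 0 & \text{otherwise} \end{cases}, \qquad \frac{\partial D_{ij}}{\partial X_{ab}} = \begin{cases} W_{ia} & \text{if } j = b \\ 0 & \text{otherwise.} \end{cases} $$

Summing over all entries of $\mathbf{D}$ then gives

$$ \frac{\partial f}{\partial W_{ab}} = \sum_{i,j} \frac{\partial f}{\partial D_{ij}} \frac{\partial D_{ij}}{\partial W_{ab}} = \sum_{j} \frac{\partial f}{\partial D_{aj}} X_{bj} = \sum_{j} \left(\frac{\partial f}{\partial \mathbf{D}}\right)_{aj} \left(\mathbf{X}^{T}\right)_{jb} = \left(\frac{\partial f}{\partial \mathbf{D}} \mathbf{X}^{T}\right)_{ab}, $$

$$ \frac{\partial f}{\partial X_{ab}} = \sum_{i,j} \frac{\partial f}{\partial D_{ij}} \frac{\partial D_{ij}}{\partial X_{ab}} = \sum_{i} W_{ia} \frac{\partial f}{\partial D_{ib}} = \sum_{i} \left(\mathbf{W}^{T}\right)_{ai} \left(\frac{\partial f}{\partial \mathbf{D}}\right)_{ib} = \left(\mathbf{W}^{T} \frac{\partial f}{\partial \mathbf{D}}\right)_{ab}. $$

In the first sum the index $j$ being summed is the column index of both factors, so $\mathbf{X}$ has to be transposed and placed on the right; in the second the summed index $i$ is the row index of both factors, so $\mathbf{W}$ has to be transposed and placed on the left.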

wrek