In many places I've seen the following formula quoted as an estimate of the parameter covariance matrix $C$ in a nonlinear least-squares fit: $$C=\sigma^2H^{-1}$$ where $H$ is the Hessian matrix of the objective function at the fitted parameters, and $\sigma^2$ is estimated by $\sigma^2=\frac{\boldsymbol{r}^\top\boldsymbol{r}}{m-n}$, where $\boldsymbol{r}$ is the vector of residuals, $n$ is the number of parameters, and $m$ is the number of outputs (i.e. the length of $\boldsymbol{r}$).
However, I cannot find an explanation of where this formula comes from, in particular the factor $\sigma^2$ and its estimate. Is there a simple derivation of it, or at least a source for one?
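For concreteness, here is a minimal sketch of how I understand the formula to be applied in practice. Everything in it beyond the formula itself is my own assumption: the exponential model, the synthetic data, the function names, and in particular the choice of the objective $\tfrac12\boldsymbol{r}^\top\boldsymbol{r}$ together with the Gauss-Newton approximation $H\approx J^\top J$, where $J$ is the Jacobian of the model with respect to the parameters.

```python
import numpy as np

def model(x, p):
    # Hypothetical example model: y = a * exp(b * x)
    a, b = p
    return a * np.exp(b * x)

def jacobian(x, p):
    # Derivatives of the model with respect to a and b, one row per data point
    a, b = p
    e = np.exp(b * x)
    return np.column_stack([e, a * x * e])

def fit_with_covariance(x, y, p0, iterations=50):
    # Plain Gauss-Newton iteration (no damping), enough for this small example
    p = np.asarray(p0, dtype=float)
    for _ in range(iterations):
        r = y - model(x, p)
        J = jacobian(x, p)
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        p = p + step

    r = y - model(x, p)             # residuals at the fitted parameters
    J = jacobian(x, p)
    m, n = J.shape                  # m residuals, n parameters
    H = J.T @ J                     # ASSUMPTION: Gauss-Newton Hessian of (1/2) r^T r
    sigma2 = (r @ r) / (m - n)      # sigma^2 = r^T r / (m - n)
    C = sigma2 * np.linalg.inv(H)   # C = sigma^2 H^{-1}
    return p, C, sigma2

# Synthetic data with known noise level 0.05 (so sigma^2 should be near 0.0025)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = model(x, (2.0, -1.5)) + 0.05 * rng.standard_normal(x.size)

p_hat, C, sigma2 = fit_with_covariance(x, y, p0=(1.0, -1.0))
print("parameters:", p_hat)
print("standard errors:", np.sqrt(np.diag(C)))
print("sigma^2 estimate:", sigma2)
```

Note that if the objective is taken to be $\boldsymbol{r}^\top\boldsymbol{r}$ rather than $\tfrac12\boldsymbol{r}^\top\boldsymbol{r}$, this approximate Hessian becomes $2J^\top J$, so the formula as quoted seems to depend on that convention.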