According to Bishop, the author of "Pattern Recognition and Machine Learning", we can optimize the hyperparameters of a Gaussian process by maximizing the likelihood function
$$p(\textbf{t}|\theta),$$ where $\textbf{t} = (t_1, \ldots, t_N)^T$ denotes the vector of target values corresponding to the input values $x_1, \ldots, x_N$, and $\theta$ the hyperparameters.
He then claims that the log likelihood function is given by the standard form for a multivariate Gaussian distribution:
$$\ln p(\textbf{t}|\theta) = -\frac{1}{2}\ln|C_N| - \frac{1}{2}\textbf{t}^T C_N^{-1}\textbf{t} - \frac{N}{2}\ln(2\pi)$$
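For concreteness, here is a minimal numerical sketch of how I read that expression (my own, not from Bishop; the RBF-plus-noise kernel and the hyperparameter values are arbitrary). It evaluates the closed form above and compares it with the log-density of a single zero-mean Gaussian vector with covariance $C_N$:

```python
# Minimal sketch (not from Bishop): compare the closed-form expression with the
# log-density of one zero-mean multivariate Gaussian sample with covariance C_N.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
N = 5
x = np.linspace(0.0, 1.0, N)

# Gram matrix C_N = K(x, x) + noise * I  (RBF kernel, hyperparameters chosen arbitrarily)
length_scale, signal_var, noise_var = 0.3, 1.0, 0.1
K = signal_var * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / length_scale**2)
C_N = K + noise_var * np.eye(N)

t = rng.multivariate_normal(np.zeros(N), C_N)   # one target vector t of length N

# Bishop's expression: -1/2 ln|C_N| - 1/2 t^T C_N^{-1} t - N/2 ln(2 pi)
sign, logdet = np.linalg.slogdet(C_N)
log_lik = -0.5 * logdet - 0.5 * t @ np.linalg.solve(C_N, t) - 0.5 * N * np.log(2 * np.pi)

# Same number as the log-pdf of a single N-dimensional zero-mean Gaussian
print(log_lik)
print(multivariate_normal(mean=np.zeros(N), cov=C_N).logpdf(t))
```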
For comparison, the multivariate Gaussian density for $X = [X_1, \ldots, X_n]^T$ is given by (https://cs229.stanford.edu/section/gaussians.pdf):
$$p(x|\mu,\Sigma) = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$$
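Taking the log of this density for a single observation $x$ gives:
$$\ln p(x|\mu,\Sigma) = -\frac{1}{2}\ln|\Sigma| - \frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu) - \frac{n}{2}\ln(2\pi)$$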
To me, the first equation is derived by simply taking the $\log$ of the second equation (keeping in mind that the mean is zero in the Gaussian process described by Bishop), without taking the product over the samples into account, i.e. $\prod_{k} p(\textbf{t}_k|\theta)$. I am not sure what I am missing here. As far as I know, taking the log of a Gaussian density alone is not enough to obtain the quantity that is maximized over the desired parameters.
To be more precise, would the likelihood used for the MLE of equation 2 not be:
$$\prod_{k=1}^{K} p(x^{(k)}|\mu,\Sigma) = \frac{1}{(2\pi)^{\frac{Kn}{2}}|\Sigma|^{\frac{K}{2}}}\exp\left(-\frac{1}{2}\sum_{k=1}^{K}(x^{(k)}-\mu)^T\Sigma^{-1}(x^{(k)}-\mu)\right)$$
which, after taking the log, is not equal to the first equation. Notice the $K$ in the exponents of the normalization constant and the sum inside the exponential. Thus, I would expect something like:
$$\ln p(\textbf{t}_1, \ldots, \textbf{t}_K|\theta) = -\frac{K}{2}\ln|C_N| - \frac{1}{2}\sum_{k=1}^{K}\textbf{t}_k^T C_N^{-1}\textbf{t}_k - \frac{KN}{2}\ln(2\pi)$$
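To illustrate what I mean, here is another quick numerical sketch (again my own, with an arbitrary covariance): the joint log likelihood of $K$ i.i.d. zero-mean vectors is the sum of the $K$ individual log-densities, which matches the sum-over-samples form above:

```python
# Sketch (my own, hypothetical setup): joint log likelihood of K i.i.d. target
# vectors, each N-dimensional with zero mean and shared covariance C_N.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
N, K = 5, 7
A = rng.standard_normal((N, N))
C_N = A @ A.T + N * np.eye(N)                            # arbitrary SPD covariance

T = rng.multivariate_normal(np.zeros(N), C_N, size=K)    # K samples, shape (K, N)

sign, logdet = np.linalg.slogdet(C_N)
quad = sum(t @ np.linalg.solve(C_N, t) for t in T)
log_lik = -0.5 * K * logdet - 0.5 * quad - 0.5 * K * N * np.log(2 * np.pi)

# Equals the sum of the K individual log-densities
print(log_lik)
print(multivariate_normal(mean=np.zeros(N), cov=C_N).logpdf(T).sum())
```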
This assumption seems to be supported by the accepted answer in this post: Maximum Likelihood Estimators - Multivariate Gaussian
For me, the first equation is just the log density of a zero-mean multivariate Gaussian distribution, not the likelihood that is maximized in the MLE.