I am working on a project using kernel PCA with a Gaussian kernel, and I am trying to understand part of the theory. According to Mercer's theorem, I know that since the RBF kernel is PDS (positive definite symmetric), there exists a reproducing kernel map $\phi_x$ and an associated reproducing kernel Hilbert space such that $\phi_x = K(x,\cdot)$ and $K(x,y) = \phi(x)^T \phi(y)$.
It is said that the function $\phi$ is not unique, and I understand that, for example, if we want an empirical kernel map $\phi(x)$ that agrees with a set of inner products (the Gram matrix) on a finite data set, we can solve for it explicitly with the RBF kernel. But I am confused about why this function wasn't already fixed when we said that $\phi_x = K(x,\cdot)$. Doesn't that already give a definition of $\phi_x$, namely $\phi_x(y) = K(x,y)$?
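To make concrete what I mean by the empirical kernel map and its non-uniqueness, here is a minimal NumPy sketch. The toy data, the value of `gamma`, and the helper name `rbf_gram` are my own choices for illustration, not taken from any reference. It builds one finite-dimensional feature matrix from the eigendecomposition of the Gram matrix, builds a second one by rotating the first with an arbitrary orthogonal matrix, and checks that both reproduce the same inner products.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix of the Gaussian kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))          # toy data set (assumption)
K = rbf_gram(X, gamma=1.0)

# Empirical feature map 1: K = V diag(w) V^T, so row i of Phi_eig plays
# the role of phi(x_i) and Phi_eig @ Phi_eig.T recovers K.
w, V = np.linalg.eigh(K)
Phi_eig = V * np.sqrt(np.clip(w, 0.0, None))

# Empirical feature map 2: rotate the features by any orthogonal Q.
# Inner products are unchanged because Q @ Q.T is the identity.
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
Phi_rot = Phi_eig @ Q

same_gram_eig = np.allclose(Phi_eig @ Phi_eig.T, K)
same_gram_rot = np.allclose(Phi_rot @ Phi_rot.T, K)
same_features = np.allclose(Phi_eig, Phi_rot)
print(same_gram_eig, same_gram_rot, same_features)
```

As far as I can tell, both feature matrices should match the Gram matrix while differing from each other, which is the non-uniqueness I am referring to. What I don't see is how these finite-dimensional vectors relate to the function $K(x,\cdot)$.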
I guess my question boils down to: what is the relationship between the reproducing kernel map $\phi_x$, which is a function with values $\phi_x(y)$, and the feature representation of $x$, $\phi(x)$?
Any help on this would be greatly appreciated!