Some disadvantages of LOWESS vs GP are the following:
- LOWESS breaks down even for a moderate input dimension, while GP can still work in this case if the right kernel is selected (see the sketch after this list)
- For LOWESS to work we need a dense training sample of points over the whole design space; for GP the requirements are less strict
- With LOWESS we need additional tricks to control outliers
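To make the dimensionality and density points concrete, here is a minimal sketch, assuming `statsmodels` and `scikit-learn` as the toolchain (neither library is named above, so treat the API choice as an illustration): on a 5-dimensional toy problem the standard LOWESS implementation cannot be applied directly, since it expects a one-dimensional predictor, while a GP with an RBF kernel takes the multivariate input as-is.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))                     # 5-dimensional design
y = np.sin(X.sum(axis=1)) + 0.05 * rng.standard_normal(200)

# The GP handles the multivariate input directly.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
gp.fit(X, y)
print(gp.predict(X[:3]))

# LOWESS, as implemented in statsmodels, expects a 1-D exog:
# from statsmodels.nonparametric.smoothers_lowess import lowess
# lowess(y, X, frac=0.3)  # not applicable to a 5-D design as-is
```

Even if a multivariate local-regression variant were used, 200 points spread over 5 dimensions leave every local neighbourhood sparse, which is exactly why LOWESS needs a dense sample over the whole design space.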
Some advantages of GP vs LOWESS are the following:
- Both the prediction and the uncertainty estimate have analytical forms: at each point $x$ we have a posterior distribution given the data, $\mathcal{N}(\mu(x), \sigma^2(x))$, where both $\mu(x)$ and $\sigma^2(x)$ have closed forms and $\sigma^2(x)$ is a natural uncertainty estimate (explicit formulas and a short sketch follow this list)
- GP gives control over the smoothness of the model through the selection of a proper kernel
- Straightforward estimation of parameters: we write the likelihood of the data and maximize it with respect to the kernel parameters
- GP regression can be treated as a special case of selecting a function from an RKHS (reproducing kernel Hilbert space)
- Theoretical justification: the GP prediction is a solution of the Kolmogorov-Wiener equations, so it minimizes the $L_2$ error over all possible estimators under the assumption that the data are jointly Gaussian
- Interpolation without giving up the smoothness of the model: in the noise-free case the predictions $\hat{y}(x_i)$ at the training points equal the observed values $y(x_i)$.
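For the first advantage, the analytical forms are the standard GP regression equations: with training inputs $X$, observations $y$, kernel $k$, kernel matrix $K(X, X)$ and noise variance $\sigma_n^2$, the posterior at a new point $x$ is

$$
\mu(x) = k(x, X)\bigl[K(X, X) + \sigma_n^2 I\bigr]^{-1} y,
\qquad
\sigma^2(x) = k(x, x) - k(x, X)\bigl[K(X, X) + \sigma_n^2 I\bigr]^{-1} k(X, x).
$$

The short sketch below illustrates the remaining bullets with scikit-learn (again an assumed library choice): the kernel selection controls smoothness, `.fit()` maximizes the log marginal likelihood over the kernel hyperparameters, `predict(..., return_std=True)` returns the analytical $\mu(x)$ and $\sigma(x)$, and with (numerically) zero noise the posterior mean interpolates the training data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

X_train = np.linspace(0.0, 1.0, 7).reshape(-1, 1)
y_train = np.sin(2 * np.pi * X_train).ravel()

# Kernel selection controls smoothness: Matern nu=2.5 gives twice
# (mean-square) differentiable sample paths, while an RBF kernel
# would give infinitely smooth ones.
gp = GaussianProcessRegressor(
    kernel=Matern(nu=2.5),
    alpha=1e-10,             # essentially noise-free, so the GP interpolates
    n_restarts_optimizer=5,  # hyperparameters are fitted inside .fit() by
)                            # maximizing the log marginal likelihood
gp.fit(X_train, y_train)
print(gp.kernel_)                          # fitted kernel hyperparameters
print(gp.log_marginal_likelihood_value_)   # the maximized likelihood

# Analytical posterior mean and standard deviation at new points.
X_test = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
mu, sigma = gp.predict(X_test, return_std=True)

# At the training points the noise-free GP interpolates the data and the
# posterior uncertainty collapses to (numerically) zero.
mu_train, sigma_train = gp.predict(X_train, return_std=True)
print(np.max(np.abs(mu_train - y_train)))  # ~ 0 up to numerical error
print(np.max(sigma_train))                 # ~ 0 up to numerical error
```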
IMO the most important points are the limitation of LOWESS to moderate dimensions and the absence of an analytical, principled global prediction and uncertainty estimate. Also, I believe that in practice GP just works better than most kernel methods, including LOWESS.