
Here's my understanding of the kernel trick. The motivation is to find a linear separator in a higher-dimensional space than the one your data live in (because the data are not linearly separable in the original space). You take the dot product of two data points and then apply a transformation to the result, saving you the time of applying the transformation to each of the data points going into the dot product.

Is this a decent summary or am I missing something?

2 Answers


There are plenty of references: for example here, or this one, which is less technical but still nicely explained. It is the other way around: problems that can be expressed in terms of dot products are amenable to being "kernelized", i.e. one can apply the kernel trick, yielding better versions of the algorithms (see the sketch below).

A really nice paper reviewing applications of kernelization is this one. Happy reading!
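
To illustrate what "kernelized" means, here is a minimal sketch (my own illustration, not taken from the linked references) of the classic example: the perceptron touches the data only through dot products $x_i^\top x_j$, so replacing each of them with $k(x_i, x_j)$ gives the kernel perceptron, which finds a linear separator in feature space without ever computing $\Phi$:

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    # RBF kernel: an inner product in an implicit, infinite-dimensional feature space.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def kernel_perceptron(X, y, kernel=rbf, epochs=10):
    """Perceptron with every dot product replaced by a kernel evaluation.
    The learned function is f(x) = sum_i alpha_i * y_i * kernel(X_i, x)."""
    n = len(X)
    alpha = np.zeros(n)  # per-point mistake counts (the dual weights)
    K = np.array([[kernel(a, b) for b in X] for a in X])  # Gram matrix
    for _ in range(epochs):
        for i in range(n):
            if y[i] * np.sum(alpha * y * K[:, i]) <= 0:  # misclassified
                alpha[i] += 1
    return alpha

# XOR labels: not linearly separable in the 2-D input space.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, 1, 1, -1])
alpha = kernel_perceptron(X, y)
f = lambda x: np.sum(alpha * y * np.array([rbf(xi, x) for xi in X]))
print([int(np.sign(f(x))) for x in X])  # [-1, 1, 1, -1]
```

No linear separator exists for XOR in the input space, yet the kernelized version classifies all four points correctly after a couple of epochs, because the separator lives in the RBF feature space.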

jpmuc

Yes. Simply put, $k(x, y) = \langle \Phi(x), \Phi(y) \rangle$ is the kernel trick: the inner product in the feature space is computed by evaluating the kernel in the input space.
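
To make that identity concrete, here is a minimal numerical check (my own sketch, not part of the original answer) using the degree-2 polynomial kernel $k(x, y) = (x^\top y)^2$, whose explicit feature map for 2-D inputs is $\Phi(x) = (x_1^2,\ x_2^2,\ \sqrt{2}\,x_1 x_2)$:

```python
import numpy as np

def phi(x):
    # Explicit feature map for the degree-2 polynomial kernel on 2-D inputs.
    return np.array([x[0]**2, x[1]**2, np.sqrt(2) * x[0] * x[1]])

def k(x, y):
    # Same quantity computed directly in the input space: (x . y)^2.
    return np.dot(x, y) ** 2

x, y = np.array([1., 2.]), np.array([3., 4.])
print(np.dot(phi(x), phi(y)))  # 121.0 -- inner product in feature space
print(k(x, y))                 # 121.0 -- kernel evaluation in input space
```

Both routes give the same number, but the kernel never materializes $\Phi(x)$ or $\Phi(y)$, which is the whole point when the feature space is very high- or infinite-dimensional.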

Memming