
Using too low a value of $k$ leads to overfitting.

But how is overfitting prevented?

  • How do we make sure $k$ is not too low?
  • Are there any other precautions taken in $k$-NN that help prevent overfitting?

1 Answer


This relates to the number of samples you have and the noise in those samples.

For instance, even if you have two billion samples, using $k=2$ could easily lead to overfitting, even without much noise.

If you have noise, you need to increase the number of neighbors so that the decision is based on a region large enough to be reliable.
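
A minimal sketch of this effect (my illustration, not part of the original answer), using scikit-learn and synthetic data with label noise: with a small $k$, training accuracy stays near perfect while test accuracy drops, which is the overfitting being described.

```python
# Hypothetical illustration: small k fits the noise in the training set,
# so training accuracy is high but test accuracy is noticeably lower.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification data; flip_y=0.2 injects label noise.
X, y = make_classification(n_samples=2000, n_features=10, flip_y=0.2,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in (1, 5, 25, 101):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"k={k:3d}  train={knn.score(X_train, y_train):.3f}  "
          f"test={knn.score(X_test, y_test):.3f}")
```

With $k=1$ the training accuracy is trivially 1.0 (each training point is its own nearest neighbor), so the train/test gap makes the overfitting visible.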

But for a ballpark estimate, I would start with $k = \log(N)$, where $N$ is the number of samples, and increase $k$ depending on the level of noise in the samples.
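
A sketch of using that heuristic as a starting point (the candidate grid and the cross-validation step are my additions, not part of the answer):

```python
# Start at k ≈ log(N) per the heuristic, then increase k and let
# cross-validation judge; the multiplicative grid is an assumption.
import math

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=10, flip_y=0.2,
                           random_state=0)

k_start = max(1, round(math.log(len(X))))  # log(N) starting point, here ~8
candidates = [k_start, 2 * k_start, 4 * k_start, 8 * k_start]

for k in candidates:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"k={k:3d}  cv accuracy={scores.mean():.3f}")
```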

  • It seems like this answer has a number of unstated premises that it relies on. How does the number of samples $N$, number of features and noise level relate to the concept of overfitting? Are there any theorems about $k$-NN and these concepts which you can use to explain why $\ln(N)$ is a good starting point? – Sycorax May 14 '22 at 18:24
  • No theorem, at least as of the last time I checked a few years ago. It's just a heuristic that has worked well for me across different datasets. – Matthieu Brucher May 15 '22 at 19:19