When performing k-nearest neighbor analysis on a large dataset, using a kd-tree can greatly speed up the search. I've researched this question but have not found an answer: can the use of a kd-tree introduce bias into the nearest neighbor search? Does the splitting that a kd-tree performs mean that certain groups of points get excluded from being nearest neighbors?
Follow-up questions: If there is potential to introduce bias, how can you test whether it is actually occurring?
Are there other algorithms that work better than a kd-tree for large datasets?
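To make the testing question concrete, here is the kind of check I have in mind: run the same queries through a kd-tree search and through an exhaustive brute-force search, then count how often the returned neighbors differ. This is only a minimal sketch on my part, not an established procedure. The toy kd-tree and the names `build_kdtree`, `knn_kdtree`, `knn_brute`, and `count_mismatches` are my own illustrations, not taken from any library, and the data is uniform random points rather than my real dataset.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, indices=None, depth=0):
    """Build a toy kd-tree as nested tuples: (point index, axis, left, right)."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])  # cycle through the coordinates
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2  # split at the median along this axis
    return (indices[mid], axis,
            build_kdtree(points, indices[:mid], depth + 1),
            build_kdtree(points, indices[mid + 1:], depth + 1))


def knn_kdtree(tree, points, query, k):
    """k-NN search with backtracking; returns indices sorted by distance."""
    heap = []  # max-heap of the best k so far, stored as (-squared distance, index)

    def visit(node):
        if node is None:
            return
        idx, axis, left, right = node
        d2 = sq_dist(points[idx], query)
        if len(heap) < k:
            heapq.heappush(heap, (-d2, idx))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, idx))
        diff = query[axis] - points[idx][axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # Cross the splitting plane only if a closer point could lie beyond it.
        if len(heap) < k or diff * diff <= -heap[0][0]:
            visit(far)

    visit(tree)
    return [idx for _, idx in sorted((-neg_d2, idx) for neg_d2, idx in heap)]


def knn_brute(points, query, k):
    """Reference answer: sort every point by distance to the query."""
    order = sorted(range(len(points)), key=lambda i: sq_dist(points[i], query))
    return order[:k]


def count_mismatches(n_points=500, n_queries=100, dim=3, k=5, seed=0):
    """Count queries where the kd-tree and brute-force neighbor lists differ."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(dim)) for _ in range(n_points)]
    tree = build_kdtree(points)
    mismatches = 0
    for _ in range(n_queries):
        query = tuple(rng.random() for _ in range(dim))
        # Exact ties in distance could reorder indices; they are not expected
        # with random floating-point coordinates.
        if knn_kdtree(tree, points, query, k) != knn_brute(points, query, k):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("queries with differing neighbors:", count_mismatches())
```

My understanding is that a search which backtracks across splitting planes like this should agree with brute force, so any mismatch on real data would point to an approximate search setting or an implementation issue rather than to the splitting itself. Is that reasoning correct?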
Many thanks