3

Is there a fast nearest neighbor search algorithm that generates the nearest neighbors, not based on Euclidean distances but based on geographic distances over a set of latitudes/longitudes. The fast nearest neighbor search based on Euclidean distances though is based on spatial indexing using R-Trees, K-D Trees etc. What about for Lat/Long distances?

hearse
  • 259
  • 2
  • 11

1 Answers1

3

On a sphere, you can use the nearest neighbor list from the Euclidean distance to get the correct points ordered by distance, because the Euclidean distance is less or equal to the geodesic distance. Once you have that list of points you can simply calculate the geodesic distance for the point of interest. So the fast k-nearest neighbor algorithms based on the Euclidean metric can be used.

Bort
  • 1,285
  • 7
  • 12
  • Is there any academic reference that supports this. Say, I had a dense dataset with lat long's collected to a floating point accuracy of only two decimals, am not sure this statement would be definitive then, as the locality and neighborhood order relations would change, I guess. – hearse Jun 12 '13 at 18:55
  • I don't have a reference at hand, however for a sphere this should be obvious. The Euclidean distance represents the chordal distance between two points a,b. This line segment connecting a,b in the embedding space does not lay in S2 except for the end points and this line segment connects both points from within the sphere. The geodesic distance represents the arc length of the shortest path connecting both points a,b lays entirely in S2. Hence, the Euclidean distance is always smaller or equal to the geodesic distance. It shouldn't be hard to prove rigorously with a bit differential geometry. – Bort Jun 13 '13 at 19:46
  • Am sure it is smaller being the straight line distance. My issue is with them not being isometric w.r.t each other when transformed. For example, a linear projection would distort the neighborhood relations, unless the transformation was non-linear from one metric to the other metric space. – hearse Jun 13 '13 at 20:06
  • I am not sure if I get you with linear projection. Can expand on what you intend to do? For kNN, you use the embedding space and the euclidean metric of that one. You obtain the order in which all points are closest to your point of interest. The actual geodesic distances are different but their relative ordering with respect to their real distance to the reference point is right and that is what matters for nearest neighbor search. – Bort Jun 13 '13 at 20:21
  • What makes you say definitively that the ordering is right? I have run a simple experiment with a collection of lats longs with 2 decimal places and the geodesic and Euclidean nearest neighbors are not the same, exactly-nor is the ordering. – hearse Jun 13 '13 at 20:29
  • Can you show how you calculate the distance with lattiude and longitude? It should be like this. – Bort Jun 13 '13 at 20:44
  • There are multiple approximations as well. I used a variant of the haversine approximation and the ordering is not preserved perfectly. – hearse Jun 13 '13 at 22:26