1

I'm looking to select the distance from a given location to the nearest location in a joining table, and order by it. In my use case I have a listings table which is joined to locations table through listing_location. The locations table has two columns for coordinate - latitude and longitude (both doubles).

SELECT 
  listings.*
  ST_Distance(ST_Point(?, ?), ST_ClosestPoint(:unsure:, ST_Point(?, ?)), FALSE) AS distance)
JOIN listing_location
  ON listing_location.listing_id = listings.id
JOIN locations
  ON locations.id = listing_location.location_id
ORDER BY distance ASC

The question marks would be bound to the longitude and latitude of the given location, but I'm unsure what to pass into :unsure: in order to select the joined locations. Am I missing something, or thinking about this problem in the wrong way? Is this problem solvable when my location data is stored as latitude/longitude double columns?

Dwight
  • 113
  • 4

1 Answers1

2

You want to utilize the <-> operator for index driven (K) Nearest Neighbor searches; this core PostgreSQL operator is extended in PostGIS to specifically use a (spatial) GIST index when in ORDER BY:

SELECT ls.*,
       ST_Distance(ll.geom, ST_SetSRID(ST_MakePoint(?, ?), <SRID>)) AS dist
FROM   listings AS ls
JOIN   listing_location AS ll_r
  ON   ls.id = ll_r.listing_id
JOIN   locations AS ll
  ON   ll.id = ll_r.location_id
ORDER BY
       ST_SetSRID(ST_MakePoint(ll.longitude, ll.latitude), <SRID>) <-> ST_SetSRID(ST_MakePoint(?, ?), <SRID>)
-- LIMIT <X>
;

Notes:

  • to actually use and index, you need to have one in place; you have two options here

    • add a GEOMETRY column (e.g. named geom), which is the intended behavior in PostGIS
      ALTER TABLE locations
        ADD COLUMN geom GEOMETRY(<TYPE>, <SRID>)
      ;
      

      UPDATE locations SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), <SRID>) ;

      and
      CREATE INDEX ON locations
        USING GIST (geom)
      ;
      
    • create a functional index on the creation expression (yes, double brackets)
      CREATE INDEX ON locations
        USING GIST ( (ST_SetSRID(ST_MakePoint(ll.longitude, ll.latitude), <SRID>)) )
      ;
      
  • work correctly with a geographic <SRID>; consider that

    • longitudinal surface distance between two pairs of longitude/latitude is a function of the cosines of their latitudes, meaning that one degree of longitude has a different surface length at different latitudes
    • the above also implies that degree based distances are useless without the latitudinal context
    • PostGIS' GEOMETRY type treats geographic coordinates as in the Cartesian plane rather than on a sphere/spheroid, and returns the base unit of the given CRS from ST_Distance (which is degree for geographical reference systems)

    You can now either

    • project your coordinates to a suitable planar reference (projection), or
    • use the GEOGRAPHY type, which is designed to specifically tackle these issues and provide an easy interface for highly precise metrics on the geoidal surface - at the slight cost of performance due to the heavier calculations, and a smaller set of functions supporting it.

    With this in mind, and favouring the GEOGRAPHY type over projections, your query and index creation would look like (note the cast (::) to GEOGRAPHY):

    CREATE INDEX ON locations
      USING GIST ( (ST_SetSRID(ST_MakePoint(ll.longitude, ll.latitude), <SRID>)::GEOGRAPHY) )
    ;
    

    --

    SELECT ls.*, ST_Distance(ll.geom::GEOGRAPHY, ST_SetSRID(ST_MakePoint(?, ?), <SRID>)::GEOGRAPHY) AS dist FROM listings AS ls JOIN listing_location AS ll_r ON ls.id = ll_r.listing_id JOIN locations AS ll ON ll.id = ll_r.location_id ORDER BY ST_SetSRID(ST_MakePoint(ll.longitude, ll.latitude), <SRID>)::GEOGRAPHY <-> ST_SetSRID(ST_MakePoint(?, ?), <SRID>)::GEOGRAPHY -- LIMIT <X> ;


    You can also decide to

    • add a GEOMETRY column as above
    • put a functional index in place using the cast to GEOGRAPHY
      CREATE INDEX ON locations
        USING GIST ( (geom::GEOGRAPHY) )
      ;
      
    • and again cast the geometry column to GEOGRAPHY in your queries on-the-fly

    This has the advantage in the long run, since most client applications have a hard time working with GEOGRAPHY.


Extra Note:

After large updates, including adding columns and indexes, make sure you run

VACUUM ANALYZE <table>;

to update the statistics for the query planner!


Related:

geozelot
  • 30,050
  • 4
  • 32
  • 56
  • Wow, this is an amazing answer - thank you so much for going into so much detail here! I really appreciate the explanation behind indexing and using the proper geometry types. – Dwight Mar 17 '21 at 11:10