So I have a table that stores a bunch of traffic counts for various road segments across the country. I'm regularly getting real-time location data from vehicles that are moving around a given region and I'm trying to match the segments that correlate to the path it traveled. In order to fill in some gaps in where the vehicle has traveled, I'm using Mapbox's Matching API to snap to roads. This is good for getting a general route that the vehicle may have traveled, and is returned to us in an encoded polyline. In PostGIS I'm able to decode this polyline, use ST_Segmentize to fill in any gaps between 'snapped-to-roads' points (to avoid missing any small segments) and loop over these points to find the closest segment geometry they match to. This is fine, and it technically works, but it's really slow. I'm hoping to find a faster/better way of doing this.
Edit: In case anyone was wondering, the reason we break the segments into distances of no-greater-than .000032 is to make sure we capture even the smallest road-segments when breaking up the encoded polyline.
This is what I have so far (apologies for bad formatting):
CREATE OR REPLACE FUNCTION findASegment(polyline text)
returns table(t_index float4, t_xdsegid TEXT, t_distance DOUBLE PRECISION) as $$
DECLARE
node text;
iterator float4 := 1;
BEGIN
CREATE TEMPORARY TABLE temp_table (
t_index float4,
t_xdsegid TEXT,
t_distance DOUBLE PRECISION
) ON COMMIT DROP;
FOR node IN (SELECT ST_AsText((dp).geom) as node
FROM (SELECT (st_dumppoints(
ST_Segmentize(st_linefromencodedpolyline(polyline),
.000032))) AS dp) AS foo)
LOOP
INSERT INTO temp_table (t_index, t_xdsegid, t_distance)
select iterator as t_iterator, xdsegid, distance
FROM (SELECT bar.xdsegid, st_distance(geom, st_geomfromtext(node)) as distance from alabama as bar) as baz
ORDER BY distance ASC
LIMIT 1;
iterator := iterator + 1;
END LOOP;
RETURN query SELECT t.t_index, t.t_xdsegid, t.t_distance FROM temp_table t;
END; $$
language plpgsql;
SELECT *
FROM (SELECT distinct on (t_xdsegid) t_xdsegid, t_index
FROM findASegment('{hhkEzz|pOGQq@cC')
order by t_xdsegid, t_index) as p
ORDER BY p.t_index;
Here are some pictures as per-request:
So we initially are given two sets of lat/long coordinates. We use MapboxMatching to fill in points on a road that a vehicle could have traveled to get from point A to point B. Mapbox returns us an encoded polyline that looks like this on a map:
But, the issue with this, is that sometimes those points in the encoded polyline are too far apart. They may miss some of the smaller road-segments along that path. So we segmentize it to interpolate additional points into it.
Now it looks more like this:
Then I want to get all the possible segments it passed through from the geometry table. Each row in the table looks like this:
The reason we use distance to get the nearest segment from the table is because the snapped-to-roads geometries don't necessarily match up to the stored geometries. The geometries in the database may veer off the road slightly, making it to where contains isn't viable (to my knowledge). Hope this helps



JOIN LATERALin combination with the spatial index poweredKNNoperator<->would be the general pattern for an effektive Nearest Neighbor search. Check this answer for a working example that can easily be adopted. – geozelot Sep 19 '18 at 17:22