R - Identify on which street a point is located based on geographic coordinates

Question

Given a tibble s2map that contains street names and the geographical coordinates of points located on these streets,

head(s2map)
# A tibble: 6 × 13
# Groups:   Street [1]
  Street                 WayID Idx   NodeId highway is_in…¹ railway addr.…² is_in publi…³   lat   lon dista…⁴
  <chr>                  <chr> <chr> <chr>  <chr>   <chr>   <chr>   <chr>   <chr> <chr>   <dbl> <dbl>   <dbl>
1 Aleea Alexandru Tziga… 8421… 1     97961… reside… 0       0       0       0     0        44.4  26.1   NA   
2 Aleea Alexandru Tziga… 8421… 2     44714… reside… 0       0       0       0     0        44.4  26.1   34.7 
3 Aleea Alexandru Tziga… 8421… 3     44714… reside… 0       0       0       0     0        44.4  26.1    4.45
4 Aleea Alexandru Tziga… 8421… 4     44714… reside… 0       0       0       0     0        44.4  26.1    4.50
5 Aleea Alexandru Tziga… 8421… 5     97961… reside… 0       0       0       0     0        44.4  26.1    9.39
6 Aleea Alexandru Tziga… 8421… 6     97961… reside… 0       0       0       0     0        44.4  26.1   11.6

together with a data.table GPS_1 that contains a list of geographical coordinates

head(GPS_1)
         Date     Time      Lat      Lon Speed(km/h)
1: 2020-08-01 00:00:05 44.47336 26.14126          49
2: 2020-08-01 00:00:31 44.47613 26.13992          31
3: 2020-08-01 00:00:36 44.47630 26.13971          14
4: 2020-08-01 00:00:38 44.47629 26.13962          13
5: 2020-08-01 00:00:45 44.47623 26.13942           8
6: 2020-08-01 00:00:49 44.47628 26.13934           8

I want to determine on which street is located each point in GPS_1. I have tried several things, given the examples found here. However, applying st_intersects returned no intersection. I thought the reason was the fact that I had two lists of points - so I tried creating sf_linestrings for each of the streets based on min and max coordinates. Here is the code I tried:

# create summarized obj for s2map
 aux =  s2map %>%
          group_by(Street) %>%
          summarise(lat_min = min(lat), lat_max=max(lat),
          lon_min = min(lon), lon_max = max(lon),
          total_distance = sum(distance[!is.na(distance)]))
Create list of simple feature geometries (linestrings)
begin.coord = data.frame(lon = aux$lon_min, lat = aux$lat_min)
 end.coord = data.frame(lon= aux$lon_max, lat = aux$lat_max)
Create list of simple feature geometries (linestrings)
l_sf <- vector("list", nrow(begin.coord))
 for (i in seq_along(l_sf)){
   l_sf[[i]] <- st_linestring(as.matrix(rbind(begin.coord[i, ], end.coord[i,])))
 }
Create simple feature geometry list column
l_sfc <- st_sfc(l_sf, crs = "+proj=longlat +datum=WGS84")
Convert to sp object if needed
lines_sp <- as(l_sfc, "Spatial")
- create a sf object from the sfc list of linestrings
lines_sf = st_sf(id = aux$Street, geometry = l_sfc)
pnts <- st_as_sf(GPS_1, coords=c("Lon","Lat"), crs=4326)
 pnts_sf <- st_as_sf(pnts, coords=c("Lon","Lat"), crs=4326)
 pnts_sf %>% mutate(
   intersection = as.integer(st_intersects(geometry, lines_sf))
   , area = if_else(is.na(intersection), '', lines_sf$id[intersection])
 )

and the result:

Simple feature collection with 1236 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 26.11976 ymin: 44.43174 xmax: 26.18438 ymax: 44.47812
Geodetic CRS:  WGS 84
First 10 features:
         Date     Time Speed(km/h)                  geometry intersection area
1  2020-08-01 00:00:05          49 POINT (26.14126 44.47336)           NA     
2  2020-08-01 00:00:31          31 POINT (26.13992 44.47613)           NA     
3  2020-08-01 00:00:36          14  POINT (26.13971 44.4763)           NA     
4  2020-08-01 00:00:38          13 POINT (26.13962 44.47629)           NA     
5  2020-08-01 00:00:45           8 POINT (26.13942 44.47623)           NA     
6  2020-08-01 00:00:49           8 POINT (26.13934 44.47628)           NA     
7  2020-08-01 00:01:06           0  POINT (26.1393 44.47633)           NA     
8  2020-08-01 00:01:08           0  POINT (26.1393 44.47633)           NA     
9  2020-08-01 00:04:39           0  POINT (26.1393 44.47633)           NA     
10 2020-08-01 00:04:53           0  POINT (26.1393 44.47633)           NA

intersection of points and lines is notoriously tricky due to floating point math & inaccuracies in decimal places involved. Consider sf::st_nearest_feature() to overcome this https://r-spatial.github.io/sf/reference/st_nearest_feature.html — Jindra Lacko, Oct 13 '22 at 17:21
Would you mind adding a relevant subset of s2map and GPS_1 via dput() to your original question so one has actual data to toy around with? — dimfalk, Oct 14 '22 at 11:09

score 3 · Accepted Answer · answered Oct 14 '22 at 11:31

Here's a tiny example:

library(sf)

Make a little data frame of 3 lines called A, B and C:

set.seed(123)
d = data.frame(x=c(1:4, 5:8, 9:12), y=sample(1:12),
               id = c(rep("A",4),rep("B",4),rep("C",4)))

Convert to an sf spatial data frame using the useful function in the sfheaders package (install if you don't have it yet):

library(sfheaders) # very useful sf_ functions here
spd = sf_linestring(d, x="x",y="y", linestring_id="id")
plot(spd$geometry)

Now let's make some random points in the area. I'm sticking to integers here so we might get points exactly on the lines (which we will, otherwise I would have used a different seed value. If you get different answers set the seed and run this all again from the start):

pts = st_as_sf(data.frame(x=sample(1:12, 20, TRUE), y=sample(1:12, 20, TRUE), id=1:20), coords=1:2)
plot(pts$geometry, add=TRUE)

Now find the nearest line feature to each point feature:

nn = st_nearest_feature(pts, spd)

That's a vector of row indices, so we can grab the line ID and stick it on the point data frame:

pts$did = spd$id[nn]

And we can compute the by-element distance to the nearest feature and add that to the data frame:

pts$D = st_distance(pts, spd[nn,], by_element=TRUE)

Now you can apply some threshold to the distance to select points that are "on" lines - in this case I've got three that are exactly on lines:

> pts$D
 [1] 3.6055513 0.7844645 0.3721042 1.4142136 0.0000000 2.8284271 1.1313708
 [8] 2.0000000 0.8944272 0.7071068 0.1414214 0.0000000 2.2360680 1.2727922
[15] 0.1240347 0.0000000 0.2208631 0.9899495 1.4142136 1.0000000

But to be sensible I should select under a tiny distance relative to my scale:

> pts[pts$D < 1e-5,]
Simple feature collection with 3 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4 ymin: 2 xmax: 9 ymax: 9
CRS:           NA
   id    geometry did D
5   5 POINT (9 9)   C 0
12 12 POINT (7 5)   B 0
16 16 POINT (4 2)   A 0

And you can check those points on the plot - one on each line. Add axis(1);axis(2) for reference.

Note this only uses sf and sfheaders, and should work on "tibbles" although I prefer to work with data frames and data tables for simplicity and speed. There's no other packages needed, also for simplicity and speed.

R - Identify on which street a point is located based on geographic coordinates

Create list of simple feature geometries (linestrings)

Create list of simple feature geometries (linestrings)

Create simple feature geometry list column

Convert to `sp` object if needed

- create a sf object from the `sfc` list of linestrings

1 Answers1