
I am working with valid lat/lon point data, creating buffers around the points, and trying to calculate the overlapping areas of those buffers. In doing so I run into the not-so-rare non-noded intersection error when using st_intersection with many polygons.

I have reviewed many of the related questions about this error, both here and on the GitHub page for the sf R package. A number of solutions are mentioned: using valid geometry, making invalid geometry valid, applying zero-width buffers, and setting the precision. None of them has helped so far.

In some cases people run into this because their geometry has errors or is invalid for some reason, but that should not be the case here: each object is simply a buffer created around a point coordinate.

A recent post discussed a similar issue and resolved it by using rasters to calculate the overlapping areas. I would prefer to avoid rasters and stick to vectors, because the raster approach is slow and I need to scale this process up, if I can fix the underlying error.
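For reference, a minimal sketch of what that raster-based fallback might look like. The fasterize package, the 50 m resolution, and the use of the sample_buffs object built further below are my assumptions, not the linked post's code.

library(raster)
library(fasterize)

# Count how many buffers cover each 50 m cell, then tabulate areas per overlap level
template <- raster(sample_buffs, res = 50)
overlap_counts <- fasterize(sample_buffs, template, fun = "sum")

# Approximate area (m^2) covered by each overlap level
table(values(overlap_counts)) * prod(res(template))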

Expected process - this works just fine:

library(sf)
library(dplyr)

Sample data

data(meuse, package = "sp")
meuse <- st_as_sf(meuse[1:50, ], coords = c('x','y'))

Create buffers

meuse_buffs <- st_buffer(meuse, 100)

Create intersections

meuse_int <- meuse_buffs %>% st_intersection()

Visualize

plot(meuse_int[1])

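As an aside, the self-intersection result carries n.overlaps and origins columns, so in this working case the overlapping areas can be summarised directly. A minimal sketch (area values are plain coordinate units here, since no CRS was set on the meuse points):

# Each piece records how many buffers cover it (n.overlaps) and which ones (origins)
meuse_int %>% 
  mutate(area = st_area(geometry)) %>% 
  group_by(n.overlaps) %>% 
  summarise(total_area = sum(area))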

Reproducible data and error

The data is a collection of lon/lat points centred on Singapore. I have made a copy of just the coordinates and stored them in a GitHub gist for reproducibility.

sample_data_loc <- "https://gist.githubusercontent.com/dshkol/d19e56fb52165fa666c78f233e73b6cf/raw/307f27139a6b44592dcfd1cad36eaa76b8473002/bt_sample.csv"
sample_data <- readr::read_csv(sample_data_loc)

Errors and issues resolving them

sample_dots <- st_as_sf(sample_data, 
                    coords = c("longitude","latitude"), 
                    crs = 4326) 

I reproject into a local projection in metres, SVY21 (EPSG:3414), and create 3 km buffers around the points.

sample_dots <- st_transform(sample_dots, crs = 3414) 
sample_buffs <- st_buffer(sample_dots, 3000)


So far so good. The error appears when we try the self-intersection to calculate the overlapping areas:

sample_buffs %>% st_intersection()

Error in CPL_nary_intersection(x) : Evaluation error: TopologyException: found non-noded intersection between LINESTRING (27870.7 32791.6, 27873.3 32791.2) and LINESTRING (27874.1 32791, 27872.6 32791.3) at 27872.716126055431 32791.258798398761.

One mentioned solution is to apply a zero-distance buffer to resolve potential geometry issues. This doesn't help here and leads to the same error.

sample_buffs %>% st_buffer(0) %>% st_intersection() # Fails

Another cited solution is to check that the geometry is valid and to adjust the precision attribute of the object. This generally does not work until the precision is so coarse that the original objects are unrecognizable (sf's precision value is a scale factor, so 1e-3 corresponds to snapping coordinates to a 1000 m grid).

sample_buffs %>% st_set_precision(1e7) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e6) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e5) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e4) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e3) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e2) %>% st_make_valid() %>% st_intersection() # Fail 
sample_buffs %>% st_set_precision(1e1) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e0) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e-1) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e-2) %>% st_make_valid() %>% st_intersection() # Fail
sample_buffs %>% st_set_precision(1e-3) %>% st_make_valid() %>% st_intersection() # Works

plot((sample_buffs %>% st_set_precision(1e-3) %>% st_make_valid() %>% st_intersection())[1])


Alternatively, the dput structure for the sample data is below:

sample_data <- structure(list(latitude = c(1.35058, 1.352333, 1.334565, 1.339028, 
1.392002, 1.264667, 1.300723, 1.369104, 1.299604, 1.350686, 1.315011, 
1.283737, 1.334332, 1.286145, 1.332621, 1.448384, 1.293461, 1.324592, 
1.285209, 1.300675, 1.330994, 1.436093, 1.392011, 1.30147, 1.334502, 
1.317711, 1.311398, 1.42959, 1.342806, 1.301016, 1.317368, 1.29133, 
1.294149, 1.40669, 1.304629, 1.32036, 1.279761, 1.303778, 1.327052
), longitude = c(103.872879, 103.944692, 103.962672, 103.705937, 
103.904984, 103.821703, 103.838455, 103.848957, 103.855718, 103.84853, 
103.764356, 103.859192, 103.889569, 103.827342, 103.848021, 103.819187, 
103.832064, 103.929263, 103.844723, 103.838508, 103.795183, 103.785947, 
103.895006, 103.905155, 103.742698, 103.843491, 103.85659, 103.835769, 
103.952974, 103.845411, 103.892638, 103.850074, 103.852807, 103.902174, 
103.832566, 103.843845, 103.853162, 103.835536, 103.846484), 
    id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
    16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 
    31, 32, 33, 34, 35, 36, 37, 38, 39)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -39L), spec = structure(list(
    cols = list(latitude = structure(list(), class = c("collector_double", 
    "collector")), longitude = structure(list(), class = c("collector_double", 
    "collector")), id = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))
dshkol
  • Are you able to post the WKT of the transformed buffer circles? I would like to see if I can reproduce this error in JTS/GEOS. – dr_jts Oct 09 '20 at 00:59
  • It was too large to paste, so I stored the WKT here https://gist.github.com/dshkol/1297611ce388e5334e00dba8ef4169fc. The projection WKT for what I am using is here https://gist.github.com/dshkol/d84f134e39d17418797c6df6921ecb1a – dshkol Oct 09 '20 at 01:50
  • Thanks. Any chance of getting the projected polygons? – dr_jts Oct 09 '20 at 03:30
  • Apologies, but not sure I understand. The WKT I shared was the transformed buffer circle polygons, in the transformed projection. – dshkol Oct 09 '20 at 03:44
  • Ok, I see that now. I happened to be looking at a coordinate that looked like lat/long. Anyway, I can't reproduce the failure - likely because of the details of the sf algorithm. Hopefully the new overlay code will fix it. – dr_jts Oct 09 '20 at 04:13
  • Thanks @dr_jts for taking a look nonetheless. I think in the meantime I will explore ways to manipulate the geometry to find a balance between precision and avoiding some of the pitfalls you mentioned in your answer below. – dshkol Oct 09 '20 at 07:47

2 Answers


Thanks to the extensive help of @mdsumner, this solution works by using the polymer R package to break polygons down into a mesh of triangles, calculating the overlapping triangle segments, and then reassembling the triangles into polygons.

This approach is slower, but it seems to be generally robust to these non-noded intersection errors, which here stem from the tiny slivers produced by slightly overlapping circles. The solution requires the polymer library and its dependencies, as well as the sfheaders library; both are largely in developmental stages as of writing.
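Both can be installed roughly as below; the hypertidy GitHub location for polymer is my assumption about where it lives, while sfheaders is on CRAN.

# remotes::install_github("hypertidy/polymer")  # assumed source repository
install.packages("sfheaders")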

library(polymer)

mesh <- polymer(sample_buffs)

> mesh
polymer mesh: Layers: 1 Polygons: 39 Triangles: 9350 (Overlaps: 7404)

What does this polymer object look like? A mesh of individual triangles from the triangle decomposition of each polygon, worthy of submission to @accidental__aRt.

We then count, for each triangle, how many input polygons it falls inside; the largest count is the maximum number of overlaps (a quick check follows the snippet below).

ugroups <- mesh$index %>% 
  group_by(triangle_idx) %>% 
  tally() 
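The n column counts how many input polygons each triangle lies inside, so the deepest overlap in the data is simply:

# Largest number of buffers covering any single triangle
max(ugroups$n)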

The next part is a custom function @mdsumner put together that assembles the triangle meshes back into regions overlapping specific combinations of input polygons.

tri_to_sf <- function(xx, idx = NULL) {
  tris <- xx$T
  if (!is.null(idx)) {
    tris <- tris[idx, , drop = FALSE]
  }
  dat <- tibble::tibble(x = xx$P[c(t(tris)), 1], 
                        y = xx$P[c(t(tris)), 2], 
                        linestring_id = rep(seq_len(nrow(tris)), each = ncol(tris)))

  sfheaders::sf_polygon(dat, x = "x", y = "y", 
                        linestring_id = "linestring_id", 
                        polygon_id = "linestring_id") 
}

This function is then applied to each group of triangles, aggregating them into regions by number of overlaps and binding everything into a single sf object with one polygon per overlap level, from 1 up to n, where n is the maximum number of overlaps.

res <- do.call(rbind, purrr::map(split(ugroups, ugroups$n), 
                                   ~{
                                     out <- tri_to_sf(mesh$primitives,  .x$triangle_idx)
                                     out$overlaps <- .x$n[1L]
                                     dplyr::summarize(dplyr::group_by(out, overlaps))
                                     }
                                   )
                 )

This is in effect the same output as we would have gotten from sample_buffs %>% st_intersection() but without the error.

> res
Simple feature collection with 12 features and 1 field
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 9194.851 ymin: 24403.28 xmax: 45087.08 ymax: 50771.18
CRS:            NA
# A tibble: 12 x 2
   overlaps                                                                              geometry
 *    <int>                                                                        <MULTIPOLYGON>
 1        1 (((28005.05 24662.64, 27859.95 24602.54, 27711.89 24550.11, 27561.3 24505.5, 27408.5…
 2        2 (((29211.89 25639.92, 29180.28 25598.72, 29133.11 25597.49, 28976.1 25601.6, 28819.5…
 3        3 (((29719.29 26779.54, 29717.2 26770.87, 29580.24 26752.84, 29423.67 26740.52, 29266.…
 4        4 (((28257.23 26982.75, 28152.66 26951.78, 28046.45 26995.77, 27904.69 27063.39, 27766…
 5        5 (((29451.5 27714.6, 29399.95 27665.68, 29318.08 27704.73, 29180.06 27779.68, 29046.1…
 6        6 (((29559.62 27828.53, 29523.98 27790.98, 29391.66 27822.75, 29241.06 27867.36, 29093…
 7        7 (((29084.59 27968.83, 29029.61 27946.05, 28947.9 27979.89, 28806.14 28047.51, 28763.…
 8        8 (((29487.26 28455.7, 29596.67 28447.09, 29608.45 28413.81, 29474.77 28453.41, 29457.…
 9        9 (((29585.58 28478.38, 29596.67 28447.09, 29487.26 28455.7, 29457.08 28459.67, 29326.…
10       10 (((28607.45 28928.49, 28582.47 28922.49, 28513.86 28975.14, 28394.43 29077.14, 28280…
11       11 (((27843.08 29820.91, 27834.57 29756.28, 27803.74 29806.58, 27728.8 29944.61, 27705.…
12       12 (((30124.68 30712.28, 30016.7 30750.51, 29871.59 30810.62, 29729.83 30878.24, 29591.…
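Since the original goal was the overlapping areas, they can be read straight off this object; a small follow-up sketch (values are in square metres from EPSG:3414, even though the CRS attribute was dropped along the way):

# Area covered by each overlap level
res %>% mutate(area_m2 = as.numeric(st_area(geometry)))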


dshkol

These kinds of problems are due to the fact that the current GEOS overlay (intersection in this case) algorithm is not totally robust. It can fail on some data, typically data containing nearly-coincident linework. It is probable that the buffers do contain these situations, if the original points are close together.

Happily, a much-improved overlay algorithm is soon to be released in JTS 1.18 and GEOS 3.9. See my blog posts here and here, and this GEOS thread. It should make its way into R, and then hopefully these problems will no longer occur.
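As a quick way to check which GEOS version a given sf build is linked against:

# Report the GEOS (and GDAL/PROJ) versions sf was compiled against
sf::sf_extSoftVersion()["GEOS"]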

dr_jts
  • Using sf compiled against 3.9.0dev (github, yesterday) still leads to the same errors. Only setting precision to 1e-2 (rounding to 100m) succeeds now. – Edzer Pebesma Oct 15 '20 at 12:13
  • @EdzerPebesma might be best to open an sf issue for this then. I was able to compute intersections for the provided data successfully in JTS (which should be the same as GEOS). But I think R is using a more complex algorithm to decide which intersections to compute. – dr_jts Oct 15 '20 at 21:02