
I am working with the R programming language.

I am trying to calculate the geographic centroids of different polygons within Canada.

I downloaded the following shapefile and tried to calculate and visualize the centroids of each polygon:

library(dplyr)
library(sf)
library(data.table)
library(rvest)
library(leaflet)
library(ggplot2)
library(urltools)
library(leaflet.extras)
library(stringr)
library(magrittr)

# Download zip files

url_1 <- "https://www12.statcan.gc.ca/census-recensement/alternative_alternatif.cfm?l=eng&dispext=zip&teng=lada000b21a_e.zip&k=%20%20%20151162&loc=//www12.statcan.gc.ca/census-recensement/2021/geo/sip-pis/boundary-limites/files-fichiers/lada000b21a_e.zip"

download.file(url_1, destfile = "lada000b21a_e.zip")

# Extract zip files

unzip("lada000b21a_e.zip")

# Read shapefiles

ada <- st_read("lada000b21a_e.shp")

shapefile_1 = ada %>% st_transform(32617)
#sf_cent <- st_centroid(shapefile_1)

sf_cent <- st_point_on_surface(shapefile_1)

# Transform the centroids to the WGS84 CRS

sf_cent_geo <- st_transform(sf_cent, crs = 4326)

# Extract the ADAUID codes and the longitude and latitude coordinates of the centroids

ADAUID <- sf_cent_geo$ADAUID
lon <- st_coordinates(sf_cent_geo)[,1]
lat <- st_coordinates(sf_cent_geo)[,2]

shapefile_1 = ada %>% st_transform(32617)
sf_cent <- st_centroid(ada)

ggplot() +
  geom_sf(data = shapefile_1, fill = 'white') +
  geom_sf(data = sf_cent, color = 'red')

However, when I examine the results, I see what appear to be multiple centroids within each polygon:

enter image description here

I tried to do some research and consult other references on this topic (e.g. How to calculate polygon centroids in R (for non-contiguous shapes), Finding one centroid instead of multiple centroids using R), but so far I am unable to figure out how to resolve this problem.

For logical purposes, I am trying to only have one centroid in each polygon.

How can I fix this?

Vince
stats_noob
  • As a side note, some shapes will have centroids that are outside their shape. To avoid this you could use a tool like QGIS's Point on surface and check the create point on each surface box. – John Apr 20 '23 at 13:57
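The same point-on-surface idea is available in R through sf. The sketch below uses a made-up U-shaped polygon (not part of the StatCan file) to show a centroid landing outside its shape while `st_point_on_surface()` stays inside:

```r
library(sf)

# Hypothetical U-shaped polygon, for illustration only: its centroid
# falls in the notch between the two arms, outside the shape itself.
u_shape <- st_polygon(list(rbind(
  c(0, 0), c(3, 0), c(3, 3), c(2, 3), c(2, 1),
  c(1, 1), c(1, 3), c(0, 3), c(0, 0)
)))
poly <- st_sfc(u_shape)

cent <- st_centroid(poly)          # geometric centroid
pos  <- st_point_on_surface(poly)  # a point guaranteed to lie on the polygon

# Does each point actually touch the polygon?
cent_inside <- st_intersects(cent, poly, sparse = FALSE)[1, 1]
pos_inside  <- st_intersects(pos, poly, sparse = FALSE)[1, 1]
cent_inside
pos_inside
```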

1 Answer


You do have only one centroid per polygon. You just can't see all the polygons on this map.

Take this bit, which looks like six centroids in a polygon.

enter image description here

But zoom in and you'll see it actually has five polygons inside it:

enter image description here

So your six points are five centroids for those holes, and one for the main polygon. These six regions have different ADAUID codes, so they are different entities in some sense, and are different rows in the spatial data.

enter image description here
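One way to check this in R, rather than by eye, is to compare row counts and look for repeated codes. This is only a sketch on toy data: the two polygons and their `ADAUID` values below are made up to stand in for the real layer, with a small polygon sitting in a hole of a larger one.

```r
library(sf)

# Toy stand-in for the ADA layer: a large polygon with a hole, plus a small
# polygon filling that hole. The ADAUID values here are hypothetical.
outer <- rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
hole  <- rbind(c(4, 4), c(4, 5), c(5, 5), c(5, 4), c(4, 4))
toy <- st_sf(
  ADAUID = c("A", "B"),
  geometry = st_sfc(st_polygon(list(outer, hole)), st_polygon(list(hole)))
)

# sf warns that attributes are assumed constant over geometries; harmless here
toy_cent <- st_point_on_surface(toy)

# One point per row of the polygon layer, and no ADAUID appears twice
same_rows <- nrow(toy_cent) == nrow(toy)
no_dups   <- anyDuplicated(toy_cent$ADAUID) == 0
same_rows
no_dups
```

Running the same two checks on `ada` and `sf_cent` from the question should tell you whether any polygon really has more than one point.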

There's a neighbouring part that has an even more complicated-looking structure:

enter image description here

But your stated problem, "Multiple Centroids in a Polygon?" isn't a problem! You've got one point per polygon, but more polygons than you thought you had.
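If you would rather stay in R than open QGIS, you can zoom a ggplot map by limiting its extent with `coord_sf()`. A minimal sketch on made-up data follows; the coordinates and the `id` column are hypothetical, and for the real layer you would pick `xlim`/`ylim` in the CRS of the plotted data.

```r
library(sf)
library(ggplot2)

# Toy data (not the StatCan file): one 100 x 100 polygon with a tiny hole,
# and a tiny 0.1 x 0.1 polygon filling that hole.
outer <- rbind(c(0, 0), c(100, 0), c(100, 100), c(0, 100), c(0, 0))
tiny  <- rbind(c(50, 50), c(50, 50.1), c(50.1, 50.1), c(50.1, 50), c(50, 50))
toy <- st_sf(
  id = c("big", "tiny"),
  geometry = st_sfc(st_polygon(list(outer, tiny)), st_polygon(list(tiny)))
)

# sf warns that attributes are assumed constant over geometries; harmless here
toy_cent <- st_point_on_surface(toy)

# Full extent: the tiny polygon is hidden under its own red dot
p_full <- ggplot() +
  geom_sf(data = toy, fill = "white") +
  geom_sf(data = toy_cent, color = "red")

# Same layers, restricted to a small window around the tiny polygon
p_zoom <- p_full + coord_sf(xlim = c(49.8, 50.3), ylim = c(49.8, 50.3))
p_zoom
```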

If you want to somehow dissolve out these small polygons, then that's a new question. Maybe there's a field you can dissolve on, or maybe you need to keep them all in.
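For reference, a dissolve in sf is usually written as `group_by()` followed by `summarise()`, which unions the geometries in each group. The sketch below uses toy squares and a hypothetical `region` column; whether the real shapefile has a suitable field to group on is something you would need to check.

```r
library(sf)
library(dplyr)

# Helper for the toy data: a unit square with its lower-left corner at (x0, y0)
sq <- function(x0, y0) st_polygon(list(rbind(
  c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
)))

# Three toy polygons; "region" is a made-up field to dissolve on
toy <- st_sf(
  ADAUID = c("A", "B", "C"),
  region = c("r1", "r1", "r2"),
  geometry = st_sfc(sq(0, 0), sq(1, 0), sq(5, 5))
)

# sf's summarise method unions the geometries within each group
dissolved <- toy %>%
  group_by(region) %>%
  summarise(n_parts = n())

# One point per dissolved region instead of one per original polygon
dissolved_cent <- st_point_on_surface(dissolved)
nrow(dissolved_cent)
```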

I used QGIS to look at the shapefile; it's the best way to quickly inspect new spatial data (free, open source).

Spacedman
  • @Spacedman: thank you so much for your answer! So just to clarify: the problem of multiple centroids in a polygon is not even occurring? If I keep zooming in, I will see that there is in fact a single centroid in each polygon? Thank you so much! – stats_noob Apr 21 '23 at 14:12
  • Yes. Every red dot is one polygon, every polygon has one red dot - sometimes the polygons are so small they are hidden either by the red dot itself or by the polygon being smaller than a pixel on your screen. – Spacedman Apr 21 '23 at 15:06