While searching the web, solutions for finding centroids of polygons come up rather often. What I'm interested in is finding a centroid of a cluster of points. A weighted mean of sorts.

Can you provide some pointers, pseudocode (or even better, an R package that has already solved this), or links on how this issue can be tackled?


@iant has suggested a method to average coordinates and use that for the centroid. This is exactly what crossed my mind when I saw the right picture on this web page.

Here is some simple R code to draw the following figure that demonstrates this (× is the centroid):

xcor <- rchisq(10, 3, 2)
ycor <- runif(10, min = 1, max = 100)
mx <- mean(xcor)
my <- mean(ycor)

plot(xcor, ycor, pch = 1)
points(mx, my, pch = 4)  # pch = 4 draws the × symbol

[Figure: scatter plot of the sample points with the centroid marked.]


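cluster::pam()$medoids returns the medoid of each cluster; with k = 1 it returns a single medoid for the whole set of points. Unlike the mean, a medoid is always one of the observed points. This is an example from @Joris Meys: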

library(cluster)
df <- data.frame(X = rnorm(100, 0), Y = rpois(100, 2))
plot(df$X, df$Y)
points(pam(df, 1)$medoids, pch = 16, col = "red")
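
To make the difference between a medoid and a mean centroid concrete, here is a minimal base-R sketch for a single cluster. It assumes Euclidean distance and uses made-up points; the variable names are illustrative only, and it is not meant to reproduce everything pam() does for several clusters.

```r
# Made-up points; the last one lies far from the rest.
pts <- cbind(x = c(0, 1, 2, 10), y = c(0, 1, 0, 10))

# Pairwise Euclidean distances between all points.
d <- as.matrix(dist(pts))

# Medoid: the observed point with the smallest total distance to the others.
medoid <- pts[which.min(rowSums(d)), ]

# Centroid: the coordinate-wise mean, which need not be an observed point.
centroid <- colMeans(pts)
```

The far-away point pulls the mean centroid toward itself, while the medoid stays on one of the observed points.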
PolyGeo
Roman Luštrik

3 Answers

Just average the X and Y coordinates (as a weighted average, if you want weights) and there is your centroid.
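
As a rough sketch of that idea in R, assuming the coordinates and weights are plain numeric vectors (the sample values and variable names below are made up for illustration), base R's mean() and weighted.mean() are enough:

```r
# Hypothetical sample points and weights.
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)
w <- c(1, 1, 1, 3)

# Unweighted centroid: the plain mean of each coordinate.
centroid <- c(x = mean(x), y = mean(y))

# Weighted centroid: weighted.mean() divides by the sum of the weights.
weighted_centroid <- c(x = weighted.mean(x, w), y = weighted.mean(y, w))
```

The heavier weight on the last point moves the weighted centroid toward it.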

Ian Turton

To calculate the centroid of a cluster of points in R, you can use simple statistical methods to find the mean of the x and y coordinates. This approach is effective for an unweighted centroid. If you're looking for a weighted centroid, you'll need to factor in the weights of each point in your calculations.

Here's a basic method to find the unweighted centroid:

# Sample data
xcor <- c(1, 2, 3, 4, 5)
ycor <- c(5, 4, 3, 2, 1)

# Calculating the centroid
centroid_x <- mean(xcor)
centroid_y <- mean(ycor)

print(paste("Centroid:", centroid_x, centroid_y))

For a weighted centroid, you would modify the calculation to account for the weights:

# Sample data with weights
xcor <- c(1, 2, 3, 4, 5)
ycor <- c(5, 4, 3, 2, 1)
weights <- c(1, 2, 1, 2, 1)  # Example weights

# Weighted centroid calculation
weighted_centroid_x <- sum(xcor * weights) / sum(weights)
weighted_centroid_y <- sum(ycor * weights) / sum(weights)

print(paste("Weighted Centroid:", weighted_centroid_x, weighted_centroid_y))

Additional R Package: For geographic (longitude/latitude) data, you might consider the geosphere package. Note that its centroid function treats the input as the vertices of a polygon and returns that polygon's centroid, which is generally not the mean of the points. For a cluster of points, the package's geomean function computes a mean location on the sphere.

library(geosphere)
# Assuming 'pts' is a two-column matrix of longitude/latitude polygon vertices
poly_centroid <- centroid(pts)

Online Tool for Quick Calculations: As an additional resource, you can use the online Centroid Calculator. This tool provides a simple interface for calculating the centroid of a set of points, which can be handy for quick checks or when working outside of R.

M.K.Dan

This is excellent. I'd suggest removing outliers before doing this. For simple outlier removal, one could keep only the longitudes between the 25th and 75th percentiles, do the same for the latitudes, and calculate the mean on those values. For less drastic outlier removal, drop values that lie more than 1.5 times the interquartile range below the 25th percentile or above the 75th percentile (a fairly standard outlier definition).
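
A minimal sketch of the 1.5 × IQR variant in base R might look like this. The helper name within_fences and the sample coordinates are made up for illustration, and quantile() is used with its default method.

```r
# Hypothetical helper: TRUE where a value lies within the k * IQR fences.
within_fences <- function(v, k = 1.5) {
  q <- quantile(v, probs = c(0.25, 0.75), names = FALSE)
  spread <- q[2] - q[1]  # interquartile range
  v >= q[1] - k * spread & v <= q[2] + k * spread
}

# Made-up sample coordinates; the last longitude is an obvious outlier.
lon <- c(10.1, 10.2, 10.3, 10.4, 10.5, 50.0)
lat <- c(45.1, 45.2, 45.3, 45.4, 45.5, 45.3)

# Keep only points that pass the test on both coordinates.
keep <- within_fences(lon) & within_fences(lat)
trimmed_centroid <- c(lon = mean(lon[keep]), lat = mean(lat[keep]))
```

Filtering each coordinate separately is the simplest reading of the suggestion. It ignores any correlation between longitude and latitude, so treat it as a rough first pass.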