
I have a bunch of points (lat+lon coordinates). Each one has a weight. I would like to get a weighted mean centroid of the points, taking into account the spheroid (as my points are scattered across the United States and I want to roughly adjust for the curvature of the earth).

Say I have these data in R:

pts <- data.frame(x=c(-100.5, -98.6, -98), y=c(35, 41, 44), weight=c(20, 15, 100))

How can I take these points and find what I am looking for?

I do not understand sp, sf, etc. very well. But based on this post ("Finding centroid of cluster of points using R", in particular the answer by @Ian Turton and @whuber's comment), it appears that we need to take these steps (which I am not really sure how to carry out):

  1. Convert the points to sp or something similar
  2. Go from lat+lon to a 3D representation (geocentric coordinates). (Possibly using @whuber's answer to the question "Euclidean and Geodesic Buffering in R"?)
  3. Take the weighted average (mean) of the points. This produces the weighted centroid.
  4. Convert the weighted centroid back into latitude and longitude.
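Steps 2 to 4 can be sketched in base R, without sp or sf (so step 1 is skipped). This is only a minimal sketch: it assumes a unit sphere rather than the WGS 84 spheroid, which is a simplification, and `weighted_centroid_sphere` is a hypothetical helper name, not a function from any package.

```r
# Example data from the question (x = longitude, y = latitude, in degrees)
pts <- data.frame(x = c(-100.5, -98.6, -98),
                  y = c(35, 41, 44),
                  weight = c(20, 15, 100))

# Hypothetical helper: weighted centroid on a unit sphere
# (assumption: spherical earth, not the WGS 84 spheroid)
weighted_centroid_sphere <- function(lon, lat, w) {
  lon_r <- lon * pi / 180
  lat_r <- lat * pi / 180
  # Step 2: lat/lon -> 3D Cartesian coordinates on the unit sphere
  X <- cos(lat_r) * cos(lon_r)
  Y <- cos(lat_r) * sin(lon_r)
  Z <- sin(lat_r)
  # Step 3: weighted mean of the 3D coordinates
  xm <- weighted.mean(X, w)
  ym <- weighted.mean(Y, w)
  zm <- weighted.mean(Z, w)
  # Step 4: project the mean vector back to the surface as lon/lat
  c(lon = atan2(ym, xm) * 180 / pi,
    lat = atan2(zm, sqrt(xm^2 + ym^2)) * 180 / pi)
}

centroid <- weighted_centroid_sphere(pts$x, pts$y, pts$weight)
print(centroid)
```

The averaged 3D point lies below the surface, as @whuber's comment notes; the two `atan2` calls only use its direction, so no explicit rescaling is needed.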
bill999
  • What variable does weight represent? Population? – Elio Diaz Aug 19 '22 at 00:00
  • @ElioDiaz, each dot is a university and the weights are number of graduates. – bill999 Aug 19 '22 at 04:36
  • https://en.wikipedia.org/wiki/Center_of_population describes 3 methods. I use the 3D coordinates and then project (which might be one of the methods mentioned there) – Barry Carter Aug 19 '22 at 16:52
  • Convert to unit sphere vector space; it makes things a lot easier. Alternatively, bin to a spherical grid system and apply calculations on aggregated values over cell indices. That is somewhat biased, but libraries like H3 make it rather convenient. – geozelot Aug 24 '22 at 19:58
  • Wouldn't you also respect the curvature of the earth when determining a weighted centroid by simply averaging your lat/lon values based on the WGS 84 ellipsoid (EPSG: 4326)? Not really sure why you should reproject to EPSG: 4328 in the first place here. Just to be 100% consistent with distances from st_distance() and nngeo::st_nn()? – dimfalk Aug 24 '22 at 20:19
  • @falk-env, you very well may be right (and this is good to know!). I know how to do a weighted average of lat/lons. But how would I do this based on the WGS 84 ellipsoid? – bill999 Aug 24 '22 at 21:44
  • Wouldn't you just average your x- and y-coordinates which are given in EPSG: 4326? See also this accepted response. The approach might not be applicable globally but should work well in the US. – dimfalk Aug 25 '22 at 11:42
  • I'm pretty sure any method using only two coordinates is not going to account for the Earth's near-spherical shape. However, I can't back that up. – Barry Carter Aug 25 '22 at 13:30
  • @falk-env, I guess I had seen @whuber's comment on the accepted answer you link to, which says: "+1 Great solution. It extends to centroids on the spheroid, too (which is essential for avoiding projection-related distortions when the points are spread over a large portion of the globe): first convert (lat, lon) to 3D (x,y,z) (geocentric) coordinates, average them, then convert the result back to (lat, lon) (ignoring the almost inevitable fact that the 3D average will be deep below the surface)." But maybe doing this won't make a difference in practice. – bill999 Aug 25 '22 at 13:34
  • My answer to https://opendata.stackexchange.com/questions/15731/how-to-find-the-country-with-the-northernmost-population may or may not be helpful – Barry Carter Aug 25 '22 at 13:36
  • @BarryCarter: Neither can I, but since lat/lon values are angles, would it be a mistake to average 30° N and 50° N, resulting in 40° N? Sorry for my ignorance, I can't wrap my head around this somehow. – dimfalk Aug 25 '22 at 16:24
  • No, that would be correct because each degree of latitude on a sphere is the same length. For a given latitude, each degree of longitude is also the same length. The problem comes when both are different (I think) – Barry Carter Aug 25 '22 at 16:25
  • @bill999: Yeah, I saw this too, and my interpretation was: you might extend this if you want to and also need to. I would not expect large distortions when working in the US only, but that's just my very subjective statement. This approach would be limited when averaging coordinates such as -179 and +179, but this should not be the case here. – dimfalk Aug 25 '22 at 16:29
  • The problem I see is the scarcity of the data: only 3 points, and the longitude varies by only 2.5 degrees while the latitude varies by almost 20! This ends up looking like a long thin vertical triangle which, according to Delaunay, will yield poor interpolation along the longitude axis. Maybe what you gave is only a small example and you have a larger, continent-wide set? – JasonInVegas Aug 27 '22 at 21:46
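
The -179/+179 limitation raised in the comments can be illustrated with a small base R sketch. The two points and equal weights below are made-up illustrative values, and the 3D step again assumes a unit sphere rather than the WGS 84 spheroid. Averaging the angles directly puts the result at 0° longitude, on the opposite side of the globe, whereas averaging unit vectors puts it on the antimeridian (±180°).

```r
# Two points straddling the antimeridian, equal weights (illustrative values)
lon <- c(-179, 179)
lat <- c(0, 0)
w   <- c(1, 1)

# Naive approach: weighted mean of the longitude angles themselves
naive_lon <- weighted.mean(lon, w)

# 3D approach: average unit vectors, then convert back to a longitude
# (assumption: unit sphere)
lon_r <- lon * pi / 180
lat_r <- lat * pi / 180
xm <- weighted.mean(cos(lat_r) * cos(lon_r), w)
ym <- weighted.mean(cos(lat_r) * sin(lon_r), w)
vector_lon <- atan2(ym, xm) * 180 / pi

print(c(naive = naive_lon, vector = vector_lon))
```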

0 Answers