
I belong to a long-distance cycling club, and we have started routinely collecting GPS data from our riders.

I want to compute "the real trajectory" for future events from GPS data accumulated over the same roads. Basically, this means passing some pre-selected tracks to an algorithm, which would then generate points at an appropriate sample rate (that is, at an appropriate distance from one another, depending on how curved the road is). I will discard timestamps and take only the spatial track information into account.

Which algorithms or statistical methods could I use? I don't use any GIS package, and I plan to implement this in Python.

Below are some sample trajectory sets:

[Two images of sample trajectory sets]

PolyGeo
heltonbiker
  • Interesting project - quite similar to an inspection algorithm I wrote years ago. Since I'm lazy, I can only offer a few hints. The most important factors are direction of travel, signal quality and your velocity (i.e. if you're just standing around, it's not a road). Best to first cull the points that are too far off on those criteria. Other than that, I'd apply a smoothing algorithm (try DP) then average the lines. – nagytech Sep 04 '13 at 23:11
  • DP = Dynamic Programming right? Wikipedia gave me a long homework reading on this for tonight... Thanks for now! – heltonbiker Sep 04 '13 at 23:53
  • An interesting, related question is this: http://gis.stackexchange.com/questions/42224/how-to-create-a-polyline-based-heatmap-from-gps-tracks?rq=1 – heltonbiker Sep 04 '13 at 23:55
  • Something really, REALLY worth checking is your GPS settings - some GPS units "snap" your position to the closest road in the GPS database, even if the real road is 10+m to the side. – Simbamangu Sep 06 '13 at 05:08
  • @Simbamangu that would be a very nice thing to have indeed. I believe the software I am using today in an android phone doesn't have that. But anyway, most of my tracks were collected by other people in the past months. Thanks for the tip! – heltonbiker Sep 06 '13 at 12:31
  • There is a similar question here: https://gis.stackexchange.com/questions/123731/create-mean-line-from-multiple-lines-using-qgis/271532#271532. I have two ideas regarding this problem. Maybe they could help. – Stefan Feb 15 '18 at 09:09
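
One reading of "DP" in the comments above is Douglas-Peucker line simplification rather than dynamic programming. That is an assumption, since the commenter does not spell it out. Under that reading, a minimal sketch of the simplification step might look like the following. It assumes each track is a sequence of (x, y) pairs in a projected, planar coordinate system, and the function name and `tolerance` parameter are hypothetical. It also gives the curvature-dependent spacing asked for in the question: straight stretches keep few points, bends keep more.

```python
import numpy as np

def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points farther than `tolerance` from the chord."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return pts
    start, end = pts[0], pts[-1]
    chord = end - start
    chord_len = np.hypot(chord[0], chord[1])
    rel = pts - start
    if chord_len == 0.0:
        # Degenerate chord (closed loop): fall back to distance from the start point.
        dist = np.hypot(rel[:, 0], rel[:, 1])
    else:
        # Perpendicular distance of every point to the line through start and end.
        dist = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / chord_len
    i = int(np.argmax(dist))
    if dist[i] <= tolerance:
        # Everything is close enough to the chord: keep only the endpoints.
        return np.array([start, end])
    # Otherwise split at the farthest point and simplify both halves.
    left = douglas_peucker(pts[: i + 1], tolerance)
    right = douglas_peucker(pts[i:], tolerance)
    return np.vstack([left[:-1], right])
```

The recursion is fine for tracks of moderate length; very long tracks may need an iterative version to stay within Python's recursion limit.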

1 Answer


Chris Brunsdon presented a paper on this issue at the 2007 GeoComputation conference - see http://www.geocomputation.org/2007/1B-Algorithms_and_Architecture1/1B2.pdf

In the paper he discusses how to apply Principal Curve Analysis (Hastie and Stuetzle 1989) and makes some suggestions on how to increase the robustness of the method. Further searching leads to a discussion of an OSM tool called osm-makeroads that may well solve your problem (or at least get you started).
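
To give a feel for the principal-curve idea, here is a heavily simplified sketch, not Brunsdon's method and not the original Hastie and Stuetzle procedure. It alternates the two steps that principal curves are built on: smooth the coordinates as a function of position along the curve, then re-project every point onto the curve. The simplifications are my own assumptions: a Gaussian kernel smoother, projection onto the nearest curve node instead of the nearest point on a segment, and planar (projected) coordinates. The function name and the parameters `n_nodes`, `n_iter` and `bandwidth` are hypothetical.

```python
import numpy as np

def principal_curve(points, n_nodes=50, n_iter=10, bandwidth=0.1):
    """Simplified principal-curve fit for a 2-D point cloud; returns (n_nodes, 2)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Initial guess: position of each point along the first principal component.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    t = centered @ vt[0]
    grid = np.linspace(0.0, 1.0, n_nodes)
    for _ in range(n_iter):
        # Rescale the curve parameter to [0, 1] so `bandwidth` is a fraction of the curve.
        t = (t - t.min()) / (t.max() - t.min())
        # Smoothing step: each node is a Gaussian-weighted mean of the points near it in t.
        weights = np.exp(-0.5 * ((grid[:, None] - t[None, :]) / bandwidth) ** 2)
        curve = (weights @ pts) / weights.sum(axis=1, keepdims=True)
        # Projection step: assign each point the arc length of its nearest curve node.
        seg = np.linalg.norm(np.diff(curve, axis=0), axis=1)
        arc = np.concatenate([[0.0], np.cumsum(seg)])
        dist = np.linalg.norm(pts[:, None, :] - curve[None, :, :], axis=2)
        t = arc[dist.argmin(axis=1)]
    return curve
```

Note that this treats the input as an unordered point cloud, so it would not cope with routes that cross themselves or double back, and the kernel smoother pulls the curve slightly towards the inside of bends.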

Ian Turton
  • Gonna take a look and give some feedback soon! Thanks for now! – heltonbiker Sep 05 '13 at 12:51
  • +1 Nice reference. It needs work though, because it overlooks a fundamental issue with GPS traces: the errors are not independent from one point to the next. Instead, the GPS error made at one point will tend to be very similar to the error made at the next point on the same trace. You can see this in Brunsdon's illustrations: the problematic (outlying) points clearly lie on one or two exceptional traces; they are neither sporadic nor random. Thus there is great potential for improvement by modeling this autocorrelation and adjusting for it in the algorithm. – whuber Sep 05 '13 at 16:47
  • @whuber agreed. Something most algorithms fail to consider (Principal Curves being one of them, as I already found out), is that GPS track sets are not point clouds, but rather "linestring" clouds. They are indeed connected vectors or something like that. Taking only points into consideration generates a bias towards tracks with higher sample rate, instead of the regions where LINES from distinct tracks are more dense... – heltonbiker Sep 06 '13 at 12:33
  • This conversation is continued in a related thread at http://stats.stackexchange.com/questions/69329. – whuber Sep 06 '13 at 17:59
  • @whuber In this answer I have written down an idea that came to mind: doing this with the help of a heat map. I would appreciate any suggestions. – Stefan Feb 15 '18 at 09:17
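
The sample-rate bias raised in the comments can be reduced by treating each track as a line rather than a bag of points: resample every track at equal steps of arc length before combining them, so a densely logged track carries no more weight than a sparse one. The sketch below is one simple way to do that, not a method proposed in this thread. It assumes planar (projected) coordinates, no repeated consecutive points, and tracks that all cover the same stretch of road in the same direction. The function names are hypothetical.

```python
import numpy as np

def resample_by_arc_length(track, n_points):
    """Return n_points evenly spaced along the polyline, by linear interpolation."""
    track = np.asarray(track, dtype=float)
    seg = np.linalg.norm(np.diff(track, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])   # cumulative distance at each vertex
    target = np.linspace(0.0, arc[-1], n_points)    # equally spaced distances
    x = np.interp(target, arc, track[:, 0])
    y = np.interp(target, arc, track[:, 1])
    return np.column_stack([x, y])

def mean_track(tracks, n_points=100):
    """Average several tracks point by point after equal-arc-length resampling."""
    resampled = [resample_by_arc_length(t, n_points) for t in tracks]
    return np.mean(resampled, axis=0)
```

Because every track contributes exactly `n_points` samples, the result no longer depends on each logger's sample rate. It does not address the autocorrelated GPS error mentioned above: one badly offset track still shifts the mean along its whole length, so a median or a trimmed mean across tracks may be worth trying instead.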