4

Let us have a board, and a set of trajectories on that board. Those trajectories are represented as time-dependent curves, $\gamma_i:[0,T_i] \longrightarrow\mathbb{R}^2$, where the $T_i>0$ are non necessarily equal for the different $i$'s (actually, most of the times they will be different). I've read about some ways of comparing geometrical paths, but they only measure the similarity between their graphs, not taking into account the time. Also, we are not interested only about the final graph, but also about how that path has been walked (for instance, maybe the "walker" has gone back and forth several times over the same segment, which would make it a different path that if the walker had not gone through the same path twice).

I would like to be able to measure the similarity in both space and time. That is, $\gamma_i = \gamma_j$ if and only if $T_i=T_j$ and $\gamma_i(t)=\gamma_j(t)$ $\forall t \in [0,T_i]$, so I can use to find clusters among the set of trajectories. Is there any similarity measure like that?

If it helps, most of our paths are a continuous sequence of line segments.

JavierE
  • 41
  • 2

2 Answers2

2

It seems like a good starting point could be the RMS distance between the paths over time: $$ \left\|\gamma_i - \gamma_j \right\|^2 = \int \left| \gamma_i(t) - \gamma_j(t) \right|^2 \, dt $$ One tricky point would be that the paths end at different times $T_i$, but there are some possible ways around that. For example, just let $T = \max_i T_i$ and let $\gamma_i(t) = \gamma_i(T_i)$ for $t > T_i$.

jwimberley
  • 3,974
  • This seems a reasonable approach. I was trying to come up with a more sophisticated way to do it, but this can be a good started. The only thing I don't like is that the same path in the same time, but walked from end to beginning could give a high distance... but I guess that you cannot have it all. – JavierE Sep 27 '16 at 10:31
  • @user266075 Do you mean if two geometrically identical paths are staggered in time? Another option would be to graph the plots in $(\vec x, t)$ spacetime and find the area of the surface spanned between the two paths (if there is a unique way of defining this surface, which my gut says is true). In the 1D case, this would just be the area between the two curves $x_1(t)$ and $x_2(t)$. The issue here is that distance and time have different units, so you'll have to use some characteristic velocity to turn time into a spatial coordinate, \ie\ what the speed of light does in physics spacetime. – jwimberley Sep 27 '16 at 12:06
2

A first approach could be to discretize the time and space variables and produce (sparse) boolean features : "Between time $[t_i,t_i+\delta]$, was the position in the square $[x_j,x_j+\epsilon],[y_k,y_k+\epsilon]$" ?

For small enough $\delta,\epsilon$, the property "$\gamma_i = \gamma_j$ if and only if $T_i=T_j$ and $\gamma_i(t)=\gamma_j(t)$ $\forall t \in [0,T_i]$" will coincide will all the boolean values being equal.

But there must be much more valuable information in some data science competitions related to your problem :

RUser4512
  • 10,217
  • I realy like this, but I'm not sure about the efficiency of the implementation of these method. Thanks anyway. – JavierE Sep 27 '16 at 10:31