
I have some code that calculates deviation from the mean for a series of univariate data. It calculates the mean and standard deviation ($\mu$ and $\sigma$) over a window of data points and compares the latest datapoint against them.

Pretty simple. My question is: how do I extend this to higher dimensions? Basically, if my datapoints are now vectors $(x_1, x_2, x_3)$, say, how can I find how far a datapoint is from the mean?
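
For reference, the univariate version amounts to roughly this (a simplified sketch, not my exact code; the function name is a placeholder):

```python
import numpy as np

def zscore_of_latest(window):
    """How many standard deviations the newest point is from the window mean."""
    window = np.asarray(window, dtype=float)
    mu = window.mean()
    sigma = window.std(ddof=1)   # sample standard deviation of the window
    return abs(window[-1] - mu) / sigma
```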

  • If I understand your problem correctly, you need to calculate the covariance matrix for your vectors. You can then use the Mahalanobis distance; see https://en.wikipedia.org/wiki/Mahalanobis_distance. – GCru Oct 28 '22 at 12:59
  • Ok, so supposing I used scipy for this, https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.mahalanobis.html, then it looks like the scipy implementation compares two vectors, not a vector against a distribution. – pnadeau Oct 28 '22 at 13:58
  • You can select one of the vectors to be your mean $(\bar{x}_1, \bar{x}_2, \bar{x}_3)$. The covariance matrix is calculated from the three vectors $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$ containing your three data series. – GCru Oct 28 '22 at 14:37
  • Do I have to find a new basis for the point cloud as illustrated here: https://stats.stackexchange.com/a/62147/350353 or is that what the covariance matrix is doing here? – pnadeau Oct 28 '22 at 17:38
  • You use the covariance matrix as calculated from your data. – GCru Oct 28 '22 at 18:28
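
Putting the comments together, here is a minimal sketch of the multivariate version, assuming the window is stored as an (n, d) NumPy array and that `scipy.spatial.distance.mahalanobis` is used as suggested above (the helper name and the random test data are placeholders, not anyone's actual code):

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

def mahalanobis_from_window(window, point):
    """Distance of `point` from the distribution of the rows in `window`.

    `window` is an (n, d) array of the last n d-dimensional datapoints;
    `point` is a length-d vector, e.g. (x1, x2, x3).
    """
    mu = window.mean(axis=0)              # per-dimension mean vector
    cov = np.cov(window, rowvar=False)    # d x d covariance matrix of the window
    cov_inv = np.linalg.inv(cov)          # scipy expects the *inverse* covariance
    return mahalanobis(point, mu, cov_inv)

# Hypothetical usage: 100 three-dimensional points, test the latest observation
rng = np.random.default_rng(0)
window = rng.normal(size=(100, 3))
latest = np.array([1.5, -0.2, 0.7])
print(mahalanobis_from_window(window, latest))
```

If the window is short or the dimensions are highly correlated, the covariance matrix can be near-singular; swapping `np.linalg.inv` for `np.linalg.pinv` is one common workaround.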
