
Background

Circular correlation is given by:

$$R_{\operatorname{circular}} \triangleq \frac{\sum_{i=1}^m \sin (x_i - \bar x) \sin (y_i - \bar y)}{\sqrt{\sum_{i=1}^m \sin^2 (x_i - \bar x)} \sqrt{\sum_{i=1}^m \sin^2 (y_i - \bar y)}}$$
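
A minimal Python sketch of this quantity may help when experimenting with it. The formula does not say how $\bar x$ and $\bar y$ are computed, so the sketch assumes they are circular mean directions (the angle of the summed unit vectors), and it assumes angles in radians. The names `circular_mean` and `circular_correlation` are hypothetical, and the data are made up for illustration.

```python
import math
import random

def circular_mean(angles):
    """Mean direction: angle of the sum of unit vectors (an assumed definition)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

def circular_correlation(x, y):
    """Sine-based correlation of two equal-length lists of angles in radians."""
    x_bar, y_bar = circular_mean(x), circular_mean(y)
    sx = [math.sin(a - x_bar) for a in x]
    sy = [math.sin(b - y_bar) for b in y]
    num = sum(p * q for p, q in zip(sx, sy))
    den = math.sqrt(sum(p * p for p in sx) * sum(q * q for q in sy))
    return num / den

random.seed(0)
# Illustrative data: y is x rotated by a constant angle.
x = [random.uniform(-0.5, 0.5) for _ in range(200)]
y = [a + 0.3 for a in x]

# Shift every coordinate by its own random whole number of turns.
x_wrapped = [a + 2 * math.pi * random.randint(-3, 3) for a in x]
y_wrapped = [b + 2 * math.pi * random.randint(-3, 3) for b in y]

r_line = circular_correlation(x, y)
r_wrapped = circular_correlation(x_wrapped, y_wrapped)
print(r_line, r_wrapped)
```

Both values equal 1 up to floating-point error. Under the assumed mean direction, shifting any angle by a whole number of turns changes neither the sines nor the means, so points scattered across copies of the same pattern spaced $2\pi$ apart in each coordinate score the same as the unshifted points.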

Example 1

I generated a data set with a circular correlation of 1.0000 and it looked like this:

enter image description here

Example 2

I similarly generated a data set with near-perfect circular correlation, but the parameter initialization allowed a large dispersion in where the points were placed:

enter image description here

Example 3

Dialing up the amount of dispersion and the number of points, we get something that to my eye doesn't have any grid-like structure.

enter image description here

Zooming in doesn't seem to reveal any fine grid structure either:

enter image description here

And zoom in further:

enter image description here

And further:

enter image description here

Question

Is circular correlation maximized by points lying on a square grid in the plane?

Galen
  • Used similar methods of data generation here: https://stats.stackexchange.com/a/589221/69508 – Galen Sep 19 '22 at 03:51
  • Is the unit radians? In that case, probably all of these are a line on the bivariate plot after performing $\theta \bmod 2\pi$. – Kees Mulder Sep 19 '22 at 10:45
  • Hint: when $a \approx 0,$ $\sin(a) \approx a.$ Consider, then, a dataset where the $x_i$ and $y_i$ don't vary much. To a good approximation, then, you can drop all appearances of "sin" from the formula, giving the usual correlation for $(x_i,y_i).$ What kind of configuration gives rise to large correlations? – whuber Sep 19 '22 at 13:03
  • I think I see what the real problem is: your formula is incorrect. In all your examples, the definition of the mean angle is problematic, because it requires arbitrary choices of the phase for its definition. Instead, there's a version inspired by the characterization of covariance I give at https://stats.stackexchange.com/a/18200/919 as a mean over all possible pairs of data of a product of the sines of their differences. See formula (2.2) in Fisher, N. I., and A. J. Lee. 1983. “A Correlation Coefficient for Circular Data.” Biometrika 70 (2): 327–32. https://doi.org/10.2307/2335547 . – whuber Sep 20 '22 at 11:52
