Methods For Measuring Non-Linear Correlation?

Question

I have been learning about standard methods in Statistics such as Pearson's Correlation Coefficient, Spearman's Correlation and Kendall's Tau.

My understanding of this so far is that:

Pearson's Correlation Coefficient measures the linear correlation between two sets of data
Spearman's Correlation measures the "monotonicity" between two sets of data (e.g. do they both increase and decrease at the same time?)
Kendall's Tau measures the ordinal association between two sets of data - supposedly Kendall's Tau is similar to Spearman's Correlation, but Kendall's Tau has more logical confidence intervals.
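As a rough illustration of how the three coefficients in this list differ, here is a minimal NumPy-only sketch on made-up data (y = exp(x), chosen only because it is monotonic but clearly not linear). The helper names are hypothetical, and the rank and pair-counting formulas assume there are no ties in the data.

```python
import numpy as np

def pearson(x, y):
    # Pearson's r: strength of the *linear* relationship
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    # Spearman's rho is Pearson's r computed on ranks (assumes no ties)
    def rank(v):
        return np.argsort(np.argsort(v))
    return pearson(rank(x), rank(y))

def kendall_tau(x, y):
    # Kendall's tau-a: (concordant - discordant pairs) / total pairs (assumes no ties)
    n = len(x)
    iu = np.triu_indices(n, k=1)
    sx = np.sign(x[:, None] - x[None, :])[iu]
    sy = np.sign(y[:, None] - y[None, :])[iu]
    return float(np.sum(sx * sy) / (n * (n - 1) / 2))

# Illustrative data only: strictly increasing, strongly curved
x = np.linspace(0.0, 5.0, 50)
y = np.exp(x)

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
r_kendall = kendall_tau(x, y)
print(r_pearson, r_spearman, r_kendall)
```

On data like this, the two rank-based measures come out at 1 (up to rounding) because the relationship is perfectly monotonic, while Pearson's r stays below 1 because the relationship is not a straight line.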

I had the following question - can any of these methods be used for measuring a specific form of "Non Linear Correlation" between two sets of data?

For example - suppose I want to see how strongly two sets of data are correlated relative to a "second order curve" :

Is there something that could measure the "curved correlation"?

The two ideas I came up with:

Try to use some data transformations (e.g. Log) to transform one of the variables into a more linear pattern that will make it suitable for one of the above measures
Fit a polynomial regression model (of order 2) to this data and measure the MSE

But I am not sure if either of these approaches is suitable.
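For what it is worth, the second idea can be sketched in a few lines. This is only a sketch on simulated data (the coefficients, noise level and the helper name `fit_stats` are all assumptions for illustration). It also reports R² next to the MSE, since MSE depends on the units of y while R² is unitless, and for a straight-line fit R² equals the squared Pearson correlation.

```python
import numpy as np

# Simulated data for illustration only: a curved, roughly monotonic trend plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 200)
y = 0.5 * x**2 + x + rng.normal(scale=1.0, size=x.size)

def fit_stats(x, y, degree):
    """Least-squares polynomial fit; returns in-sample (MSE, R^2)."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    mse = float(np.mean(resid**2))
    r2 = 1.0 - float(np.sum(resid**2) / np.sum((y - y.mean())**2))
    return mse, r2

mse_lin, r2_lin = fit_stats(x, y, 1)    # straight line
mse_quad, r2_quad = fit_stats(x, y, 2)  # second-order polynomial
r_pearson = float(np.corrcoef(x, y)[0, 1])

print("linear:    MSE", mse_lin, "R^2", r2_lin, "Pearson r^2", r_pearson**2)
print("quadratic: MSE", mse_quad, "R^2", r2_quad)
```

Note that the in-sample R² of the quadratic fit can never be lower than that of the linear fit, because the linear model is a special case of the quadratic one. So a higher value on its own is not evidence of curvature.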

Interesting question. Some of the trouble of defining a curved correlation will be deciding on what kind of curvature you want to measure. After all, a logarithm-type of graph has different curvature than a quadratic. Further, determining the sign will be challenging, since many curves (such as quadratics) allow for increasing and decreasing sections. I’ve wondered if the concavity of a parabola (up-opening vs down-opening) could be used for this, but parabolas are just one type of curve. (Maybe you can do this if you restrict to convex or concave functions.) — Dave, Nov 09 '22 at 06:59
(1) What do you mean by "measuring"? If you want a measure of the "strength" of such a correlation, then you could indeed run a polynomial regression and report the MSE. Possibly cross-validated, otherwise if you re-ran this for higher order polynomials, you would "find" that the "second-order correlation" is smaller than the "third-order correlation" and so on. Conversely, if you want to do statistical inference, the null and alternative hypotheses will need some thinking about - are $x$ and $x^3$ for $-1<x<1$ "significantly second order correlated"? ... — Stephan Kolassa, Nov 09 '22 at 07:41
... (2) Especially for inference, the question comes up whether you want to test a specific polynomial correlation, or a general second-order polynomial, or a general polynomial of up to second order. Perhaps you could explain what you want to do with such a nonlinear correlation? — Stephan Kolassa, Nov 09 '22 at 07:42
Another way to consider Stephan's comments is that every regression you could estimate for the two variables in your plot is, in a sense, a correlation measurement. Testing and comparing arbitrarily many regressions has problems with false discovery and statistical validity, so "just try stuff" isn't a great way to go about it: you need to be specific about what questions you want to ask your data & how you want to ask it. The plot you show is roughly monotonic & Spearman's correlation would characterize the extent. Lots of nonlinear functions are monotonic, so Spearman's is an answer. — Sycorax, Nov 10 '22 at 03:28
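The cross-validation point raised in the comments can be illustrated with a small sketch. Everything here is assumed for illustration: the simulated data (a quadratic truth plus noise), the choice of 5 folds, and the helper names. In-sample MSE can only go down as the polynomial order goes up, whereas cross-validated MSE does not have to.

```python
import numpy as np

# Simulated data for illustration only: true relationship is quadratic
rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, size=120)
y = x**2 + rng.normal(scale=1.0, size=x.size)

def in_sample_mse(x, y, degree):
    coefs = np.polyfit(x, y, degree)
    return float(np.mean((y - np.polyval(coefs, x))**2))

def cv_mse(x, y, degree, k=5, seed=0):
    """k-fold cross-validated MSE of a polynomial fit of the given degree."""
    idx = np.random.default_rng(seed).permutation(len(x))
    fold_errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # indices not in the held-out fold
        coefs = np.polyfit(x[train], y[train], degree)
        resid = y[fold] - np.polyval(coefs, x[fold])
        fold_errors.append(np.mean(resid**2))
    return float(np.mean(fold_errors))

degrees = [1, 2, 3, 4, 5]
in_sample = {d: in_sample_mse(x, y, d) for d in degrees}
cross_val = {d: cv_mse(x, y, d) for d in degrees}

for d in degrees:
    print(d, in_sample[d], cross_val[d])
```

Comparing the two columns shows why the in-sample number alone would always favour the highest order tried, which is the concern about "finding" ever stronger higher-order correlations.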

score 1 · Answer 1 · answered Jan 18 '24 at 18:02

You may be interested in distance correlation.

Distance correlation measures both linear and nonlinear association between two random variables or random vectors. This is in contrast to Pearson's correlation, which can only detect a linear association between two random variables.

https://en.wikipedia.org/wiki/Distance_correlation
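A minimal NumPy sketch of the sample distance correlation for two one-dimensional samples, following the double-centred distance matrix definition on the linked page. The function name `distance_correlation` and the test data (y = x² on a symmetric grid) are assumptions for illustration. This is the plain (biased) sample version and it builds n×n matrices, so it is only meant for small n.

```python
import numpy as np

def _double_centered(v):
    # Pairwise absolute distances, then subtract row/column means and add the grand mean
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation of two 1-D arrays of equal length."""
    A = _double_centered(np.asarray(x, dtype=float))
    B = _double_centered(np.asarray(y, dtype=float))
    dcov2 = max(float(np.mean(A * B)), 0.0)  # squared distance covariance
    dvar2_x = float(np.mean(A * A))          # squared distance variances
    dvar2_y = float(np.mean(B * B))
    denom = np.sqrt(dvar2_x * dvar2_y)
    return float(np.sqrt(dcov2 / denom)) if denom > 0 else 0.0

# Illustrative data: a symmetric parabola, where Pearson's r is essentially zero
x = np.linspace(-1.0, 1.0, 201)
y = x**2

r_pearson = float(np.corrcoef(x, y)[0, 1])
dcor_xy = distance_correlation(x, y)
dcor_linear = distance_correlation(x, 2.0 * x + 1.0)
print(r_pearson, dcor_xy, dcor_linear)
```

Here Pearson's r is zero up to rounding even though y is a deterministic function of x, while the distance correlation is clearly positive. For an exactly linear relationship it equals 1. Unlike Pearson's r it has no sign, so it says nothing about direction.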

Marcus Chiu

For more about distance correlation, click here – kjetil b halvorsen Feb 24 '24 at 04:49
