I have analyzed various pairs of tickers on the NYSE. I used a brute-force algorithm to generate every possible pair and then tested each pair for cointegration/stationarity.

I found one pair that backtested to a 40% return over the last year. But the components of the pair are totally unrelated and not even in the same sector, and I cannot dream up a scenario that explains the cointegration. So I guess it's down to luck and I should leave it alone.
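To put a rough number on the "down to luck" suspicion, here is a minimal sketch of the multiple-testing arithmetic for an all-pairs scan. The ticker count of 100 and the 5% significance level are assumptions for illustration, not figures from the analysis above, and the calculation treats the tests as independent, which real pair tests are not because pairs share tickers.

```python
from math import comb


def multiple_testing_summary(n_tickers: int, alpha: float = 0.05) -> dict:
    """Back-of-the-envelope false-discovery numbers for an all-pairs scan.

    Assumes no pair is truly cointegrated and that the tests are
    independent. Both are simplifications, so read the results as an
    order-of-magnitude guide only.
    """
    n_pairs = comb(n_tickers, 2)             # number of unordered pairs
    expected_false_positives = alpha * n_pairs
    fwer = 1.0 - (1.0 - alpha) ** n_pairs    # P(at least one false positive)
    bonferroni_alpha = alpha / n_pairs       # per-test level that keeps FWER <= alpha
    return {
        "n_pairs": n_pairs,
        "expected_false_positives": expected_false_positives,
        "fwer": fwer,
        "bonferroni_alpha": bonferroni_alpha,
    }


# Hypothetical universe size; replace with the real number of tickers scanned.
summary = multiple_testing_summary(n_tickers=100, alpha=0.05)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With thousands of pairs, a handful of spurious "cointegrated" results is the expected outcome, so a single strong backtest with no economic story behind it is weak evidence on its own.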

How should I have gone about my search? Start with a particular sector? Is there a systematic way of finding candidate pairs?

brownie74
  • This is an example of the high chance of making one or more false discoveries when you test a large number of hypotheses/experiments. Statisticians speak of a high "family-wise error rate" (FWER) even though each individual test/experiment has a low error rate. This problem is not easy to deal with, and it shows up whenever you do massive testing. – nbbo2 Apr 18 '21 at 19:36
  • Another point to keep in mind: it depends on the significance level of your test. Say you test 20 pairs at the 5% significance level. You would then expect about 1 of them to pass as a false positive even though no significant cointegration relationship actually exists. – Hamish Gibson Apr 19 '21 at 00:09
  • I have extended my program as follows, and I would love some feedback. I run PCA on the returns of the whole set of tickers, which shows proportionally how much each stock contributes to the returns. I then feed that into k-means clustering to form clusters of stocks with similar profiles (similar in terms of their contribution to returns). I then pick stocks from the same cluster and run my cointegration/stationarity tests on those similar stocks only. Is this a better approach? – brownie74 Apr 21 '21 at 05:51
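The PCA-then-k-means screening described in the last comment could look roughly like the sketch below. It is one possible reading of that description, not the commenter's actual code: it assumes "contribution to returns" means each stock's loadings on the leading principal components, uses NumPy with a hand-rolled k-means, and runs on synthetic returns with hypothetical tickers (`A1` to `B3`) driven by two made-up factors.

```python
from itertools import combinations

import numpy as np


def pca_loadings(returns: np.ndarray, n_components: int) -> np.ndarray:
    """returns has shape (n_days, n_stocks); output is (n_stocks, n_components)."""
    # Standardize each stock so high-volatility names do not dominate the PCA.
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    # Rows of vt are the principal directions of the standardized data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return vt[:n_components].T


def kmeans(points: np.ndarray, k: int, n_iter: int = 100, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster ends up empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def candidate_pairs(tickers, labels):
    """Only pairs whose members share a cluster go on to cointegration testing."""
    return [
        (a, b)
        for (i, a), (j, b) in combinations(enumerate(tickers), 2)
        if labels[i] == labels[j]
    ]


# Synthetic data: three stocks follow factor A, three follow factor B.
rng = np.random.default_rng(42)
n_days = 500
factor_a = rng.normal(size=n_days)
factor_b = rng.normal(size=n_days)
tickers = ["A1", "A2", "A3", "B1", "B2", "B3"]  # hypothetical names
returns = np.column_stack(
    [factor_a + 0.3 * rng.normal(size=n_days) for _ in range(3)]
    + [factor_b + 0.3 * rng.normal(size=n_days) for _ in range(3)]
)

labels = kmeans(pca_loadings(returns, n_components=2), k=2)
pairs = candidate_pairs(tickers, labels)
print(pairs)
```

Screening this way shrinks the number of hypotheses tested (here from 15 possible pairs to the within-cluster ones), which lowers the false-discovery burden but does not remove it, so the surviving pairs still need a multiple-testing correction or out-of-sample checks.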

0 Answers