Speed up Pandas EWM pairwise correlations

Question

I have a df called "log_returns_5d". Here's a small sample:

              Gasoil  Gasoline       Oil       Gas
Date                                              
2011-01-07  0.009059 -0.007125 -0.037390  0.003793
2011-01-10 -0.013459  0.011151 -0.025067 -0.054915
2011-01-11  0.032774  0.027211  0.019039 -0.040233

For all assets (column headers) I want to calculate pairwise ewm correlations. I tried:

st_correlations = log_returns_5d.ewm(alpha=0.01, adjust=False).corr(method='pearson', min_periods=2000)

but I got a 3-dimensional df, with pairwise correlations being calculated for every single date:

                                         Gasoil       Gasoline  \
Date                                                             
2010-12-31 Gasoil                           NaN            NaN   
           Gasoline                         NaN            NaN   
           Oil                              NaN            NaN   
           Gas                              NaN            NaN   
           Canola                           NaN            NaN   
...                                         ...            ...   
2021-07-28 CNHJPY                      0.295644       0.189685   
           SEKJPY                      0.452436       0.418866   
           NOKJPY                      0.588999       0.591803   
           NOKSEK                      0.497179       0.542514   
           MXNJPY                      0.322246       0.349632

In order to get only the most updated correlations, I wrote the following:

st_correlations = st_correlations.reset_index()
max_date = st_correlations.Date. max() 
st_correlations = st_correlations.loc[(st_correlations['Date'] == max_date)]

which works. The whole code is quite slow to run though. Is there a way to get what I need from the begininning?

I think my problem is similar to the one described here, but I can't get to apply the proposed solution.

Speed up Pandas EWM pairwise correlations

0 Answers0