
My problem is to speed up the following code:

for i in range(1, len(Vol)):
    if Vol.loc[i, 'RIC'] == Vol.loc[i - 1, 'RIC']:
        Vol.loc[i, 'Sigma'] = Vol.loc[i - 1, 'Sigma'] * 0.88 + Vol.loc[i, 'k*Jump']
    else:
        Vol.loc[i, 'Sigma'] = Vol.loc[i, 'k*Jump']

It calculates an EMA-style running value of a stock price metric, "Jump": each row's Sigma is 0.88 times the previous row's Sigma plus the new current observation k*Jump. The calculation should restart whenever the next row belongs to a different stock. The dataframe holds about 20 million rows of intraday observations for 60 stocks, each identified by its RIC code. With that many observations, the row-by-row loop quickly becomes extremely slow. Below is the expected outcome for a few rows:

|  Index       |     RIC      |     k*Jump       |    Sigma      |
|:-------------|--------------|------------------|---------------|
|      1       |     AAPL.O   |     0.0789763    |   0.0789763   |
|      2       |     AAPL.O   |     0.395784     |   0.465283    |
|      3       |     AAPL.O   |     0.184731     |   0.59418     |
|      4       |     AAPL.O   |     0            |   0.522878    |
|      ...     |     ...      |       ...        |   ...         |
|      250457  |     ABBV.O   |     0.5743       |   0.5743      |
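
For reference, here is a minimal sketch of one vectorized direction that might avoid the row-by-row `.loc` loop. It is not a tested solution on the full data. It relies on the fact that pandas `ewm(alpha=a, adjust=False).mean()` computes `y[t] = (1 - a) * y[t-1] + a * x[t]` with `y[0] = x[0]`, so dividing every non-first observation of a block by `a = 0.12` should reproduce the recursion above. The function name `sigma_by_block` and the `demo` frame are made up for illustration, and the sketch assumes there are no NaN values in `k*Jump`.

```python
import pandas as pd


def sigma_by_block(vol, decay=0.88):
    """Sigma[t] = decay * Sigma[t-1] + k*Jump[t], restarting at every new RIC block.

    Hypothetical helper: assumes columns 'RIC' and 'k*Jump' and no NaNs in 'k*Jump'.
    """
    alpha = 1.0 - decay
    # True on the first row of each run of identical RIC values
    first = vol["RIC"] != vol["RIC"].shift()
    # Consecutive block id, so a RIC that reappears later starts a fresh block
    block = first.cumsum()
    # Keep the first observation of a block as is, divide the rest by alpha,
    # so that alpha * (x / alpha) contributes exactly x in the ewm recursion
    scaled = vol["k*Jump"].where(first, vol["k*Jump"] / alpha)
    return scaled.groupby(block).transform(
        lambda s: s.ewm(alpha=alpha, adjust=False).mean()
    )


# Small frame built from the rows shown in the table above
demo = pd.DataFrame(
    {
        "RIC": ["AAPL.O", "AAPL.O", "AAPL.O", "AAPL.O", "ABBV.O"],
        "k*Jump": [0.0789763, 0.395784, 0.184731, 0.0, 0.5743],
    }
)
demo["Sigma"] = sigma_by_block(demo)
print(demo)
```

Grouping on the consecutive block id rather than on `RIC` itself mirrors the original loop, which only compares each row with the one directly before it.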

Thank you

Babak Fi Foo
  • Missing input data makes this hard to help you with. Have a look at how you can provide a reproducible Pandas example here: https://stackoverflow.com/a/20159305/463796 – w-m Oct 06 '21 at 17:38
