I have been trying to find an efficient way to compute a cumulative sum over a pandas Series.

>>> import pandas as pd
>>> df = pd.DataFrame()
>>> df['a'] = [1, 3, 1, 4, 2, 5, 3, 8]
>>> df
       a
    0  1
    1  3
    2  1
    3  4
    4  2
    5  5
    6  3
    7  8

The expected output:

df
       a   b
    0  1   1
    1  3   4
    2  1   5
    3  4   9
    4  2  11
    5  5  16
    6  3  19
    7  8  27

Each b[i] equals sum(a[j] for j<=i)

My current approach is:

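df['b'] = df['a']
# Add the previous running total to each row, using label-based .loc
# (.ix is deprecated and was removed in pandas 1.0).
for i in range(1, len(df)):
    df.loc[i, 'b'] += df.loc[i - 1, 'b']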

This is not concise enough, and I would like to get rid of the loop. Any advice is appreciated.

MaNKuR
Garvey

1 Answer

df['b'] = df.a.cumsum()

Reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.cumsum.html
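To check that `cumsum()` matches the definition in the question (each `b[i]` is the sum of `a[j]` for `j <= i`), here is a minimal, self-contained sketch. It uses the data from the question; the names `running` and `expected` are introduced here only for the comparison loop.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'a': [1, 3, 1, 4, 2, 5, 3, 8]})

# Vectorized running total: b[i] = a[0] + ... + a[i]
df['b'] = df['a'].cumsum()

# The same totals computed with an explicit loop, for comparison only
running, expected = 0, []
for value in df['a']:
    running += value
    expected.append(running)

print(df['b'].tolist())  # [1, 4, 5, 9, 11, 16, 19, 27]
```

The loop is only there as a sanity check; in practice the single `cumsum()` call is all that is needed.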

John Zwinck