
Given the following DataFrame df:

import numpy as np
import pandas as pd

df = pd.DataFrame({'distance': [0, 1, 2, np.nan, 3, 4, 5, np.nan, np.nan, 6]})

   distance
0       0.0
1       1.0
2       2.0
3       NaN
4       3.0
5       4.0
6       5.0
7       NaN
8       NaN
9       6.0

I want to replace each NaN with the mean of the nearest non-NaN values before and after it.

Expected output:

   distance
0       0.0
1       1.0
2       2.0
3       2.5
4       3.0
5       4.0
6       5.0
7       5.5
8       5.5
9       6.0

I have seen this answer, but it deals with grouped data, which isn't my case, and I couldn't find anything else.

Kenan

2 Answers


If you don't want to use df.interpolate, you can compute the mean of the surrounding values manually by averaging df.ffill() and df.bfill():

(df.ffill() + df.bfill()) / 2

Out:

   distance
0       0.0
1       1.0
2       2.0
3       2.5
4       3.0
5       4.0
6       5.0
7       5.5
8       5.5
9       6.0
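
To make the reasoning behind the one-liner explicit, here is a minimal sketch that names the two intermediate fills. The helper name `fill_with_neighbor_mean` is hypothetical (it is not part of pandas); only `ffill` and `bfill` are actual pandas methods.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_mean(frame: pd.DataFrame) -> pd.DataFrame:
    """Replace each NaN with the mean of the nearest valid values around it.

    Hypothetical helper for illustration; not a pandas function.
    """
    prev_valid = frame.ffill()  # last valid value at or before each row
    next_valid = frame.bfill()  # first valid value at or after each row
    # On rows that already hold a value, both fills equal that value,
    # so averaging them leaves the row unchanged.
    return (prev_valid + next_valid) / 2


df = pd.DataFrame({'distance': [0, 1, 2, np.nan, 3, 4, 5, np.nan, np.nan, 6]})
result = fill_with_neighbor_mean(df)
print(result)
```

One caveat: a NaN at the very start or end of the column has a valid neighbor on only one side, so one of the two fills stays NaN there and the average is NaN as well. Such rows would need separate handling.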
Michael Szczesny

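How about using linear interpolation? Note that for a run of consecutive NaNs it produces evenly spaced values rather than one repeated mean, so rows 7 and 8 differ from the expected output: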

print(df.distance.interpolate())

0    0.000000
1    1.000000
2    2.000000
3    2.500000
4    3.000000
5    4.000000
6    5.000000
7    5.333333
8    5.666667
9    6.000000
Name: distance, dtype: float64
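
To check where linear interpolation and the neighbor mean actually disagree, the two results can be placed side by side. This is only a sketch: the column names `interpolated` and `neighbor_mean` and the variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'distance': [0, 1, 2, np.nan, 3, 4, 5, np.nan, np.nan, 6]})

comparison = pd.DataFrame({
    # Evenly spaced values across each gap
    'interpolated': df['distance'].interpolate(),
    # Same mean repeated across each gap
    'neighbor_mean': (df['distance'].ffill() + df['distance'].bfill()) / 2,
})

# Boolean mask of rows where the two approaches give different values
differs = ~np.isclose(comparison['interpolated'], comparison['neighbor_mean'])
print(comparison[differs])
```

For a single isolated NaN both approaches give the same value; they diverge only inside gaps of two or more consecutive NaNs, which here means rows 7 and 8.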
robertwest