
I have the following data frame:

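import numpy as np
import pandas as pd

# np.nan is used because the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'Name': ['John', 'John', 'John', 'Hans', 'Hans', 'Hans'],
                   'SA': ['10', np.nan, '10', np.nan, np.nan, '15'],
                   'TA': ['8', np.nan, '8', '12', np.nan, np.nan]})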

Out[5]: 
   Name   SA   TA
0  John   10    8
1  John  NaN  NaN
2  John   10    8
3  Hans  NaN   12
4  Hans  NaN  NaN
5  Hans   15  NaN

My goal sounds simple. Within each name (e.g. John), look at one column at a time (say SA). If every value in that column is NaN for that name, do nothing. If at least one value is an actual number, replace the remaining NaN values for that name with that number. In other words, the desired output is:

Out[7]: 
   Name  SA  TA
0  John  10   8
1  John  10   8
2  John  10   8
3  Hans  15  12
4  Hans  15  12
5  Hans  15  12

Presumably this needs groupby, but I'm not sure how to set up the code. Any suggestions?
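
For reference, here is one possible sketch of the rule described above, not necessarily the best way to do it. It assumes each name has at most one distinct non-NaN value per column, as in the sample data, and the helper names `cols` and `filled` are made up for illustration. It takes the first non-null value per group with `groupby(...).transform('first')` and uses that to fill the gaps:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'John', 'John', 'Hans', 'Hans', 'Hans'],
                   'SA': ['10', np.nan, '10', np.nan, np.nan, '15'],
                   'TA': ['8', np.nan, '8', '12', np.nan, np.nan]})

# Columns to fill (hypothetical helper name).
cols = ['SA', 'TA']

# 'first' picks the first non-null value in each group; transform
# broadcasts it back so the result has the same index as df.
filled = df.groupby('Name')[cols].transform('first')

# Only the missing cells are replaced; a group whose values are all NaN
# has nothing to fill from and stays NaN.
df[cols] = df[cols].fillna(filled)

print(df)
```

If a name could carry several different values in one column, `'first'` would silently pick the earliest one, so a per-group `ffill().bfill()` might be a better fit in that case.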

Ben