I have the following data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'John', 'John', 'Hans', 'Hans', 'Hans'],
                   'SA': ['10', np.nan, '10', np.nan, np.nan, '15'],
                   'TA': ['8', np.nan, '8', '12', np.nan, np.nan]})
Out[5]:
Name SA TA
0 John 10 8
1 John NaN NaN
2 John 10 8
3 Hans NaN 12
4 Hans NaN NaN
5 Hans 15 NaN
My objective sounds simple: for each name (e.g. John), look at one column at a time (say SA). If all values in that column for that name are NaN, do nothing. If at least one of them is not NaN, replace the remaining NaN values for that name with that value. Put differently, the desired output should look like this:
Out[7]:
Name SA TA
0 John 10 8
1 John 10 8
2 John 10 8
3 Hans 15 12
4 Hans 15 12
5 Hans 15 12
I assume this needs groupby, but I'm not sure how to set up the code. Any solutions?
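For reference, one possible way to express the rule above is sketched here. It assumes each name has at most one distinct non-NaN value per column (as in the sample data); if a group held several different values, this version would simply use the first one. The names cols and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'John', 'John', 'Hans', 'Hans', 'Hans'],
                   'SA': ['10', np.nan, '10', np.nan, np.nan, '15'],
                   'TA': ['8', np.nan, '8', '12', np.nan, np.nan]})

cols = ['SA', 'TA']  # columns to fill (illustrative name)

# Work on a copy so the original frame stays untouched.
result = df.copy()

# transform('first') broadcasts the first non-NaN value of each group
# back to every row of that group; a group that is entirely NaN stays NaN.
group_values = result.groupby('Name')[cols].transform('first')

# Only the missing cells are filled; existing values are left as they are.
result[cols] = result[cols].fillna(group_values)

print(result)
```

Using fillna with the broadcast group values, rather than overwriting the columns outright, leaves existing entries alone and keeps all-NaN groups as NaN.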