
I am trying to calculate the Pearson correlation coefficient for all columns in my dataframe, but when I make a heatmap I get NaN values in the rows and columns for variables that contain only zeroes. Any suggestions on how to fix it? Here are the code and a screenshot of the output:

# Calculate the correlation coefficients
corr = dfno.corr(method='pearson')
# Plot it in the next line
corr.round(2).style.background_gradient(cmap='coolwarm')

Pearson Heatmap

Jswojcik

1 Answer

NaN appears when at least one of your columns is constant. A constant column has a standard deviation of 0, so computing Pearson's correlation divides by 0 and the result is NaN. Depending on your application, I think the easiest way to deal with this is to replace the NaNs with 0 in your heatmap output.

corr.fillna(0)
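
To see the effect in isolation, here is a minimal sketch. The dataframe `df` and its column names are made up for illustration and stand in for `dfno`; dropping the constant columns before correlating is shown only as a possible alternative to `fillna(0)`, not something the question requires.

```python
import pandas as pd

# Hypothetical data standing in for dfno: "zeros" is a constant column.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.5],
    "zeros": [0.0, 0.0, 0.0, 0.0],
})

# The constant column has zero standard deviation, so every
# correlation involving it comes out as NaN.
corr = df.corr(method="pearson")

# Option 1: replace the NaNs with 0. fillna returns a new
# DataFrame, so the result has to be assigned.
corr_filled = corr.fillna(0)

# Option 2 (an alternative, not from the answer): find the constant
# columns and leave them out before computing the correlation.
constant_cols = [c for c in df.columns if df[c].nunique() <= 1]
corr_dropped = df.drop(columns=constant_cols).corr(method="pearson")

print(corr)
print(corr_filled)
print(corr_dropped)
```

Note that `fillna(0)` also sets the diagonal entry of a constant column to 0, so which option fits better depends on how the heatmap will be read.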
Ehsan
  • Yeah makes sense now that you pointed that out. Thanks! – Jswojcik Jun 29 '20 at 21:35
  • @Jswojcik You are welcome. Please check out https://stackoverflow.com/help/someone-answers on how to accept answers, and welcome to SO. – Ehsan Jun 29 '20 at 21:36