7

I've decided to give seaborn version 0.11.0 a go! I'm playing around with the displot function, which, as I understand it, will replace distplot. I'm trying to figure out how to plot a Gaussian fit onto a histogram. Here's some example code.

import seaborn as sns
import numpy as np
from scipy.stats import norm  # needed for fit=norm below
x = np.random.normal(size=500) * 0.1

With distplot I could do:

sns.distplot(x, kde=False, fit=norm)


But how to go about it in displot or histplot?

UserR6

3 Answers

4

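Sorry I'm late to the party. Check whether this meets your requirement: it fits a normal distribution with scipy and draws its PDF over a density-normalized matplotlib histogram.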

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

data = np.random.normal(size=500) * 0.1
mu, std = norm.fit(data)

# Plot the histogram.
plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

# Plot the PDF.
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
plt.plot(x, p, 'k', linewidth=2)
plt.show()
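
For reference, norm.fit returns maximum-likelihood estimates, which for a normal distribution are the sample mean and the population (ddof=0) standard deviation. A minimal check, assuming a seeded generator (my choice, only to make the run reproducible):

```python
import numpy as np
from scipy.stats import norm

# Seeded generator is an assumption for reproducibility, not part of the answer.
rng = np.random.default_rng(0)
data = rng.normal(size=500) * 0.1

# Maximum-likelihood fit of a normal distribution: returns (loc, scale).
mu, sigma = norm.fit(data)

# Compare with the plain sample statistics; np.std uses ddof=0 by default.
print(mu, data.mean())
print(sigma, data.std(ddof=0))
```

So passing data.mean() and data.std() to norm.pdf should give the same curve as passing the values from norm.fit.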


Regi Mathew
3

I really miss the fit parameter too. It doesn't appear that this functionality was replaced when distplot was deprecated. Until that hole is plugged, I wrote a short function that adds a normal distribution overlay to my histplot. I paste the function at the top of a file along with the imports, and then adding the overlay takes one line.

import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

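def normal(mean, std, color="black"):
    """Overlay a normal PDF with the given mean and std on the current axes."""
    x = np.linspace(mean - 4*std, mean + 4*std, 200)  # cover +/- 4 standard deviations
    p = stats.norm.pdf(x, mean, std)
    plt.plot(x, p, color=color, linewidth=2)

data = np.random.normal(size=500) * 0.1
ax = sns.histplot(x=data, stat="density")  # density scale matches the PDF
normal(data.mean(), data.std())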


If you would rather use stat="probability" instead of stat="density", you can rescale the fit curve so that its peak matches a given height, with something like this:

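def normal(mean, std, histmax=None, color="black"):
    """Overlay a normal PDF; if histmax is given, rescale the curve so its peak equals histmax."""
    x = np.linspace(mean - 4*std, mean + 4*std, 200)
    p = stats.norm.pdf(x, mean, std)
    if histmax is not None:
        p = p * histmax / p.max()
    plt.plot(x, p, color=color, linewidth=2)

data = np.random.normal(size=500) * 0.1
ax = sns.histplot(x=data, stat="probability")
# ax.get_ylim()[1] is the top of the y-axis, so this is a visual scaling only
normal(data.mean(), data.std(), histmax=ax.get_ylim()[1])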
Trenton McKinney
ohtotasche
1

So far, the closest I've come is:

sns.histplot(x,stat="probability", bins=30, kde=True, kde_kws={"bw_adjust":3})

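But I think this just increases the smoothing of the plotted KDE (bw_adjust scales the kernel bandwidth). The result is still a nonparametric estimate rather than a Gaussian fit, so it isn't exactly what I'm going for :'(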

UserR6
  • ...and what is it that you are going for? I don't think this is very clear, either from the OP or from this "answer" – mikuszefski Nov 02 '20 at 06:44
  • I want to plot a Gaussian / normal distribution fit curve over my data. The 'answer' posted uses seaborn's kde option, which plots the KDE. – UserR6 Nov 03 '20 at 08:15
  • You probably have to make an additional `lineplot` and [overlay](https://stackoverflow.com/q/32899463/803359) – mikuszefski Nov 03 '20 at 09:09