I can draw a histogram in Python, but I can't add a density curve to it. I have seen many snippets that add a density curve to a histogram in different ways, but I'm not sure how to apply them to my code.

I have set density=True, but no density curve appears on the histogram.

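import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

# density=True normalizes the bar heights; it does not draw a curve
hist, bins = np.histogram(X, bins=10, density=True)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.show()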
Grayrigel
Dexter1611
    [Take a look at this answer using seaborn](https://stackoverflow.com/a/32803224/3595907) – DrBwts Oct 21 '20 at 16:16
    You'll need seaborn's `distplot()` or `histplot()`. The function names and parameters changed a bit in the latest version (0.11). Note that `np.histogram(..., density=True)` means that the histogram will be normalized such that the total area sums to 1, so it can share the y-axis with a kdeplot. – JohanC Oct 21 '20 at 16:22
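To illustrate the normalization described in the comment above, here is a quick check on synthetic data. This is only a sketch: the seed and the variable names are my own choices, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
x = rng.standard_normal(100)

# density=True rescales the bar heights so that sum(height * bin_width) is 1
heights, edges = np.histogram(x, bins=10, density=True)
area = np.sum(heights * np.diff(edges))
print(area)  # approximately 1.0, up to floating-point rounding
```

Because the bars already integrate to 1, a density curve can be drawn on the same y-axis without any rescaling.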

2 Answers

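pandas also has a KDE plot (it needs SciPy installed):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

# histogram, normalized to unit area as in the question
hist, bins = np.histogram(X, bins=10, density=True)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width, zorder=1)

# density plot, drawn on top of the bars
df['A'].plot.kde(zorder=2, color='C1')
plt.show()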
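If you would rather not go through the pandas plotting wrapper, roughly the same curve can be computed by hand. The following is a sketch, assuming SciPy is available: it uses `scipy.stats.gaussian_kde` with its default bandwidth (Scott's rule), and the grid size of 200 and the padding of 1 on each side are arbitrary choices of mine.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

# Histogram normalized to unit area, drawn as in the question
hist, bins = np.histogram(X, bins=10, density=True)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width, zorder=1)

# Gaussian KDE evaluated on a grid that spans the data
kde = gaussian_kde(X)  # default bandwidth: Scott's rule
xs = np.linspace(X.min() - 1, X.max() + 1, 200)  # grid and padding are arbitrary
density = kde(xs)
plt.plot(xs, density, color='C1', zorder=2)
# plt.show()  # uncomment to display the figure
```

Having `xs` and `density` as plain arrays makes it easy to restyle the curve or reuse the values elsewhere.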

Quang Hoang

Here is an approach using seaborn's distplot function, which was also mentioned in the comments:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

sns.distplot(X, kde=True, bins=20, hist=True)
plt.show()

However, distplot is deprecated as of seaborn 0.11 and will be removed in a future version. The alternatives are histplot and displot.

sns.histplot

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

sns.histplot(X, kde=True, bins=20)
plt.show()

sns.displot

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))
X = df['A']

sns.displot(X, kde=True, bins=20)
plt.show()

Grayrigel
  • How can I adjust the density curve so that its value sits at the center of each bin rectangle rather than at the top of its left edge (so that the maxima of the KDE and the bins coincide)? – mins Dec 02 '20 at 10:20