Many examples are available on the internet showing how to fit a probability density function with scipy to a given vector X and the frequency of each of its components.
In my case, I would like to fit a probability function not on one variable but on two variables, X and Y. However, I don't understand how to use the scipy functions for this task.
My problem: I have information about 500 people between 15 and 35 years old. Some of them (53) are sick, and the probability of being sick varies with their age category (ages are binned in two-year steps).
import numpy as np
import matplotlib.pyplot as plt
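# age category edges (2-year steps)
categories = np.array([15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35])
midCategories = (categories[:-1] + categories[1:]) / 2.0
# number of people and number of sick people in each category
peopleByCat = np.array([49, 52, 45, 49, 54, 52, 50, 48, 50, 51])
peopleSick = np.array([5, 9, 11, 12, 8, 4, 2, 1, 0, 1])
# probability of being sick in each category (cast to float so the division is not truncated under Python 2)
probaByCat = peopleSick / peopleByCat.astype(float)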
From these data I can easily produce the following bar plots:
fig, ax = plt.subplots()
ax.set_xlabel("Age category")
ax.set_ylabel("People by category")
ax.bar(midCategories, peopleByCat, width=categories[2]-categories[1], color='g')
fig, ax = plt.subplots()
ax.set_xlabel("Age category")
ax.set_ylabel("Probability of being sick by category")
ax.bar(midCategories, probaByCat, width=categories[2]-categories[1], color='g')
However, I don't understand how I can use scipy.stats to fit a probability density function to these X-Y data.
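To make the goal concrete, here is a minimal sketch of one possible reading of the problem, not a definitive solution. The `fit` methods of scipy.stats distributions expect raw samples rather than (x, y) pairs, so this sketch instead treats the task as curve fitting: it uses `scipy.optimize.curve_fit` to adjust a scaled normal PDF to the per-category probabilities. The helper name `scaled_normal`, the choice of a normal shape, the extra amplitude parameter, the initial guess, and the bounds are all assumptions made for illustration.

```python
import numpy as np
from scipy import optimize, stats

# Data from the question
categories = np.array([15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35])
midCategories = (categories[:-1] + categories[1:]) / 2.0
peopleByCat = np.array([49, 52, 45, 49, 54, 52, 50, 48, 50, 51])
peopleSick = np.array([5, 9, 11, 12, 8, 4, 2, 1, 0, 1])
probaByCat = peopleSick / peopleByCat.astype(float)


def scaled_normal(age, amplitude, mu, sigma):
    """Hypothetical model: a normal PDF multiplied by a free amplitude.

    The amplitude is an assumption: the per-category probabilities
    are not a density that integrates to 1, so a plain PDF cannot match them.
    """
    return amplitude * stats.norm.pdf(age, loc=mu, scale=sigma)


# Rough initial guess and loose bounds (illustrative values only);
# the bounds keep sigma positive during the optimisation.
p0 = [1.0, 20.0, 3.0]
bounds = ([0.0, 15.0, 0.1], [np.inf, 35.0, 20.0])

params, covariance = optimize.curve_fit(
    scaled_normal, midCategories, probaByCat, p0=p0, bounds=bounds
)
amplitude, mu, sigma = params

# Evaluate the fitted curve on a fine grid, e.g. to overlay on the bar plot
ages = np.linspace(categories[0], categories[-1], 200)
fitted = scaled_normal(ages, amplitude, mu, sigma)

print("amplitude=%.3f, mu=%.2f, sigma=%.2f" % (amplitude, mu, sigma))
```

The resulting `fitted` array could be drawn over the second bar plot with `ax.plot(ages, fitted)`. This least-squares fit weights every category equally and ignores the different sample sizes, so it is only a starting point.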
Thanks for your help