
I have a dataframe and I'm plotting a histogram based on column a.

#creating some random data
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.random.rand(8,4), columns=list('abcd'))
df['id'] = [1,2,1,4, 1,3,4,5]
df
Out[3]: 
          a         b         c         d  id
0  0.464282  0.538121  0.898136  0.832862   1
1  0.245357  0.905359  0.537300  0.062490   2
2  0.079354  0.052991  0.304458  0.863500   1
3  0.595605  0.752262  0.397301  0.678958   4
4  0.443702  0.611610  0.932731  0.051798   1
5  0.852728  0.823830  0.716792  0.857282   3
6  0.970739  0.282138  0.098921  0.187915   4
7  0.946422  0.027597  0.540050  0.505796   5


sns.displot(data=df, x='a', hue='id', linewidth=0)


I want to get the id of each point that falls in each bin. This link explains how to get the number of points in each bin, but I can't figure out how to relate that to the id column in my dataframe.

Is there a way to extract this information in Python?

user42
    See e.g. [Binning column with python pandas](https://stackoverflow.com/questions/45273731/binning-column-with-python-pandas/45273750). You can `groupby` the bins to get the rows that belong to each bin. – JohanC Jun 16 '21 at 09:57
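The approach suggested in the comment can be sketched roughly as follows. This is a minimal sketch, not a confirmed answer: it assumes the bin edges come from NumPy's "auto" rule via `np.histogram_bin_edges`, which may not exactly match the edges seaborn draws, and the names `edges`, `bin`, and `ids_per_bin` are only illustrative. It uses a seeded generator so the data is reproducible, so the values differ from the table in the question.

```python
import numpy as np
import pandas as pd

# Reproducible stand-in for the random data in the question
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((8, 4)), columns=list("abcd"))
df["id"] = [1, 2, 1, 4, 1, 3, 4, 5]

# Assumption: edges from NumPy's "auto" rule; seaborn's plotted edges may differ
edges = np.histogram_bin_edges(df["a"], bins="auto")

# Label each row with the interval of `a` it falls into.
# include_lowest=True keeps the minimum value inside the first bin.
df["bin"] = pd.cut(df["a"], bins=edges, include_lowest=True)

# Collect the ids for every bin that contains at least one point
ids_per_bin = df.groupby("bin", observed=True)["id"].agg(list)
print(ids_per_bin)
```

Note that `pd.cut` closes intervals on the right by default, while `np.histogram` closes them on the left (except the last bin), so a value sitting exactly on an interior edge could land in a neighbouring bin. To line the result up with the plot, one option is to pass the same `edges` explicitly to the plotting call through its `bins` argument.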

0 Answers