
Organization is the index. I need a horizontal bar plot of the total number of experiment attempts conducted by each organization, with the most at the top and the fewest at the bottom.

The data frame looks like this:

Organization  DateTime  Attempts
a             ....      Failed
b             ....      Success
b             ....      Failed
a             ....      Partial Success
a             ....      Success
b             ....      Partial Success

I gather that Attempts is categorical data and that pandas has a value_counts() function for counting it, but I don't know how to write the code.


Sunderam Dubey
  • Please provide an example dataset. – mozway Aug 04 '21 at 05:24
  • Dear Anne, as often indicated here, you should ideally post a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) so that this community can help you by working on runnable code. You'd get much quicker help and guidance. Good luck! – massimopinto Aug 04 '21 at 06:09
  • An image is not really helpful. As @massimopinto wrote, one should be able to reproduce the problem. Please also read [**How to make good reproducible pandas examples**](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – mozway Aug 04 '21 at 06:23

1 Answer


Let's assume this input:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'company': np.random.choice(['A', 'B', 'C', 'D'], 50),
                   'Location': 'non-relevant',
                   'Outcome': np.random.choice(['Success', 'Failure'], 50),
                  }).set_index('company')
df.head()
             Location  Outcome
company                       
A        non-relevant  Success
D        non-relevant  Failure
B        non-relevant  Failure
A        non-relevant  Failure
D        non-relevant  Failure
...

Calculate the counts per group and sort:

>>> df2 = df.groupby('company')['Outcome'].count().sort_values()
>>> df2
company
C     8
A    12
B    12
D    18
Name: Outcome, dtype: int64

Plot. barh draws the first row at the bottom, so the ascending sort puts the largest count at the top:

df2.plot.barh()
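
The same idea can be sketched against the layout described in the question, where Organization is the index and Attempts holds the outcome. This is only an illustration under assumptions: the frame name `attempts`, the extra organization "c", and all values are made up, and the Agg backend is selected only so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import pandas as pd

# Hypothetical data mirroring the question's layout (values are invented).
attempts = pd.DataFrame(
    {"Attempts": ["Failed", "Success", "Failed", "Partial Success",
                  "Success", "Partial Success", "Failed"]},
    index=pd.Index(["a", "b", "b", "a", "a", "b", "c"], name="Organization"),
)

# Count rows per index value: every row is one attempt, whatever its outcome.
totals = attempts.groupby(level="Organization").size()

# Ascending sort: barh draws the first row at the bottom,
# so the organization with the most attempts ends up at the top.
totals = totals.sort_values()

ax = totals.plot.barh()
ax.set_xlabel("Total attempts")
```

Because the organization is the index here, `attempts.index.value_counts()` should give the same totals (sorted in descending order by default).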


And here is how to break the counts down by failure/success:

df2 = df.groupby('company')['Outcome'].value_counts().unstack('Outcome')
df2 = df2.loc[df2.sum(axis=1).sort_values().index]
df2.plot.barh(stacked=True)
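
For the question's own layout, a stacked version might look like the sketch below. It assumes Organization is the index and Attempts holds three outcome labels; the frame name `attempts`, the organization "c", and the values are hypothetical. Passing `fill_value=0` to `unstack` is an addition here, so that an organization with no rows for some outcome gets 0 instead of NaN.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import pandas as pd

# Hypothetical data mirroring the question's layout (values are invented).
attempts = pd.DataFrame(
    {"Attempts": ["Failed", "Success", "Failed", "Partial Success",
                  "Success", "Partial Success", "Failed"]},
    index=pd.Index(["a", "b", "b", "a", "a", "b", "c"], name="Organization"),
)

# One row per organization, one column per outcome.
by_outcome = (
    attempts.groupby(level="Organization")["Attempts"]
    .value_counts()
    .unstack(fill_value=0)
)

# Reorder rows by total attempts, ascending, so the largest total is drawn at the top.
order = by_outcome.sum(axis=1).sort_values().index
by_outcome = by_outcome.loc[order]

ax = by_outcome.plot.barh(stacked=True)
ax.set_xlabel("Attempts")
```

Whether "Partial Success" should stay a separate segment or be merged into another outcome is a choice for the asker; this sketch keeps it separate.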


mozway
  • 1. You can edit your question and add the content of df2 + graph. 2. Do you want to ignore the "partial" values or consider them as success/failure? (e.g. partial success -> success) – mozway Aug 05 '21 at 05:35