I'm trying to create a bar chart in seaborn that displays the values of two variables (Weight, Variance) for each row (Factor) in my data frame. Here is what my data looks like:

    Factor    Weight  Variance
    Growth    10%      0.15
    Value     20%      0.35

Here is my code:

    fig=plt.figure(figsize=(10,10))
    ax1=fig.add_subplot(221)
    sns.barplot(x=df.index, y=df[['Weight', 'Variance']], ax=ax1)

The above raises an error every time, and I can't debug it. What I'm trying to achieve is a single plot that shows two colored bars for each Factor: Weight in one color (e.g. red) and Variance in another (e.g. blue).

Anyone have suggestions or potential workarounds?

Thanks

Vikram Josyula

1 Answer

Aside from cleaning up your data into a tidy format, you need to reformat the text data (percentages) into numeric data types. Since that has nothing to do with barplots, I'll assume you can take care of that on your own and focus on the plotting and data structures instead:

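    import pandas
    import seaborn
    from matplotlib import pyplot

    # Weight is stored as a fraction (0.10 for 10%) so that it is numeric
    df = pandas.DataFrame({
        'Factor': ['Growth', 'Value'],
        'Weight': [0.10, 0.20],
        'Variance': [0.15, 0.35]
    })
    fig, ax1 = pyplot.subplots(figsize=(10, 10))
    # Reshape to long (tidy) form: one row per Factor/Variable pair
    tidy = df.melt(id_vars='Factor').rename(columns=str.title)
    # hue='Variable' draws one colored bar per variable within each Factor
    seaborn.barplot(x='Factor', y='Value', hue='Variable', data=tidy, ax=ax1)
    seaborn.despine(fig)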
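For readers who want to see the intermediate data, here is a minimal sketch of the two steps the answer leaves out: converting percent strings to numbers and inspecting the melted frame. It assumes Weight was stored as strings such as '10%', as it was typed in the question; the conversion shown is one common approach, not necessarily what the asker's real data needs.

```python
import pandas

# Assumption: Weight arrives as percent strings, as typed in the question.
df = pandas.DataFrame({
    'Factor': ['Growth', 'Value'],
    'Weight': ['10%', '20%'],
    'Variance': [0.15, 0.35],
})

# Strip the trailing percent sign and rescale to a fraction.
df['Weight'] = df['Weight'].str.rstrip('%').astype(float) / 100

# Melt to long form: the former column names move into a 'variable' column
# and their numbers into a 'value' column; str.title capitalizes the names.
tidy = df.melt(id_vars='Factor').rename(columns=str.title)
print(tidy)
```

The printed frame has four rows, one for each Factor/Variable pair, with columns Factor, Variable and Value. That is the shape `seaborn.barplot` needs in order to use `hue='Variable'`.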

Paul H
  • Nice. If you could show what the tidy dataframe looks like, I guess it would help! – jrjc Dec 05 '16 at 16:13
  • @jrjc I construct it in my code. It's easy enough for the OP to print it out. – Paul H Dec 05 '16 at 16:22
  • @PaulH thanks that helps. I was unaware I had to use reset_index command to unpivot the data. And yes, the data was in numeric form all along, it was just human error on my part when typing it into the dialogue box. – Vikram Josyula Dec 07 '16 at 13:44
  • I'm trying to use this example for a similar plotting exercise. However, when I run the exact code shown above with one small change, I removed the trailing parenthesis from this line: fig, ax1 = pyplot.subplotsfigsize=(10, 10)), I get the following error: AttributeError: 'int' object has no attribute 'bar'. – ewilan May 03 '17 at 15:09
  • I suppose there must be something missing because leaving the trailing parenthesis in place results in this error: `fig, ax1 = plt.subplotsfigsize=(10, 10)) ^ SyntaxError: invalid syntax` The interpreter complains about the trailing parenthesis. I'm running this with Python v2.7 in Jupyter Notebook. – ewilan May 03 '17 at 16:34
  • @ewilan Oh I see the opening paren is missing: `fig, ax1 = pyplot.subplots(figsize=(10, 10))` – Paul H May 03 '17 at 17:42
  • Conceptual question here... How do we know his data isn't in long format already? As a newb, it's pretty unclear to me what are variables and what are realizations in his data. My understanding is that variables are columns and samples are rows for "long" format. Or does long literally mean take the longest axis and make it into rows. I.e. if there are more variables than there are samples, then variables now belong as columns? After that, I'll have to go take a look at what melt does... – rocksNwaves Mar 10 '20 at 18:06
  • @rocksNwaves We know his data isn't long because of the OP's goals and the issues they were encountering. In other words, if the column names contain information about what the "value" is, and melting the dataframe solves the problem, then the data isn't "long" – Paul H Mar 15 '20 at 18:22
  • `tidy = df.melt(id_vars='Factor').rename(columns=str.title)` Can you please explain this part of your code? what does it mean `columns = str.title`? – mr.sanatbek Jul 28 '21 at 05:43
  • @mr.sanatbek [`str.title`](https://docs.python.org/3/library/stdtypes.html#str.title) is a built-in Python method that title-cases strings, e.g. `'abc'.title()` returns `'Abc'`. [`df.rename`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rename.html) accepts not only dicts (map keys to values) but also functions (apply function to all elements), so `df.rename(columns=str.title)` applies `str.title` to all column names, i.e. capitalizes them. – tdy Mar 04 '22 at 21:15
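The callable form of `rename` described in the last comment can be sketched as follows. The small `melted` frame is a hypothetical stand-in for the output of `df.melt(id_vars='Factor')`, and the dict-based rename is included only for comparison.

```python
import pandas

# Hypothetical stand-in for the melted frame, with melt's default
# lowercase column names 'variable' and 'value'.
melted = pandas.DataFrame({
    'Factor': ['Growth'],
    'variable': ['Weight'],
    'value': [0.10],
})

# Passing a function: it is applied to every column name.
by_function = melted.rename(columns=str.title)

# Passing a dict: only the listed names are mapped.
by_dict = melted.rename(columns={'variable': 'Variable', 'value': 'Value'})

print(list(by_function.columns))
print(list(by_dict.columns))
```

Both calls produce the same column names here. The function form is shorter when every column should be transformed the same way.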