3

I have a list called 'gender', and I counted the occurrences of each value in it with Counter:

gender = ['2',
          'Female,',
          'All Female Group,',
          'All Male Group,',
          'Female,',
          'Couple,',
          'Mixed Group,'....]

from collections import Counter
gender_count = Counter(gender)
gender_count
Counter({'2': 1,
     'All Female Group,': 222,
     'All Male Group,': 119,
     'Couple,': 256,
     'Female,': 1738,
     'Male,': 2077,
     'Mixed Group,': 212,
     'NA': 16})

I want to put this dict into a pandas DataFrame. Following the answer to "Convert Python dict into a dataframe", I used pd.Series:

s = pd.Series(gender_count, name='gender count')
s.index.name = 'gender'
s.reset_index()

This gives me the DataFrame I want, but I don't know how to save the result of these steps as a pandas DataFrame. I also tried DataFrame.from_dict():

s2 = pd.DataFrame.from_dict(gender_count, orient='index')

But this creates a DataFrame with the gender categories as the index.

Eventually I want to use the gender categories and their counts for a pie chart.

Lisadk

3 Answers

3

Skip the intermediate Counter step and count the values directly from the list:

gender = ['2',
          'Female',
          'All Female Group',
          'All Male Group',
          'Female',
          'Couple',
          'Mixed Group']

pd.value_counts(gender)

Female              2
2                   1
Couple              1
Mixed Group         1
All Female Group    1
All Male Group      1
dtype: int64
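
To get from these counts to the two-column frame asked for in the question, one option is to name the index and then reset it. The following is only a sketch under a few assumptions: it uses `pd.Series(gender).value_counts()` because the top-level `pd.value_counts` is deprecated in recent pandas releases (2.1 and later), and the column names `gender` and `count` are just the ones requested in the question.

```python
import pandas as pd

gender = ['2', 'Female', 'All Female Group', 'All Male Group',
          'Female', 'Couple', 'Mixed Group']

# Count each value; the result is a Series indexed by category.
counts = pd.Series(gender).value_counts()

# Name the index 'gender', then move it into a regular column next to 'count'.
df = counts.rename_axis('gender').reset_index(name='count')
print(df)
```

The result is an ordinary DataFrame stored in `df`, so it can be passed on to plotting code.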
piRSquared
2
In [21]: df = pd.Series(gender_count).rename_axis('gender').reset_index(name='count')

In [22]: df
Out[22]:
              gender  count
0                  2      1
1  All Female Group,    222
2    All Male Group,    119
3            Couple,    256
4            Female,   1738
5              Male,   2077
6       Mixed Group,    212
7                 NA     16
MaxU - stop genocide of UA
  • I used your code, but this gives me the error message 'str' object is not callable, for 'gender' and 'count'. – Lisadk Mar 29 '17 at 16:53
  • @Lisadk: You're likely using an older version of pandas. See the output of `pd.__version__`. – root Mar 29 '17 at 18:40
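
If upgrading pandas is not an option, the same frame can probably be built without `rename_axis`. This is a sketch that assumes the error comes from an older `rename_axis` that expected a function or mapping rather than a string; that has not been checked against the asker's version. It reuses the steps from the question and assigns the result to a variable. The `dict(...)` wrapper is a defensive choice, not something the original code required.

```python
from collections import Counter

import pandas as pd

# Counts copied from the question.
gender_count = Counter({'2': 1, 'All Female Group,': 222,
                        'All Male Group,': 119, 'Couple,': 256,
                        'Female,': 1738, 'Male,': 2077,
                        'Mixed Group,': 212, 'NA': 16})

# The Series name becomes the name of the value column after reset_index().
s = pd.Series(dict(gender_count), name='count')
s.index.name = 'gender'   # becomes the name of the category column

# reset_index() returns a new DataFrame; assign it to keep it.
df = s.reset_index()
print(df)
```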
0

What about just:

s = pd.DataFrame(gender_count)
ℕʘʘḆḽḘ
  • Since 'gender_count' is a dict, this does not work (ValueError: If using all scalar values, you must pass an index). EDIT: when you put in index=[0], it gives me the categories of gender as the columns. – Lisadk Mar 29 '17 at 16:39
  • This gives me the categories of gender as the columns. I would like 'gender' and 'count' as the columns. – Lisadk Mar 29 '17 at 16:52
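
One way to get 'gender' and 'count' as the columns from the dict is to pass its (key, value) pairs as rows. This is a minimal sketch that uses the counts from the question. The variable name `df` and the column names are only illustrative choices.

```python
from collections import Counter

import pandas as pd

# Counts copied from the question.
gender_count = Counter({'2': 1, 'All Female Group,': 222,
                        'All Male Group,': 119, 'Couple,': 256,
                        'Female,': 1738, 'Male,': 2077,
                        'Mixed Group,': 212, 'NA': 16})

# Each (category, count) pair becomes one row with two columns.
df = pd.DataFrame(list(gender_count.items()), columns=['gender', 'count'])
print(df)
```

For the pie chart mentioned in the question, something like `df.set_index('gender')['count'].plot.pie()` should work, assuming matplotlib is installed.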