
I have a piece of data that looks like this:

    my_data[:5]

returns:

    [{'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
     {'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
     {'key': ['Aaliyah', '2', '2018'], 'values': ['21']},
     {'key': ['Aaliyah', '2', '2019'], 'values': ['26']},
     {'key': ['Aaliyah', '2', '2020'], 'values': ['15']}]

The key holds the name, gender, and year; the value is the number. I can't manage to build a DataFrame with the columns name, gender, year, and number.

Can you help me?

I'mahdi
Greger
  • Does this answer your question? [JSON to pandas DataFrame](https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe) – gst Oct 18 '21 at 13:04

2 Answers


Here is one way, using a generator:

    from itertools import chain
    import pandas as pd

    # Flatten each record's two lists and pair the items with the column names
    pd.DataFrame.from_records((dict(zip(['name', 'gender', 'year', 'number'],
                                        chain(*e.values())))
                               for e in my_data))

Without itertools:

    # Requires Python 3.8+ for the := (walrus) operator
    pd.DataFrame(((E:=list(e.values()))[0]+E[1] for e in my_data),
                 columns=['name', 'gender', 'year', 'number'])

Output:

          name gender  year number
    0  Aaliyah      2  2016     10
    1  Aaliyah      2  2017     26
    2  Aaliyah      2  2018     21
    3  Aaliyah      2  2019     26
    4  Aaliyah      2  2020     15
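
For anyone who wants to run this directly, here is a minimal self-contained sketch of the same idea. It rebuilds the five sample records from the question and reads `key` and `values` by name rather than relying on dict ordering. The final `astype` step is an addition that is not part of the question: it assumes `year` and `number` should be integers, since the raw values are strings.

```python
import pandas as pd

# Sample records copied from the question.
my_data = [
    {'key': ['Aaliyah', '2', '2016'], 'values': ['10']},
    {'key': ['Aaliyah', '2', '2017'], 'values': ['26']},
    {'key': ['Aaliyah', '2', '2018'], 'values': ['21']},
    {'key': ['Aaliyah', '2', '2019'], 'values': ['26']},
    {'key': ['Aaliyah', '2', '2020'], 'values': ['15']},
]

columns = ['name', 'gender', 'year', 'number']

# Concatenate the three key fields with the single value of each record.
rows = (rec['key'] + rec['values'] for rec in my_data)
df = pd.DataFrame(rows, columns=columns)

# Optional (assumption): convert the numeric-looking string columns to integers.
df = df.astype({'year': int, 'number': int})
```

Accessing the fields by name keeps the column order correct even if a record were built with `values` before `key`.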
mozway

You can iterate over the list of dicts, take each dict's values, flatten them with `chain.from_iterable` into one row per record, and convert the rows to a DataFrame like this:

    >>> from itertools import chain
    >>> import pandas as pd

    >>> table = [chain.from_iterable(m.values()) for m in my_data]

    >>> pd.DataFrame(table, columns=['name', 'gender', 'year', 'number'])
          name gender  year number
    0  Aaliyah      2  2016     10
    1  Aaliyah      2  2017     26
    2  Aaliyah      2  2018     21
    3  Aaliyah      2  2019     26
    4  Aaliyah      2  2020     15



    # For more explanation: the flattened rows, materialized as lists
    >>> [list(chain.from_iterable(m.values())) for m in my_data]
    [['Aaliyah', '2', '2016', '10'],
     ['Aaliyah', '2', '2017', '26'],
     ['Aaliyah', '2', '2018', '21'],
     ['Aaliyah', '2', '2019', '26'],
     ['Aaliyah', '2', '2020', '15']]
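
To make the flattening step concrete, here is a small sketch on a single record that uses only the standard library. The names `record`, `flat`, and `row` are hypothetical and chosen for illustration. It relies on dicts preserving insertion order, which Python guarantees from 3.7 onward.

```python
from itertools import chain

# One record shaped like the elements of my_data in the question.
record = {'key': ['Aaliyah', '2', '2016'], 'values': ['10']}

# record.values() yields the two lists in insertion order;
# chain.from_iterable walks through them as one flat sequence.
flat = list(chain.from_iterable(record.values()))
print(flat)  # ['Aaliyah', '2', '2016', '10']

# Pair each flattened item with its column name.
row = dict(zip(['name', 'gender', 'year', 'number'], flat))

# A chain object is a one-shot iterator: once consumed, it is empty.
it = chain.from_iterable(record.values())
first_pass = list(it)
second_pass = list(it)  # nothing left to yield
```

Because a chain object can be consumed only once, materialize it with `list(...)` if you need to inspect the rows and also build the DataFrame from them.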
I'mahdi
  • If you're using generators, it's best not to assign an intermediate list; you lose the benefit of lazy generation ;) – mozway Oct 18 '21 at 13:11
  • @mozway thanks, man. I wrote the list version first because I needed it for the explanation – I'mahdi Oct 18 '21 at 13:14