I have a dictionary that looks like this:

dic = {'a': {'b': [1,2], 'c': [3,4]}, 'A': {'B': [10,20], 'C': [30, 40]}}

I would like to get a two-dimensional DataFrame with 3 columns that looks like this:

'a' 'b'  1  
'a' 'b'  2  
'a' 'c'  3  
'a' 'c'  4  
'A' 'B'  10  
'A' 'B'  20  
'A' 'C'  30  
'A' 'C'  40  
Junkrat
user25640
  • @Xilpex the question is specifically how to make `pandas` do this (which is why the term `dataframe` is being used and the question is tagged with `pandas`). – Karl Knechtel Apr 20 '20 at 19:47

3 Answers

6

You can try this:

# dic is the dictionary from the question; Series.explode requires pandas >= 0.25
s = pd.DataFrame(dic).stack().explode().reset_index()
  level_0 level_1   0
0       b       a   1
1       b       a   2
2       c       a   3
3       c       a   4
4       B       A  10
5       B       A  20
6       C       A  30
7       C       A  40
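
The output above lists the inner keys first and uses default column names. Here is a minimal sketch, assuming pandas 0.25 or later, that puts the columns in the order the question asks for. The column names `outer`, `inner`, and `value` are illustrative and do not come from the question. The `dropna()` call is a precaution in case `stack()` keeps the missing key combinations, which may depend on the pandas version.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

long_df = (
    pd.DataFrame(dic)                  # columns: outer keys, index: inner keys
    .stack()                           # Series indexed by (inner key, outer key)
    .dropna()                          # precaution: drop missing (inner, outer) pairs
    .explode()                         # one row per list element
    .rename_axis(['inner', 'outer'])   # hypothetical names for the two index levels
    .reset_index(name='value')         # hypothetical name for the value column
    [['outer', 'inner', 'value']]      # reorder to outer key, inner key, value
)

print(long_df)
```

The row order may vary between pandas versions, so sort explicitly (for example with `sort_values`) if the order matters.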
halfer
BENY
  • Thank you, that seems like a very neat solution. Unfortunately explode only works for version 0.25 and up, and we are using 0.24.2. Is there a nice way to do this in older versions? – user25640 Apr 21 '20 at 07:08
  • @user25640 see my self-defined function: https://stackoverflow.com/questions/53218931/how-to-unnest-explode-a-column-in-a-pandas-dataframe/53218939#53218939 – BENY Apr 21 '20 at 13:05
1

Using a list comprehension:

import pandas as pd

dic = {'a': {'b': [1,2], 'c': [3,4]}, 'A': {'B': [10,20], 'C': [30, 40]}}

data = [
    (val_1, val_2, val_3)
    for val_1, nest_dic in dic.items()
    for val_2, nest_list in nest_dic.items()
    for val_3 in nest_list
]
df = pd.DataFrame(data)

print(df)
# Output:
#    0  1   2
# 0  a  b   1
# 1  a  b   2
# 2  a  c   3
# 3  a  c   4
# 4  A  B  10
# 5  A  B  20
# 6  A  C  30
# 7  A  C  40
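
The same comprehension can also label the columns directly. This is a small sketch in which the names `outer`, `inner`, and `value` are illustrative and not from the question. Since it does not use `explode`, it should also run on older pandas releases such as the 0.24.2 mentioned in the comments, though that has not been checked here.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

# One (outer key, inner key, value) tuple per list element
rows = [
    (outer, inner, value)
    for outer, inner_dic in dic.items()
    for inner, values in inner_dic.items()
    for value in values
]

# Hypothetical column names; replace them with whatever suits the data
df = pd.DataFrame(rows, columns=['outer', 'inner', 'value'])

print(df)
```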
Xukrao
1

Like this, maybe (note that this yields one column per list position rather than one row per element):

In [1845]: pd.concat({k: pd.DataFrame(v).T for k, v in dic.items()}, axis=0).reset_index()
Out[1845]: 
  level_0 level_1   0   1
0       a       b   1   2
1       a       c   3   4
2       A       B  10  20
3       A       C  30  40
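
The result above is in wide form, with one column per list position. A possible way to reach the three-column layout from the question is to stack those columns and drop the position level. This is a sketch only: it assumes that all lists under one outer key have the same length (otherwise `pd.DataFrame(v)` raises an error), and the names `outer`, `inner`, and `value` are illustrative.

```python
import pandas as pd

dic = {'a': {'b': [1, 2], 'c': [3, 4]}, 'A': {'B': [10, 20], 'C': [30, 40]}}

# Wide form: MultiIndex (outer key, inner key), one column per list position
wide = pd.concat({k: pd.DataFrame(v).T for k, v in dic.items()}, axis=0)

long_df = (
    wide.stack()                         # index: (outer key, inner key, list position)
    .reset_index(level=-1, drop=True)    # discard the list position
    .rename_axis(['outer', 'inner'])     # hypothetical names for the index levels
    .reset_index(name='value')           # hypothetical name for the value column
)

print(long_df)
```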
Mayank Porwal