4

Let's say I have the following series:

0    A
1    B
2    C
dtype: object

0    1
1    2
2    3
3    4
dtype: int64

How can I merge them to create an empty dataframe with every possible combination of values, like this:

   letter  number
0       A       1
1       A       2
2       A       3
3       A       4
4       B       1
5       B       2
6       B       3
7       B       4
8       C       1
9       C       2
10      C       3
11      C       4
bill
  • 101
  • 1
  • 2
  • 4
  • If you have a lot of data and need speed, check out this answer.... https://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays – run-out Feb 18 '19 at 07:59

3 Answers3

8

Assuming the 2 series are s and s1, use itertools.product() which gives a cartesian product of input iterables :

import itertools
df = pd.DataFrame(list(itertools.product(s,s1)),columns=['letter','number'])
print(df)

    letter  number
0       A       1
1       A       2
2       A       3
3       A       4
4       B       1
5       B       2
6       B       3
7       B       4
8       C       1
9       C       2
10      C       3
11      C       4
anky
  • 71,373
  • 8
  • 36
  • 64
2

If you have 2 Series s1 and s2. you can do this:

pd.DataFrame(index=s1,columns=s2).unstack().reset_index()[["s1","s2"]]

It will give you the follow

   s1  s2
0   A   1
1   B   1
2   C   1
3   A   2
4   B   2
5   C   2
6   A   3
7   B   3
8   C   3
9   A   4
10  B   4
11  C   4
Dawei
  • 976
  • 10
  • 18
  • 2
    Just to add that this method requires that the series be named first, and the subsequent slicing ```[["s1", "s2"]]``` be done using the names of the series – kerwei Feb 18 '19 at 07:11
0

As of Pandas 1.2.0, there is a how='cross' option in pandas.merge() that produces the Cartesian product of the columns.

import pandas as pd

letters = pd.DataFrame({'letter': ['A','B','C']})
numbers = pd.DataFrame({'number': [1,2,3,4]})

together = pd.merge(letters, numbers, how = 'cross')
   letter  number
0       A       1
1       A       2
2       A       3
3       A       4
4       B       1
5       B       2
6       B       3
7       B       4
8       C       1
9       C       2
10      C       3
11      C       4

As an additional bonus, this function makes it easy to do so with more than one column.

letters = pd.DataFrame({'letterA': ['A','B','C'],
                        'letterB': ['D','D','E']})
numbers = pd.DataFrame({'number': [1,2,3,4]})

together = pd.merge(letters, numbers, how = 'cross')
   letterA letterB  number
0        A       D       1
1        A       D       2
2        A       D       3
3        A       D       4
4        B       D       1
5        B       D       2
6        B       D       3
7        B       D       4
8        C       E       1
9        C       E       2
10       C       E       3
11       C       E       4
NickCHK
  • 616
  • 5
  • 13