
If I have two lists

l1 = [ 'A', 'B' ]

l2 = [ 1, 2 ]

what is the most elegant way to get a pandas data frame which looks like:

+-----+-----+-----+
|     | l1  | l2  |
+-----+-----+-----+
|  0  | A   | 1   |
+-----+-----+-----+
|  1  | A   | 2   |
+-----+-----+-----+
|  2  | B   | 1   |
+-----+-----+-----+
|  3  | B   | 2   |
+-----+-----+-----+

Note: the first column is the index.

user2390182
K.Chen

3 Answers


Use product from itertools:

>>> from itertools import product
>>> import pandas as pd
>>> pd.DataFrame(list(product(l1, l2)), columns=['l1', 'l2'])
  l1  l2
0  A   1
1  A   2
2  B   1
3  B   2
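
The same idea extends to any number of lists. The following is a minimal sketch, assuming the lists are kept in a dict keyed by the desired column names; the `lists` dict and the third list `l3` are hypothetical and not part of the question.

```python
from itertools import product

import pandas as pd

# Hypothetical input: column name -> list of values (l3 is made up for illustration).
lists = {
    "l1": ["A", "B"],
    "l2": [1, 2],
    "l3": [True, False],
}

# product(*values) yields one tuple per combination, in the same order as the
# nested loops "for a in l1: for b in l2: for c in l3".
rows = list(product(*lists.values()))

# The dict keys become the column names; the default RangeIndex is the index.
df = pd.DataFrame(rows, columns=list(lists))
print(df)
```

Because each row is built as a Python tuple, this is convenient for small inputs but may be slow for very large ones.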
behzad.nouri

As an alternative, you can use pandas' cartesian_product (which may be more useful with large NumPy arrays):

In [11]: lp1, lp2 = pd.core.reshape.util.cartesian_product([l1, l2])

In [12]: pd.DataFrame(dict(l1=lp1, l2=lp2))
Out[12]:
  l1  l2
0  A   1
1  A   2
2  B   1
3  B   2

Reading the result into a DataFrame with the correct orientation seems a little messy...

Note: in older pandas versions, cartesian_product was located at pd.tools.util.cartesian_product.

adir abargil
Andy Hayden
  • At the moment there is a pd.MultiIndex.from_product; not sure how useful a DataFrame constructor would be... – Andy Hayden Sep 03 '14 at 04:45
  • As of pandas 0.20.2, `cartesian_product()` is in `pd.core.reshape.util`. This solution is faster than using `itertools.product`, and can be made even faster by initializing the dataframe with `np.array().T` of the non-unpacked data instead. – Ken Wei Jul 05 '17 at 09:18
  • This is an elegant solution and works just as easily for 3+ lists. I just used it quickly to find all combinations of 5 lists. Very nice! – Lenwood Oct 30 '18 at 17:03
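
The `pd.MultiIndex.from_product` route mentioned in the comments can also produce the requested frame using only public pandas API. This is a rough sketch, assuming a pandas version recent enough for `MultiIndex.to_frame` to accept `index=False`:

```python
import pandas as pd

l1 = ["A", "B"]
l2 = [1, 2]

# Build the Cartesian product as a two-level MultiIndex; the names become
# the column names once the index is converted to a frame.
index = pd.MultiIndex.from_product([l1, l2], names=["l1", "l2"])

# index=False discards the MultiIndex and leaves a default 0..n-1 index.
df = index.to_frame(index=False)
print(df)
```

Passing more lists to `from_product` (with matching `names`) handles three or more inputs in the same way.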

You can also use the sklearn library, which uses a NumPy-based approach:

import pandas as pd
from sklearn.utils.extmath import cartesian

df = pd.DataFrame(cartesian((l1, l2)), columns=['l1', 'l2'])

For more verbose but possibly more efficient variants, see "Numpy: cartesian product of x and y array points into single array of 2D points".
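
One NumPy-only variant in that spirit can be sketched with `np.meshgrid`. This is an illustration under the assumption that keeping each column's own dtype matters; it is not taken from the linked question.

```python
import numpy as np
import pandas as pd

l1 = ["A", "B"]
l2 = [1, 2]

# indexing="ij" makes the first list vary slowest, matching the row order
# asked for in the question (A 1, A 2, B 1, B 2).
g1, g2 = np.meshgrid(l1, l2, indexing="ij")

# Each grid keeps the dtype of its own input, so l2 stays numeric instead of
# being coerced to strings as it would be in a single combined 2D array.
df = pd.DataFrame({"l1": g1.ravel(), "l2": g2.ravel()})
print(df)
```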

jpp