2

I have a pd.Series of lists, for example:

df = pd.Series([['a', 'b'], ['c', 'd']])

I'd like to convert it to a 2d numpy array.

Calling np.array(df.values) doesn't give the desired result: each list is treated as a single Python object, so the result is a 1D object array rather than a 2D array.
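
To make the problem concrete, here is a minimal sketch that reproduces it. It reuses the name `df` from the question, even though the object is a Series, and only inspects the shape and dtype of the result.

```python
import numpy as np
import pandas as pd

# Same data as in the question; `df` is a Series despite its name.
df = pd.Series([['a', 'b'], ['c', 'd']])

# df.values is a 1D object array whose elements are Python lists,
# and np.array() does not look inside those elements.
arr = np.array(df.values)

print(arr.shape)  # (2,)
print(arr.dtype)  # object
```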

How can I get a 2D array?

IsaacLevon

3 Answers

3

Starting from your attempt, you only need to convert the values to a nested list first:

print (np.array(df.values.tolist()))
[['a' 'b']
 ['c' 'd']]

Or create a DataFrame first:

print (pd.DataFrame(df.values.tolist()).values)
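
For reference, a self-contained sketch of both variants side by side, assuming the same two-row Series as in the question. The names `via_list` and `via_frame` are only illustrative. Both results should have shape (2, 2); the dtype of the DataFrame route may differ from the plain NumPy route, so it is printed rather than assumed.

```python
import numpy as np
import pandas as pd

# Same data as in the question; `df` is a Series despite its name.
df = pd.Series([['a', 'b'], ['c', 'd']])

# Variant 1: turn the object array into a nested list, then let NumPy
# infer a regular 2D array from it.
via_list = np.array(df.values.tolist())

# Variant 2: build a DataFrame from the nested list and take its values.
via_frame = pd.DataFrame(df.values.tolist()).values

print(via_list.shape, via_list.dtype)
print(via_frame.shape, via_frame.dtype)
```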
jezrael
  • For what it's worth, this solution appears to be slightly faster than the one from @yaseco using a small test dataset. `%%timeit` results: `np.array(df.values.tolist())`: 683 µs ± 20 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) `np.stack(df.values)`: 759 µs ± 50.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) – emigre459 Jul 06 '21 at 19:38
1

Just apply pd.Series:

df.apply(pd.Series).values
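
A runnable sketch of this approach, again assuming the Series from the question; the name `arr` is only illustrative. Since `apply(pd.Series)` builds one Series per row, it is probably slower than the other approaches on large inputs, though that is not measured here.

```python
import numpy as np
import pandas as pd

# Same data as in the question; `df` is a Series despite its name.
df = pd.Series([['a', 'b'], ['c', 'd']])

# Each list is expanded into a row, giving a 2x2 DataFrame;
# .values then returns the underlying 2D array.
arr = df.apply(pd.Series).values

print(arr.shape)  # (2, 2)
```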
koPytok
1

Okay, I just found np.stack can do that too.

df = pd.Series([['a', 'b'], ['c', 'd']])
np.stack(df.values).shape

Result:

(2, 2)
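
A slightly fuller sketch of the same idea that also prints the stacked array; the name `stacked` is only illustrative. Note that `np.stack` requires all inputs to have the same shape, so this assumes every list in the Series has the same length.

```python
import numpy as np
import pandas as pd

# Same data as in the question; `df` is a Series despite its name.
df = pd.Series([['a', 'b'], ['c', 'd']])

# np.stack treats each list as one array and joins them along a new
# first axis, so two lists of length 2 become a (2, 2) array.
stacked = np.stack(df.values)

print(stacked)
print(stacked.shape)  # (2, 2)
```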

IsaacLevon