I have an array of strings, for example

import numpy as np
foo = np.array( [b'2014-04-05', b'2014-04-06', b'2014-04-07'] )

To check the datatype of the array, I print it with

print( foo.dtype )

which results in |S10. Obviously, it consists of strings of length 10. I want to convert it into numpy's datetime64 type.

More precisely, I want to change the datatype of the array without writing a for-loop that copies it element-wise into a new array (the real array is actually very large). Naive as I am, I thought the following might work

[ np.datetime64(x) for x in foo ]

Spoiler: it does not. Printing the datatype of the array results in the same output as before (i.e. |S10).

Is there any memory efficient way to convert the datatype of the existing array without the necessity of copying everything to a new array?

Alf

1 Answer

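Use .astype, which converts the whole array in one vectorized call, so no Python-level loop is needed. Note that copy=False only skips the copy when the array already has the requested dtype; going from |S10 to datetime64 still allocates a new array, because the two types store different bytes per element: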

foo = np.array( [b'2014-04-05', b'2014-04-06', b'2014-04-07'] )

foo = foo.astype('datetime64', copy=False)

>>> foo
array(['2014-04-05', '2014-04-06', '2014-04-07'], dtype='datetime64[D]')
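
As a rough check of what copy=False does here, the following sketch compares the buffers before and after the conversion. It assumes a reasonably recent NumPy; the names dates and same are only illustrative, and the unit is spelled out as datetime64[D] rather than left for NumPy to infer.

```python
import numpy as np

foo = np.array([b'2014-04-05', b'2014-04-06', b'2014-04-07'])

# Vectorized conversion: no Python-level loop, but a new buffer is allocated,
# since |S10 stores 10 bytes per item and datetime64 stores 8.
dates = foo.astype('datetime64[D]', copy=False)

print(dates.dtype)                   # dtype of the converted array
print(np.shares_memory(foo, dates))  # whether the result reuses foo's buffer

# copy=False only hands back the input itself when the dtype already matches.
same = dates.astype('datetime64[D]', copy=False)
print(same is dates)
```

If peak memory is the real concern, rebinding the name (foo = foo.astype(...)) lets the original byte-string buffer be freed once nothing else references it, so both arrays only coexist during the conversion itself.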
sacuL