-1

I really need help with this one. My previous post was unclear, and I'm sorry about that. I wish I could delete it, but hopefully this one will be better.

I need to calculate an age in years from each date column (see the ANALYZE and FINAL OUTCOME sections).

ORIGINAL DATA SET

"JOLIE", 09091959,02051983
"PORTMAN",02111979,01272002
"MOORE", 01281975,01182009
"BEST", 04081973,07022008
"MONROE", 04161957,11231979

LOAD DATA

from pandas import DataFrame, read_csv
import matplotlib.pyplot as plt
import pandas as pd

columns = ['lname','dob','scd_csr_mdy']

raw_data = pd.read_csv(r'C:\Users\davidlopez\Desktop\Folders\Standard Reports\HR Reports\eeprofil  \eeprofil.txt',
                       names=columns, parse_dates=['dob','scd_csr_mdy'])

df1 = raw_data

In [1]: df1
Out [1]:

         lname          dob          scd_csr_mdy
    0    JOLIE          09091959     02051983
    1    PORTMAN        02111979     01272002
    2    MOORE          01281975     01182009
    3    BEST           04081973     07022008
    4    MONROE         04161957     11231979

ANALYZE

I tried doing the following but received an error:

from datetime import datetime

now = datetime.now()
df1['age'] = now - df1['dob']

But I received the error:

TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'str'

FINAL OUTCOME

     lname          dob          scd_csr_mdy    DOB_AGE     SCD_AGE
0    JOLIE          09091959     02051983       55          32
1    PORTMAN        02111979     01272002       36          13
2    MOORE          01281975     01182009       40          6
3    BEST           04081973     07022008       42          6
4    MONROE         04161957     11231979       58          35

Any suggestions?

hitzg
Dave

3 Answers

3

Convert the dob column from strings to datetime objects:

df1['dob'] = pd.to_datetime(df1['dob'])
now = datetime.now()    
df1['age'] = now - df1['dob']
user308827
  • This gave me the same error TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'str' – Dave Nov 12 '14 at 22:43
  • can you check pandas version using this: pd.__version__ – user308827 Nov 12 '14 at 23:49
  • pd.__version__ 0.13.1 – Dave Nov 13 '14 at 00:35
  • Hmm, I'm not sure what is happening. The code works on my end, so it may have something to do with the CSV. When I read in your dataframe using pd.read_clipboard() (copying the dataframe first, then running the command), the datetime conversion works for me – user308827 Nov 13 '14 at 02:17
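
One possible explanation for the comments above, offered as a guess rather than a diagnosis: compact MMDDYYYY strings such as 09091959 may not be recognized as dates without an explicit format, so the column stays as strings. A minimal sketch, assuming the file looks like the sample in the question (an inline io.StringIO copy stands in for eeprofil.txt here), reads both date columns as text and parses them with format='%m%d%Y':

```python
import io
import pandas as pd

# Inline copy of the sample rows from the question (stands in for eeprofil.txt).
raw = io.StringIO(
    '"JOLIE", 09091959,02051983\n'
    '"PORTMAN",02111979,01272002\n'
    '"MOORE", 01281975,01182009\n'
    '"BEST", 04081973,07022008\n'
    '"MONROE", 04161957,11231979\n'
)
columns = ['lname', 'dob', 'scd_csr_mdy']

# dtype=str keeps the leading zeros; skipinitialspace drops the blank after a comma.
df1 = pd.read_csv(raw, names=columns, dtype=str, skipinitialspace=True)

# Parse each date column with an explicit month-day-year format.
for col in ['dob', 'scd_csr_mdy']:
    df1[col] = pd.to_datetime(df1[col].str.strip(), format='%m%d%Y')

print(df1.dtypes)
```

Once both columns are datetime columns, subtracting them from a datetime no longer hits the str operand.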
1

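Convert the strings to datetimes with an explicit format. Note that datetime.strptime only accepts a single string, so for a whole column use pd.to_datetime with the same format:

df1['age'] = now - pd.to_datetime(df1['dob'], format="%m%d%Y")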
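
Subtracting a datetime column from now yields timedeltas (a number of days), not the whole-year ages shown in the FINAL OUTCOME table. Here is a rough sketch of one way to get completed years. The helper name whole_years is hypothetical, and the reference date 2015-06-01 is an assumption chosen only because it reproduces the question's table; in practice you would likely pass pd.Timestamp.now().

```python
import pandas as pd

df1 = pd.DataFrame({
    'lname': ['JOLIE', 'PORTMAN', 'MOORE', 'BEST', 'MONROE'],
    'dob': ['09091959', '02111979', '01281975', '04081973', '04161957'],
    'scd_csr_mdy': ['02051983', '01272002', '01182009', '07022008', '11231979'],
})

def whole_years(dates, ref):
    """Completed years between each MMDDYYYY string and ref (hypothetical helper)."""
    parsed = pd.to_datetime(dates, format='%m%d%Y')
    years = ref.year - parsed.dt.year
    # Subtract one where the anniversary has not yet occurred in ref's year.
    not_yet = (parsed.dt.month > ref.month) | (
        (parsed.dt.month == ref.month) & (parsed.dt.day > ref.day)
    )
    return years - not_yet.astype(int)

# ASSUMPTION: fixed reference date picked to match the question's expected table.
ref = pd.Timestamp('2015-06-01')
df1['DOB_AGE'] = whole_years(df1['dob'], ref)
df1['SCD_AGE'] = whole_years(df1['scd_csr_mdy'], ref)
print(df1)
```

Comparing month and day directly avoids dividing a day count by 365 or 365.25, which can be off by one around birthdays.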
1

If there aren't too many entries, you can do something like this, which moves the year to the front so each string reads YYYYMMDD:

df['dob'] = df.dob.apply(lambda d: pd.to_datetime(d[-4:] + d[:4]))
now - df.dob
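
To show what the slicing does, here is a small sketch on two of the sample values. It uses the vectorized .str accessor in place of apply, which is a substitution on my part and not what the answer above wrote, and it assumes the column holds 8-character strings with leading zeros intact.

```python
import pandas as pd

dob = pd.Series(['09091959', '02111979'])

# Last four characters are the year, first four are month and day.
reordered = dob.str[-4:] + dob.str[:4]

# The reordered strings are in YYYYMMDD order, so the format is unambiguous.
parsed = pd.to_datetime(reordered, format='%Y%m%d')

print(reordered.tolist())
print(parsed)
```

If the column was read as integers, the leading zero is lost and the slices will be wrong, so read it as text first.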
acushner