I am using the benedict Python library to parse an .xml file (sample below):

data_source = """
<?xml version="1.0" encoding="utf-8"?>
<RunInfo Version="5">
   <Run Id="210910_A00154_0856_BH2TTNDMXY" Number="856">
        <Date>9/10/2021 3:08:02 PM</Date>
   </Run>
</RunInfo>"""

What I ultimately want is the time as a timestamp, without the date, i.e. only 3:08:02 PM.

Given that

type(data['RunInfo']['Run']['Date']) returns str

I tried pd.to_datetime(data['RunInfo']['Run']['Date'])

but the result still includes the date, as expected.

So I thought about slicing out only the part I want (3:08:02 PM) and converting that to a timestamp with pd.to_datetime(data['RunInfo']['Run']['Date'][-10:], format="%H:%M:%S")

But pd.to_datetime() still outputs a date, now an arbitrary one, which is worse.

Does anyone know how I can parse only the time from the original .xml file?

U12-Forward
BCArg

1 Answer

We can convert the string 9/10/2021 3:08:02 PM to datetime format like so, assuming it is stored in a DataFrame column named timestamp:

>>> df['timestamp'] = pd.to_datetime(df['timestamp'])
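For a single string rather than a DataFrame column, the same call can be given an explicit format. This is a minimal sketch: the variable names `raw` and `ts` are illustrative, and the month-first reading of the date is an assumption based on the run Id starting with 210910.

```python
import pandas as pd

# Value copied from the <Date> element in the question's sample XML.
raw = "9/10/2021 3:08:02 PM"

# Explicit format: month/day/year, then a 12-hour clock with an AM/PM marker.
# Month-first is an assumption based on the run Id "210910" (2021-09-10).
ts = pd.to_datetime(raw, format="%m/%d/%Y %I:%M:%S %p")

# A pandas Timestamp always stores a full date and time,
# which is why the date cannot simply be parsed away.
print(ts)  # 2021-09-10 15:08:02
```

Note that `%I` with `%p` is needed for the 12-hour clock. `%H` expects a 24-hour value and does not consume the trailing PM.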

Then, to extract the time:

>>> df['timestamp'].dt.strftime("%I:%M:%S %p")
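Going from the XML string to a time-only value can also be sketched without a DataFrame. The version below swaps in the standard library (`xml.etree.ElementTree` and `datetime`) instead of benedict and pandas, so it is not the approach used in the question. The names `parsed`, `time_only` and `time_text` are illustrative, and the month-first date format is an assumption.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Sample from the question, with the XML declaration on the very first line.
data_source = """<?xml version="1.0" encoding="utf-8"?>
<RunInfo Version="5">
   <Run Id="210910_A00154_0856_BH2TTNDMXY" Number="856">
        <Date>9/10/2021 3:08:02 PM</Date>
   </Run>
</RunInfo>"""

# Parse from bytes so the declared encoding is honoured by the parser.
root = ET.fromstring(data_source.encode("utf-8"))

# The root element is <RunInfo>, so the path is relative to it.
raw = root.findtext("Run/Date").strip()

# Assumed format: month/day/year, 12-hour clock with AM/PM.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")

# A datetime.time object carries no date at all.
time_only = parsed.time()

# The same time as text, if a string is what is needed.
time_text = parsed.strftime("%I:%M:%S %p")

print(time_only)  # 15:08:02
print(time_text)
```

`time_only` is useful for comparisons and arithmetic with other times, while `time_text` is only for display.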
tlentali