3

I have a csv file with a timestamp given in CAT (Central African Time). When I read it in as a pandas dataframe using:

df = pd.read_csv(path, parse_dates=["timestamp"], dayfirst=True)

I get a warning:

C:\Users..\lib\site-packages\dateutil\parser_parser.py:1218: UnknownTimezoneWarning: tzname CAT identified but not understood. Pass tzinfos argument in order to correctly return a timezone-aware datetime. In a future version, this will raise an exception. category=UnknownTimezoneWarning)

which seems to indicate that I need to pass a tzinfos parameter, but as far as I can see it's not listed as an option for read_csv in the pandas documentation. I tried both of:

df = pd.read_csv(path, parse_dates=["timestamp"], dayfirst=True, tzinfos={"CAT": "Etc/GMT+2"})
df = pd.read_csv(path, parse_dates=["timestamp"], dayfirst=True, tzinfos= "Etc/GMT+2")

but I keep getting an error:

TypeError: read_csv() got an unexpected keyword argument 'tzinfos'

At the moment it's just a warning, and the column is still read in as timezone-naive values, to which I can add the timezone info with df.timestamp.dt.tz_localize("Etc/GMT+2"). However, the warning says "In a future version, this will raise an exception", which makes me think my code will break in the future, so I would prefer to fix it now.

I tried googling for a solution, but all the results seem to deal with general datetime conversions rather than reading in a CSV (I couldn't figure out how those results translate).

Example of the data

164_user
  • could you please add the sample data as text instead of image? copy&paste is better than OCR ;-) – FObersteiner Aug 26 '21 at 16:01
  • https://stackoverflow.com/questions/18911241/how-to-read-datetime-with-timezone-in-pandas –  Aug 26 '21 at 16:04

2 Answers

2

tzinfos is an argument of dateutil's parser, which pandas uses internally to parse date/time strings. As far as I know, it cannot be passed to pd.read_csv (or pd.to_datetime) directly.

Instead, you can read the CSV without parsing the dates, import the parser, and apply it with the tzinfos keyword argument. For example:

import pandas as pd
from dateutil import parser, tz

s = pd.Series(["01-Apr-17 12:00:00 AM CAT"])

# use tzfile Africa/Maputo for CAT:
s.apply(parser.parse, tzinfos={"CAT": tz.gettz("Africa/Maputo")})

0   2017-04-01 00:00:00+02:00
dtype: datetime64[ns, tzfile('/usr/share/zoneinfo/Africa/Maputo')]
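
Applied to the CSV workflow from the question, the same idea might look like the sketch below. The sample rows and the "value" column are made up for illustration (the real file's layout is not shown); only the "timestamp" column name and dayfirst=True come from the question. Note that tz database names of the form Etc/GMT±N use the POSIX sign convention, so "Etc/GMT+2" is actually UTC-2; "Africa/Maputo" (or "Etc/GMT-2") gives the UTC+2 offset of CAT.

```python
import io

import pandas as pd
from dateutil import parser, tz

# Hypothetical sample data standing in for the real file.
csv_text = (
    "timestamp,value\n"
    "01-Apr-17 12:00:00 AM CAT,1\n"
    "01-Apr-17 01:00:00 AM CAT,2\n"
)

# Map the abbreviation to a tz database zone that is UTC+2 all year.
tzinfos = {"CAT": tz.gettz("Africa/Maputo")}

# Read the column as plain strings (no parse_dates), then parse it with dateutil.
df = pd.read_csv(io.StringIO(csv_text))
df["timestamp"] = df["timestamp"].apply(
    parser.parse, dayfirst=True, tzinfos=tzinfos
)
```

Because the abbreviation is resolved explicitly, dateutil has no reason to emit the UnknownTimezoneWarning here. For a real file, io.StringIO(csv_text) would be replaced by the file path.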
FObersteiner
1

Read the docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html

There is no tzinfos parameter in pandas 1.3 pd.read_csv().
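
read_csv does, however, accept a converters argument (a dict mapping column names to functions), which is one possible way to hand the parsing over to dateutil while still reading the file in a single call. The following is only a sketch: the sample data and the name parse_cat are invented for illustration, and "Africa/Maputo" is assumed as the zone for CAT.

```python
import io
from functools import partial

import pandas as pd
from dateutil import parser, tz

# Hypothetical sample data standing in for the real file.
csv_text = "timestamp,value\n01-Apr-17 12:00:00 AM CAT,1\n"

# Bind the dateutil-specific keyword arguments up front, so that the
# resulting function takes just the cell's string value.
parse_cat = partial(
    parser.parse,
    dayfirst=True,
    tzinfos={"CAT": tz.gettz("Africa/Maputo")},
)

# converters applies parse_cat to every cell of the "timestamp" column.
df = pd.read_csv(io.StringIO(csv_text), converters={"timestamp": parse_cat})
first = df["timestamp"].iloc[0]
```

The column may come back with object dtype rather than a datetime64 dtype; passing it through pd.to_datetime afterwards is one way to normalise it if needed.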

Peter
  • Yeah, like I mentioned in the question, I saw that it is not listed as a parameter in the docs; I simply tried it based on the error message, not knowing what else to do. From MrFuppes' answer, however, I realise that the error comes from the parser, not from pandas' read_csv itself. – 164_user Aug 26 '21 at 17:36