1

I have a dataframe like:

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

   year  week
0  2020    52
1  2020    53
2  2021     1
3  2021     2

and I would like to get for each line the Monday (date) of that period so something like:

   year  week  period
0  2020    52  2020-12-21
1  2020    53  2020-12-28
2  2021     1  2021-01-04
3  2021     2  2021-01-11

What is the correct way to do this in pandas?

Benjamin
  • 2,938
  • 3
  • 21
  • 43

2 Answers2

1

The right way to do this is to use ISO date format pattern:

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%G-W%V-%u')\
    .dt.strftime('%Y-%m-%d')

and not

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%Y-W%W-%w')\
    .dt.strftime('%Y-%m-%d')

Because of ISO week number (cf comments)

Benjamin
  • 2,938
  • 3
  • 21
  • 43
0

You can try with:

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

df['date combined']=df['year'].astype(str)+'-'+df['week'].astype(str)+ '-1'
df['date of Monday']=pd.to_datetime(df['date combined'], format='%Y-%W-%w')

print(df)

result:

   year  week date combined       date
0  2020    52     2020-52-1 2020-12-28
1  2020    53     2020-53-1 2021-01-04
2  2021     1      2021-1-1 2021-01-04
3  2021     2      2021-2-1 2021-01-11
Renaud
  • 2,334
  • 2
  • 9
  • 23