
I have a dataframe with two columns, 'Date' and 'Score', that associates each date with multiple scores. The 'Date' entries are timestamps, and the date range is from 02-02 to 04-30.

 dem_data = {Timestamp('2020-02-02 22:27:00+0000', tz='UTC'): [0.5423],
             Timestamp('2020-02-02 18:52:09+0000', tz='UTC'): [-0.1027],
             Timestamp('2020-02-02 21:26:46+0000', tz='UTC'): [0.4939],
             Timestamp('2020-02-03 18:35:43+0000', tz='UTC'): [0.8074],
             Timestamp('2020-02-03 22:45:00+0000', tz='UTC'): [-0.7845],
             Timestamp('2020-02-03 18:39:47+0000', tz='UTC'): [0.9081],
             Timestamp('2020-02-04 05:43:06+0000', tz='UTC'): [0.8402],
             Timestamp('2020-02-04 19:31:46+0000', tz='UTC'): [0.8316],
             ...}

I converted the Timestamp values to shortened string versions and made these values the indices for the dataframe.

Here's the code I wrote to plot the data:

fig_dims = (9, 6)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.lineplot(x=dem_data.index, y='Score', data=dem_data, ax = ax)
ax.set_facecolor('white')
freq = int(10)
ax.set_xticklabels(concatenated.iloc[::freq].Date)
xtix = ax.get_xticks()
ax.set_xticks(xtix[::freq])
fig.autofmt_xdate()
plt.tight_layout()
plt.show()

And here's the resulting image.

A few things are strange about this.

  1. The x-axis is labeled 'fake-date', the name of my index column, which holds shortened string versions of the timestamps in the 'Date' column. However, the tick labels show the full timestamps, which I did not want.
  2. The x-axis only displays dates between 02-02 and 02-12.

How can I get the axis to display all dates (and in a way that is legible)?

Zephyr
jsd191
  • What is the variable `concatenated`? Why not use `ax.set_xticklabels(dem_data.iloc[::freq].index)` ? – Derek O Jul 20 '20 at 04:27

1 Answer

I recreated a portion of your DataFrame and kept only every other x-axis tick by setting freq to 2. I formatted the dates so that the time is not displayed (you can modify the format string to show whichever part of the date/time you want to keep), and I rotated the x-axis labels by 45 degrees. An angle of 90 degrees saves more room but may be harder to read.

I can update my answer once I know where the variable `concatenated` comes from. For now I'll assume that `concatenated.iloc[::freq].Date` behaves like `dem_data.iloc[::freq].index`, although the two must differ in some way if using `concatenated.iloc[::freq].Date` results in only the earliest dates of `dem_data` being labeled.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns

dem_data = pd.DataFrame({'Score':[0.5423,-0.1027,0.4939,0.8074,-0.7845,0.9081,0.8402,0.8316]})
dem_data.index = [pd.Timestamp('2020-02-02 22:27:00+0000'),
    pd.Timestamp('2020-02-02 18:52:09+0000'),
    pd.Timestamp('2020-02-02 21:26:46+0000'),
    pd.Timestamp('2020-02-03 18:35:43+0000'),
    pd.Timestamp('2020-02-03 22:45:00+0000'),
    pd.Timestamp('2020-02-03 18:39:47+0000'),
    pd.Timestamp('2020-02-04 05:43:06+0000'),
    pd.Timestamp('2020-02-04 19:31:46+0000')]

fig_dims = (9, 6)
fig, ax = plt.subplots(figsize=fig_dims)
ax = sns.lineplot(x=dem_data.index, y='Score', data=dem_data, ax = ax)
ax.set_facecolor('white')
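freq = 2  # keep every second tick
# Thin out the automatically chosen ticks. No set_xticklabels call is needed:
# the DateFormatter below generates the label text from each tick position.
xtix = ax.get_xticks()
ax.set_xticks(xtix[::freq])

# Show only the date part (year-month-day) and rotate the labels 45 degrees.
format_ymd = mdates.DateFormatter('%Y-%m-%d')
ax.xaxis.set_major_formatter(format_ymd)
plt.xticks(rotation=45)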

plt.tight_layout()
plt.show()
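
For the full range in the question (02-02 to 04-30), one alternative to slicing ticks by hand is to let a date locator choose them. The following is only a sketch under assumptions: the scores are randomly generated stand-ins, the name `daily` and the choice to average the scores per day are mine rather than anything from the question, and it plots with plain matplotlib instead of seaborn to keep the example small. It relies on the index holding real timestamps rather than strings.

```python
import random

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data: three scores per day from 2020-02-02 to 2020-04-30.
random.seed(0)
days = pd.date_range("2020-02-02", "2020-04-30", freq="D", tz="UTC")
stamps = [day + pd.Timedelta(hours=h) for day in days for h in (5, 18, 22)]
dem_data = pd.DataFrame(
    {"Score": [random.uniform(-1, 1) for _ in stamps]},
    index=pd.DatetimeIndex(stamps, name="Date"),
)

# Assumption: one averaged point per calendar day is what should be drawn.
daily = dem_data["Score"].resample("D").mean()

fig, ax = plt.subplots(figsize=(9, 6))
ax.plot(daily.index, daily.values)
ax.set_ylabel("Score")

# Place one tick every 7 days and label each tick as month-day only.
ax.xaxis.set_major_locator(mdates.DayLocator(interval=7))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m-%d"))

# Rotate the date labels so they do not overlap.
fig.autofmt_xdate(rotation=45)
fig.tight_layout()
# plt.show()  # uncomment in an interactive session
```

Because the locator and formatter work from the actual date values, the labels stay aligned with the data across the whole range, and changing `interval` is enough to make the axis denser or sparser.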

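
As for why the labels in the question might stop at 02-12, here is one possible explanation, sketched with made-up data. It assumes that `concatenated` has one row per score (several rows per date) and that the string index puts one tick on the axis per unique date string. Neither assumption is confirmed by the question. Under those assumptions, slicing rows and slicing ticks with the same `freq` select different dates, and since tick labels are assigned by position, the labels fall behind the ticks.

```python
import pandas as pd

# Hypothetical stand-in for `concatenated`: three rows per date, 30 dates.
dates = pd.date_range("2020-02-02", periods=30, freq="D")
concatenated = pd.DataFrame({"Date": dates.repeat(3)})
concatenated["fake-date"] = concatenated["Date"].dt.strftime("%m-%d")

freq = 10

# With a string index, the axis has one tick position per unique string.
tick_values = concatenated["fake-date"].unique()
kept_ticks = list(tick_values[::freq])
print(kept_ticks)      # ['02-02', '02-12', '02-22']

# Slicing ROWS with the same step advances through the dates far more slowly.
row_labels = concatenated.iloc[::freq]["Date"].dt.strftime("%m-%d").tolist()
print(row_labels[:3])  # ['02-02', '02-05', '02-08']

# Labels are matched to ticks by position, so the tick at '02-22' would be
# labelled '02-08'. Taking the labels from the same unique values keeps
# ticks and labels consistent.
matching_labels = list(tick_values[::freq])
```

If this is what is happening, building the labels from the unique index values (or keeping real timestamps on the axis and using a date formatter) would avoid the mismatch.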

Derek O