
I create a simple pandas dataframe with some random values and a DatetimeIndex like so:

import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt

# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)

Then I plot it in two different ways:

# plot with pandas own matplotlib wrapper
df.plot()

# plot directly with matplotlib pyplot
plt.plot(df.index, df.RandomValues)

plt.show()

(Do not use both statements at the same time as they plot on the same figure.)

I use 64-bit Python 3.4 and matplotlib 1.4. With pandas 0.14, both statements give me the expected plot. They format the x-axis slightly differently, which is okay. Note that the data is random, so the plots do not look the same.

pandas 0.14: pandas plot

pandas 0.14: matplotlib plot

However, with pandas 0.15, the pandas plot looks fine but the matplotlib plot has a strange tick format on the x-axis:

pandas 0.15: pandas plot

pandas 0.15: matplotlib plot

Is there a good reason for this behaviour, and why did it change from pandas 0.14 to 0.15?

Dirk

2 Answers


Note that this bug was fixed in pandas 0.15.1 (https://github.com/pandas-dev/pandas/pull/8693), and plt.plot(df.index, df.RandomValues) now just works again.


The reason for this change in behaviour is that starting from 0.15, the pandas Index object is no longer a numpy ndarray subclass. The underlying cause, however, is that matplotlib does not support the datetime64 dtype.

As a workaround, if you want to use the matplotlib plot function, you can convert the index to Python datetime objects using to_pydatetime:

plt.plot(df.index.to_pydatetime(), df.RandomValues)
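
To see what the workaround actually hands to matplotlib, here is a minimal sketch that inspects the result of to_pydatetime on a small index. The variable names (date_range, py_dates) are only illustrative, and the output shown assumes a reasonably recent pandas:

```python
import datetime as dt

import pandas as pd

# A short DatetimeIndex, built the same way as in the question
date_range = pd.date_range('1/1/2011', '1/5/2011', freq='D')

# to_pydatetime() returns a NumPy object array of datetime.datetime values
py_dates = date_range.to_pydatetime()

print(py_dates.dtype)     # object
print(type(py_dates[0]))  # <class 'datetime.datetime'>
```

Because the array holds plain datetime.datetime objects rather than datetime64 values, matplotlib's date handling applies to it.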

A more detailed explanation:

Because Index is no longer an ndarray subclass, matplotlib converts the index to a numpy array with datetime64 dtype. Previously it retained the Index object, whose scalars are returned as Timestamp values; Timestamp is a subclass of datetime.datetime, which matplotlib can handle. The plot function calls np.atleast_1d() on its input, which now returns a datetime64 array, and matplotlib handles those values as integers.
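
The conversion described above can be reproduced without matplotlib. The following is a rough sketch, assuming a current pandas and numpy; the variable names are made up for illustration, and the final cast to int64 only mimics the "handled as integers" effect rather than reproducing matplotlib 1.4's internal code path:

```python
import datetime as dt

import numpy as np
import pandas as pd

index = pd.date_range('1/1/2011', '1/5/2011', freq='D')

# Scalars taken from the index are Timestamp objects,
# and Timestamp subclasses datetime.datetime
first = index[0]
is_datetime = isinstance(first, dt.datetime)

# np.atleast_1d() turns the Index into a plain ndarray of datetime64 values
as_array = np.atleast_1d(index)
is_plain_array = isinstance(as_array, np.ndarray) and not isinstance(index, np.ndarray)
dtype_kind = as_array.dtype.kind  # 'M' is numpy's code for datetime64

# Reinterpreted as integers, the dates become large epoch-based counts,
# which is what ends up on the x-axis when no date converter applies
as_ints = as_array.astype('int64')

print(is_datetime, is_plain_array, dtype_kind)
```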

I opened an issue about this (since this pattern is probably widely used): https://github.com/pydata/pandas/issues/8614

joris
  • Thanks for your effort opening an issue for this! The workaround is fine, letting me use pandas 0.15 without having to change too much :) – Dirk Oct 23 '14 at 17:36
  • I'm facing the same issue with matplotlib 2.1.0 and pandas 0.21.0. The workaround with `to_pydatetime` still works. – Tulio Casagrande Dec 06 '17 at 10:09
  • Yes, that will be fixed in the upcoming 0.21.1 release (see http://pandas-docs.github.io/pandas-docs-travis/whatsnew.html#restore-matplotlib-datetime-converter-registration) – joris Dec 08 '17 at 16:17

With matplotlib 1.5.0 this 'just works':

import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt

# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)

fig, ax = plt.subplots()
ax.plot('RandomValues', data=df)

demo image

tacaswell