A long time ago I had two years of graduate econometrics focused on time series, plus more on micro cross-section techniques. I haven’t made much use of the time-series material for a long time, but lately I have been doing so again, working with daily or higher-frequency data for the first time. I have been pleased at how much of what I once learned I still retain. But there are a whole bunch of issues I never learned anything about that you really need to understand to use daily data. I have been spending a great deal of time on ad hoc fixes and unguided tours of the literature, without feeling confident that I am using the best (or even a good) approach. And even if I get the right answer, I am pretty sure I am not well positioned to defend my ad hoc decisions to a skeptic.

Here are some examples of things you need to deal with when you look at prices at a daily frequency or higher, and that I pretty much never had to deal with in the past when working with government-generated economic statistics.

  1. Weekends and holidays;
  2. Overnight market activity that seems to be qualitatively different than business-hour activity (e.g. completely different skew and kurtosis) that nonetheless pertains to the same assets;
  3. Truly missing data – holes in the series;
  4. Real outliers (I don’t think I ever treated any number out of the NIPAs, or from the BEA or the Bureau of Labor Statistics, as an outlier. Didn’t even consider it.);
  5. Weeks, months, and years (and days of each of those), all of which may have significant effects, and which don’t align or repeat together except over very long time periods;
  6. Months and years (leap years) of different lengths (I probably should have recognized different-length months as a problem even back in the nineties, but I don’t think I ever did);
  7. Data that is only partially ordered in time, like a daily high and low price, where you know both came after the open and before the close of the market (or of some shorter period in which the data is recorded) without knowing which came first; etc.
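To make the difference between items 1 and 3 concrete, here is a minimal Python sketch (standard library only) of the kind of ad hoc step I mean: separating days the market was scheduled to be closed from days where an observation is truly missing. The function name `classify_gaps`, the toy prices, and the one-day holiday set are all made up for illustration, and I am not claiming this is the right way to do it.

```python
from datetime import date, timedelta


def classify_gaps(observed, start, end, holidays=frozenset()):
    """Split calendar days with no observation into scheduled closures and true holes.

    observed: mapping from date to price (hypothetical toy data).
    holidays: set of dates assumed to be market holidays.
    """
    closures, holes = [], []
    day = start
    while day <= end:
        if day not in observed:
            # Saturday is weekday 5, Sunday is weekday 6.
            if day.weekday() >= 5 or day in holidays:
                closures.append(day)
            else:
                holes.append(day)
        day += timedelta(days=1)
    return closures, holes


# Toy daily closing prices; Thursday 2024-01-04 is deliberately left out.
prices = {
    date(2024, 1, 2): 100.0,
    date(2024, 1, 3): 101.5,
    date(2024, 1, 5): 99.8,
    date(2024, 1, 8): 100.2,
}
holidays = {date(2024, 1, 1)}  # assumed holiday calendar, not a real one

closures, holes = classify_gaps(prices, date(2024, 1, 1), date(2024, 1, 8), holidays)
print("scheduled closures:", closures)
print("true holes:", holes)
```

This only labels the gaps. What to do next (drop the days, interpolate, model the closed periods explicitly) is exactly the part I don’t know how to justify.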

And then there are a whole bunch of additional issues that come up when you deal with really high-frequency data that approaches continuous time. I am not even thinking about those issues yet.

Plainly, these are applied questions. The community of people who work with daily or hourly data must have developed a sense of the different approaches one might take toward each of these issues, and of which work well or badly in practice. Which issues require careful, sound theory that has been developed and used in these communities? Which have kludges or quick fixes that everybody knows work even though they can’t really be justified? Which are really open problems?

So here is my question: Is there anything resembling a good book on these issues? Or an online class? Or a set of tutorials? Or any other resource that people have found helpful in learning how to deal with this stuff? The closest I have come so far is Rob Hyndman’s undergraduate text Forecasting: Principles and Practice, but it still leaves a lot of these questions unanswered, or provides an answer that I’m not sure is best, or is more tied than I want to be to a single software implementation. (That said, it’s been a big help.)

andrewH