1

I'm trying to capture heteroskedasticity in the returns of a price time series using a GARCH model.

A basic intuition suggests that I should fit the GARCH model on log-returns. Indeed, if the price is divided by $2$ at some point in time, that gives a simple return of $-0.5$; if it is multiplied by $2$, it gives a simple return of $1$. So two price moves of the same relative amplitude produce returns of different amplitudes, because prices live on an exponential scale. With log-returns, "divided by $2$" gives $\ln(0.5)\approx-0.69$ and "multiplied by $2$" gives $\ln(2)\approx0.69$: we're good, they are the same in absolute value.
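The asymmetry described above can be checked numerically. This is a minimal sketch using only the standard library; the prices (100, 50, 200) are made-up illustration values, and `math.log` is the natural logarithm (with base 10 the two log-returns would be about ±0.301 instead of ±0.693, but equally symmetric).

```python
import math

def simple_return(p0, p1):
    """Simple (percentage) return from price p0 to price p1."""
    return p1 / p0 - 1.0

def log_return(p0, p1):
    """Log-return, i.e. the natural log of the gross return p1 / p0."""
    return math.log(p1 / p0)

# Hypothetical prices: one halving and one doubling from the same start.
halved = (100.0, 50.0)
doubled = (100.0, 200.0)

print(simple_return(*halved), simple_return(*doubled))  # -0.5 1.0
print(log_return(*halved), log_return(*doubled))        # about -0.693 and 0.693
```

The simple returns differ in magnitude (0.5 vs. 1), while the two log-returns sum to zero.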

However, after trying the GARCH on log-returns (i.e., the log of the gross return), it appears that log-returns remove a lot of the heteroskedasticity from the actual returns, leading the GARCH not to distinguish clearly between periods of high activity and periods of low activity.

To sum up, if I use simple returns, the GARCH clearly distinguishes periods of high volatility, but the same price move has a different amplitude depending on whether it goes up or down, which biases the estimation of the variance in some way.

On the other hand, if I use log-returns, I don't have the "bias" of the exponential scale, but the result has less heteroskedasticity, which is not good for my strategy since I scale positions depending on volatility.

What is usually used in practice to forecast volatility? Is it more appropriate, in general, to fit a GARCH on returns or on log-returns to estimate volatility?

Richard Hardy
Jerem Lachkar
  • 1
    Hi: I would look at one of the early papers on arch or garch ( mid to late 80's ) and see what is used and hopefully it's discussed. It kind of makes sense that you would reduce volatility by using log returns because the log transformation is supposed to do that in general not just in return settings. – mark leeds Jun 28 '23 at 14:52
  • Here's one that I remember being pretty good. Hopefully it discusses what you talked about. http://kroner.com/attachments/AcademicPapers/Survey%20(English).pdf – mark leeds Jun 28 '23 at 14:58
  • One issue with not using log returns is that the GARCH assumes a conditional normal distribution (or similar unbounded distribution) for returns. But relative returns below -100% are impossible and using log-returns is then more consistent. But it might not always matter in applications. – fes Jun 28 '23 at 14:58
  • GARCH is not about the type of conditional heteroskedasticity (CH) that can be reduced using a logarithmic transformation. GARCH is about autoregressive CH while the logarithmic transformation can fix the case when variance increases together with the level of the variable. GARCH can be used for both kinds of returns: percentage ones and logarithmic ones. – Richard Hardy Jun 28 '23 at 15:02
  • @markleeds, I searched the document for relevant keywords and did not find anything about log-returns, except for one reference which cannot be generalized. Since log-returns are so popular in financial modelling, I suppose the paper simply does not pay attention to the choice. Otherwise, log-returns would definitely have been mentioned in the discussion. – Richard Hardy Jun 28 '23 at 15:06
  • Hi Richard: Thanks for looking at the paper. I'm not sure what you mean in your comment that starts with GARCH. I would expect the parameters of the ARCH or GARCH model to change (not sure how much?) depending on which definition of returns one uses, but I would expect the predictions to be quite close. It's not clear to me what you mean as far as the log transformation goes. It's known to reduce the variance, and I assumed he meant that $\log(y_t)$ had less variability than $y_t$. But maybe Jerem was referring to $\log(y_t) - \log(y_{t-1})$, so the returns themselves. Jerem: Could you clarify? – mark leeds Jun 29 '23 at 04:37
  • Also, I'm sorry to waste your time as far as the paper not being helpful. If they don't address the issue, then my best guess is that it probably makes very little difference in parameter estimation either. Jerem: Given what Richard found (actually what he didn't find), I'm confident that whether you use log returns or returns isn't going to affect model parameters in any serious manner. But it would be interesting to test that hypothesis empirically. – mark leeds Jun 29 '23 at 04:40
  • @markleeds, regarding testing that hypothesis empirically, if I understand the OP correctly, GARCH results for simple vs. log-returns differ noticeably, thus the question. – Richard Hardy Jun 29 '23 at 05:48
  • Thank you for your comments. @RichardHardy yes, precisely. Maybe the CH found in simple returns has a part that is only due to the exponential scale. This would mean that I should use log-returns to account only for the "real" CH of the returns themselves, removing the additional CH created by the exponential scale? – Jerem Lachkar Jun 29 '23 at 11:36
  • @markleeds Yes, I'm referring to log-returns $\log(y_t)-\log(y_{t-1})$, not log-prices – Jerem Lachkar Jun 29 '23 at 11:36
  • Either type of returns (simple or log) has a sensible subject-matter interpretation, so everything is "real" here. I would use whichever is easier to interpret and to model. After all, if you have a model for simple returns, you can back out a model for prices, then log-prices and then log-returns. It is enough to model one of these, and then models for the other ones are implied. – Richard Hardy Jun 29 '23 at 13:42
  • Jerem: Assuming that the resulting model parameters differ significantly, did you check if predictions change also ? – mark leeds Jun 29 '23 at 16:40

3 Answers

5

What is usually used in practice to forecast volatility?

I believe it is log-returns.

Is it more appropriate, in general, to fit a GARCH on returns or on log-returns to estimate volatility?

The general mathematical specification of the model does not restrict its use to a single type of returns. However, the distribution usually assumed for standardized residuals is defined on the whole real line, so it puts nonzero density on values below -100%, which are impossible for simple returns. Thus, log-returns are more natural and I would start with them. That said, values below -100% are so far down in the left tail that I think they can be ignored for all practical purposes.
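To get a feel for how far down the left tail -100% sits, here is a rough sketch that computes the probability a conditionally normal simple return falls below -100%. The zero mean and the list of conditional standard deviations are illustrative assumptions, not estimates from any data; a fat-tailed distribution such as Student's t would give larger (but typically still tiny) probabilities.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """CDF of a normal distribution, written via the complementary error function."""
    return 0.5 * math.erfc(-(x - mean) / (std * math.sqrt(2.0)))

# Hypothetical conditional standard deviations of a simple return (per period).
for cond_std in (0.01, 0.02, 0.05, 0.20):
    p_impossible = normal_cdf(-1.0, mean=0.0, std=cond_std)
    print(f"std = {cond_std:.2f}: P(return < -100%) = {p_impossible:.3e}")
```

Even with a conditional standard deviation of 20% per period, -100% is five standard deviations away, so the mass assigned to impossible values is negligible under this assumption.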

I would use whichever type of return is easier to interpret and to model (where you get a better fit and especially out-of-sample forecasts). After all, if you have a model for simple returns, you can back out a model for prices, then log-prices and then log-returns, and the other way around. It is enough to model one of these, and then models for the other ones are implied.

it appears that log-returns remove a lot of the heteroskedasticity from the actual returns, leading the GARCH not to distinguish clearly between periods of high activity and periods of low activity.

GARCH is not about the type of conditional heteroskedasticity (CH) that can be reduced using a logarithmic transformation. GARCH is about autoregressive CH while the logarithmic transformation can fix the case when variance increases together with the level of the variable.
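To make "autoregressive CH" concrete, here is a minimal sketch of the GARCH(1,1) variance recursion $\sigma_t^2=\omega+\alpha\varepsilon_{t-1}^2+\beta\sigma_{t-1}^2$. The function name, the parameter values, and the residual series are all made up for illustration (nothing is estimated), and the recursion is started at the unconditional variance, which is one common choice among several.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances implied by a GARCH(1,1) recursion.

    Returns a list of length len(residuals) + 1: the starting value followed
    by the one-step-ahead variance after each residual.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2

# Hypothetical demeaned returns: a calm period, one 5% shock, then calm again.
residuals = [0.0, 0.05, 0.0]
variances = garch11_variance(residuals, omega=1e-6, alpha=0.10, beta=0.85)
for t, s2 in enumerate(variances):
    print(t, f"{s2:.6e}")
```

The variance jumps after the large shock and then decays geometrically: the variance depends on past shocks and past variance, not on the level of the series. The same recursion applies whether `residuals` come from simple or log-returns.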

...which biases the estimation of the variance in some way.

I do not think it biases the estimation of the target quantity, but you may find that the target quantity does not measure what you are interested in measuring. But again, if you have a model for any particular one-to-one transformation of prices, you can back out a model for any other one-to-one transformation, so you can derive all kinds of quantities from a model for one particular transformation such as log-returns or simple returns.
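One way such a back-out could look, as a sketch under a strong assumption: if the one-step-ahead log-return is conditionally normal with mean `mu` and variance `sigma2`, the gross return is lognormal and the moments of the simple return follow in closed form. The function name and the example numbers are hypothetical; with a non-normal conditional distribution these formulas would not apply as written.

```python
import math

def simple_return_moments(mu, sigma2):
    """Mean and variance of the simple return R = exp(r) - 1 when r ~ N(mu, sigma2)."""
    mean = math.exp(mu + 0.5 * sigma2) - 1.0
    variance = math.expm1(sigma2) * math.exp(2.0 * mu + sigma2)
    return mean, variance

# Hypothetical one-step forecast for the log-return: zero mean, 2% volatility.
mean_simple, var_simple = simple_return_moments(mu=0.0, sigma2=0.02 ** 2)
print(mean_simple, math.sqrt(var_simple))

# Path by path, the two return types are linked by a one-to-one map.
log_r = math.log(0.5)        # the price halves
print(math.expm1(log_r))     # back to the simple return, about -0.5
```

For small variances the two volatilities are nearly identical; the gap widens as the conditional variance grows.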

Richard Hardy
3

I would not use anything but log returns in finance. The reasons logs are used so frequently are summarised here. Commonly used textbooks like Ruey Tsay, Analysis of Financial Time Series always define (G)ARCH with log returns. Historical volatility is basically always computed using log returns. I would flip the question around and ask why you would not want to use log returns.

If the results are very different, you need to ask yourself a few questions:

  • How does the forecast perform? Ultimately, a good forecast should fit. If you find that one model fits your data better, stick with the one that works better.
  • Is the model fit properly? Are the residuals weak white noise after you specified the mean equation? You need to remove the sample mean if it is significantly different from zero. If you use the log returns, you're essentially making the assumption that there is no conditional variation in the mean.

Side remark: one weakness is that the standard GARCH model assumes positive and negative shocks have the same effect on volatility. In reality, financial assets react differently to positive and negative shocks. You can use EGARCH (exponential GARCH) to mitigate this.
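As an illustration of that asymmetry, here is a sketch of a single EGARCH(1,1) update, $\ln\sigma_t^2=\omega+\beta\ln\sigma_{t-1}^2+\alpha\,(|z_{t-1}|-\mathbb{E}|z|)+\gamma z_{t-1}$, assuming standard normal innovations so that $\mathbb{E}|z|=\sqrt{2/\pi}$. The function name and parameter values are made up, and software packages parametrize EGARCH in slightly different ways, so treat this as one possible form rather than a reference implementation.

```python
import math

def egarch11_next_log_variance(log_sigma2_prev, z_prev, omega, alpha, gamma, beta):
    """One step of an EGARCH(1,1) recursion for the log conditional variance."""
    expected_abs_z = math.sqrt(2.0 / math.pi)  # E|z| for standard normal z (assumption)
    return (omega
            + beta * log_sigma2_prev
            + alpha * (abs(z_prev) - expected_abs_z)
            + gamma * z_prev)

# Hypothetical parameters; gamma < 0 makes negative shocks raise variance more.
params = dict(omega=-0.2, alpha=0.15, gamma=-0.10, beta=0.97)
log_sigma2_prev = math.log(0.02 ** 2)

after_negative = egarch11_next_log_variance(log_sigma2_prev, z_prev=-2.0, **params)
after_positive = egarch11_next_log_variance(log_sigma2_prev, z_prev=+2.0, **params)
print(math.exp(after_negative) > math.exp(after_positive))  # True
```

A plain GARCH(1,1) only sees the squared shock, so it would return the same variance for both cases.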

Richard Hardy
AKdemy
0

If your strategy is mostly based on volatility, I suggest using a different approach: by using a GARCH model you're assuming that the volatility is unobserved but can be modelled. Realized variance is a non-parametric method that lets you calculate the daily variance from intraday returns (more generally, the variance over a given timeframe from returns on a lower timeframe).

The realized volatility, i.e. the square root of the realized variance, is given by $\displaystyle\text{RV}=\sqrt{\sum_{t=1}^T r_t^2}$, where $r_t$ are the $T$ intraday returns, but this formula is highly biased due to so-called "microstructure noise". There are multiple valid methods in the literature that give an unbiased estimate of the RV, such as the realized kernel (a practical and clear approach to this method can be found here).

The big difference between RV estimation and a GARCH model is that the former is non-parametric and lets you retrieve the volatility directly from your time series, without the potential bias introduced by the parametrization and optimization of the GARCH model. This approach is only a starting point for volatility forecasting, which needs a separate model, e.g. the HAR model of Corsi (2009), the Bayesian method of Liu and Maheu (2009), bagging by Hillebrand and Medeiros (2010), and so on.
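For reference, the plain realized volatility formula is only a few lines of code. This sketch uses log-returns between consecutive intraday prices; the price list is invented for illustration, and no microstructure-noise correction (such as a realized kernel) is applied, so it is the naive estimator only.

```python
import math

def realized_volatility(intraday_prices):
    """Naive realized volatility: sqrt of the sum of squared intraday log-returns."""
    log_returns = [
        math.log(later / earlier)
        for earlier, later in zip(intraday_prices, intraday_prices[1:])
    ]
    realized_variance = sum(r * r for r in log_returns)
    return math.sqrt(realized_variance)

# Hypothetical intraday prices for a single day.
prices = [100.0, 100.4, 99.9, 100.2, 100.8, 100.5]
print(realized_volatility(prices))
```

Repeating this per day gives a daily RV series, which is what models such as HAR would then be fitted to.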

  • 2
    You should mention this requires data on a much higher frequency. As such it is not a direct alternative for GARCH. Also, regarding your first paragraph: the relationship between volatility at different frequencies depends on the data generating process. Without further assumptions on that, one cannot infer daily volatility from 5-minute (or some other short interval) volatility or vice versa. From this perspective, the 5-minute RV being a nonparametric method does not help when we are after daily volatility. – Richard Hardy Jun 30 '23 at 14:22