3

I have a panel of daily stock prices $\{Y_{it}\}$, $t \in \{1,...,1000\}$ and some event that occurs at $t=700$ which causes the average stock price to decrease by about 10% over the next 15 days.

How do I answer the question "which stocks fell first"? I want to know which characteristics of stocks tended to be correlated with very early falls (say, within the first 1 or 2 days). I have many ideas for sub-sample analysis, but none involving econometrics, which is what I want to use.

Broad econometric-based ideas welcome.

Can survival analysis be used here?

Note: I did make an earlier post on this issue that was answered, but I constrained the question way too much and didn't get what I wanted.

EDIT: Please don't answer the question completely (down to the model specification). I just want ideas for areas that might be worth looking into and a very general approach. I want to develop it all myself from there.

gui11aume
  • 14,703
  • 1
    I think you first need to formulate some hypotheses about the qualities that would lead to a fast fall. Then you could use survival analysis to look at those qualities. But if you try survival analysis as is, you seem to have 1000 variables and 1000 observations. That obviously won't work. – Peter Flom Sep 07 '12 at 14:47
  • @PeterFlom That's fine, as through sub-sample analysis I've identified 2 non time-dependent characteristics, $p_i$ and $q_i$, that are strongly related to immediate falls just after t=700. –  Sep 07 '12 at 23:09
  • I probably don't understand your question. There is no uncertainty: Which stocks fell first? Just look at the panel. – Zen Sep 10 '12 at 00:08
  • 1
    @Zen I can't exactly say in my thesis that "I looked at the excel file and saw that a,b,c,d,e,f seemed to fall aggressively and early". –  Sep 10 '12 at 04:06
  • 2
    Nice comment! Very sincere. I was wondering if you could restate the problem like this. You have "bear events" at days $t_1,t_2,\dots,t_n$, and stock prices at the end of those days, and you want to say which stocks are more likely to fall early when those kinds of events happen again in the future. So, it is a problem of prediction. – Zen Sep 10 '12 at 04:21
  • @Zen Thanks for the tip. Do you have a branch of econometrics that you think I should be looking to in particular? –  Sep 10 '12 at 04:28
  • Hi, Shaniqueia. It is not my area of research but I urge you to search through the literature because it is highly probable that this problem has been studied before. – Zen Sep 10 '12 at 18:06

3 Answers

4

I think the question you want to ask is not "which stocks fell first" (as Zen said, that is clear from the data) but rather which characteristics are common to the stocks that fell "first." You must define what that means, i.e. what the cutoff point is for stocks that fell first. You mention one or two days after t=700 in your question; as long as you have a justification for that timeline, that is fine. Just define the period you want to analyse. Robustness checks could be performed by examining different period lengths.
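To make the cutoff idea concrete, here is a minimal sketch of how the "fell early" flag could be built and re-checked over several window lengths. The function name, the toy prices, and the -5% threshold are all made up for illustration; the actual cutoff should come from your own justification.

```python
import numpy as np

def early_fall_indicator(prices, event_t, window, threshold):
    """Flag stocks whose cumulative return over `window` days after the
    event is at or below `threshold` (e.g. -0.05).

    prices  : array of shape (n_stocks, n_days)
    event_t : 0-based column index of the last pre-event price
    """
    prices = np.asarray(prices, dtype=float)
    base = prices[:, event_t]
    after = prices[:, event_t + window]
    cum_ret = after / base - 1.0
    return (cum_ret <= threshold).astype(int)

# Toy panel (hypothetical): 3 stocks, 6 days, event at column index 2.
toy = np.array([
    [100, 100, 100, 93, 91, 90],    # falls immediately
    [100, 100, 100, 99, 96, 90],    # falls later
    [100, 100, 100, 100, 100, 99],  # barely moves
])

# Robustness check: repeat the classification for several window lengths.
flags_by_window = {w: early_fall_indicator(toy, 2, w, -0.05) for w in (1, 2, 3)}
```

If the set of flagged stocks changes a lot between windows, that is a sign the definition of "early" is doing most of the work.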

I think that if you do that, you can create an indicator variable (1/0) for whether or not the stock "fell early", perhaps assigning it with the SVM classifier mentioned in another answer. I would recommend using a nonparametric technique to compare the stocks. If you were interested in a causal interpretation of whether some variable caused prices to drop first, you could use a matching estimator (if your data fulfils the relevant criteria) that compares the 1 group with the 0 group. Alternatively, you could analyse the data parametrically, again using the indicator variable, with a probit or logit model, or, if you are looking for something causal, an IV probit estimator. There are many more options. The point is that I think you should focus on modelling the "early fall", and use a theory-based choice for what is considered early.
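As a rough sketch of the parametric route, the snippet below fits a logit of the early-fall indicator on two stock characteristics. The names `p_i` and `q_i` follow the asker's comment; the simulated data, the "true" coefficients, and the hand-rolled Newton-Raphson fitter `fit_logit` are assumptions for illustration only. In practice a library such as statsmodels would also give you standard errors.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Fit a logit model by Newton-Raphson.
    Returns the coefficient vector with the intercept first."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
        grad = X.T @ (y - p)                         # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated stand-in data (hypothetical): 500 stocks, two characteristics.
rng = np.random.default_rng(0)
n = 500
p_i = rng.normal(size=n)
q_i = rng.normal(size=n)
true_index = -0.5 + 1.5 * p_i - 1.0 * q_i
fell_early = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_index))).astype(int)

beta_hat = fit_logit(np.column_stack([p_i, q_i]), fell_early)
```

The signs of `beta_hat[1]` and `beta_hat[2]` tell you whether each characteristic raises or lowers the odds of an early fall; nothing here is causal.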

This is based on a slightly different interpretation of what you described in the question, but I hope it is constructive for you, since I think as it is written you do not have a hypothesis that requires any regression or other inferential statistics.

kirk
  • 431
3

I'm not completely clear on the question. It sounds like you may have a supervised learning problem - you have a training set, and want to use that to predict which other stocks are likely to behave in a similar way. Alternatively, perhaps you want to know what your dependent variable really is? What does it mean to "fall early"? Which technique you use will largely be based on whether you're looking for a model which is most interpretable, or a model which is most predictive.

Given that it sounds like you know which stocks fell early (i.e. you know which category they belong to), you could technically just use an SVM for classification. This would allow you to find a maximally separating hyperplane in the feature space, and would probably be the most effective way to predict whether other stocks will behave the same way. LDA, logistic regression, neural networks, and random forests can all create strong classifiers, but interpretability of the latter methods is low. If this is what you're looking for, then you can easily create a predictor, as you have known groups.

However, I think the main problem you're going to experience is that you haven't really specified your dependent variable well. You've mentioned that you know which group a stock falls into, but it sounds like you just want to use statistics to justify that decision. Generally, that seems like bad practice. You can't use statistics to justify the assignment of a given subject to a group; that's a matter for logic.

Survival analysis would let you analyse the survival curve, but it would require you to set a price level below which the stock is considered "dead" for the purposes of the analysis. That level will still be somewhat arbitrary. Because of that, perhaps the real dependent variable you want to analyse is actually some aspect of the price movement following the event, such as the volatility, the derivative between t1 and t2, etc. Once you decide what you really want to predict, you'll have an easier time choosing the correct technique.
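For what it's worth, here is a small sketch of what the survival set-up might look like: a stock "dies" the first day its cumulative post-event return reaches a chosen threshold, and stocks that never reach it within the horizon are censored. The function names, the -5% threshold, and the toy prices are my own assumptions.

```python
import numpy as np

def time_to_fall(prices, event_t, threshold, horizon):
    """Days after the event until the cumulative return first reaches
    `threshold`; censored at `horizon` if it never does.
    Returns (durations, observed)."""
    prices = np.asarray(prices, dtype=float)
    base = prices[:, [event_t]]
    window = prices[:, event_t + 1 : event_t + 1 + horizon]
    hit = (window / base - 1.0) <= threshold
    observed = hit.any(axis=1)
    first = hit.argmax(axis=1) + 1          # 1-based day of first crossing
    durations = np.where(observed, first, horizon)
    return durations, observed

def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate: returns (event_times, survival_probabilities)."""
    durations = np.asarray(durations)
    observed = np.asarray(observed, dtype=bool)
    times = np.unique(durations[observed])
    surv, s = [], 1.0
    for t in times:
        at_risk = np.sum(durations >= t)
        events = np.sum((durations == t) & observed)
        s *= 1.0 - events / at_risk
        surv.append(s)
    return times, np.array(surv)

# Toy panel (hypothetical): 4 stocks, event at column index 1, 3-day horizon.
toy = np.array([
    [100, 100, 90, 88, 85],
    [100, 100, 99, 94, 90],
    [100, 100, 98, 97, 93],
    [100, 100, 100, 99, 98],   # never crosses -5%: censored
])
durations, observed = time_to_fall(toy, event_t=1, threshold=-0.05, horizon=3)
km_times, km_surv = kaplan_meier(durations, observed)
```

Comparing such curves across groups of stocks, or moving on to a proportional hazards model with stock characteristics as covariates, would be the natural next step, but the threshold choice remains the arbitrary part.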

So I think you need to figure out what it means for a stock to "fall early". Logic is the best way, but if you want to handle things in an atheoretical, data-driven way, the best approach may be:

  • Generate a set of metrics (similar to the ones I mentioned: volatility, derivatives, and second derivatives, over various periods)
  • See which of those concrete mathematical measures are most correlated with this predefined group of stocks which "fell early".
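The two steps above could be sketched roughly as follows. The particular metrics, the function names, and the toy data are only illustrative assumptions; first and second differences of log prices stand in for the "derivatives".

```python
import numpy as np

def post_event_metrics(prices, event_t, k):
    """Per-stock metrics over the k days after the event (k >= 2),
    computed from log prices."""
    logp = np.log(np.asarray(prices, dtype=float))
    seg = logp[:, event_t : event_t + k + 1]
    rets = np.diff(seg, axis=1)         # first differences (daily log returns)
    accel = np.diff(seg, n=2, axis=1)   # second differences
    return {
        "volatility": rets.std(axis=1, ddof=1),
        "mean_return": rets.mean(axis=1),
        "mean_accel": accel.mean(axis=1),
    }

def rank_metrics(metrics, fell_early):
    """Correlation of each metric with the 0/1 group flag,
    sorted by absolute correlation (largest first)."""
    y = np.asarray(fell_early, dtype=float)
    corr = {name: float(np.corrcoef(vals, y)[0, 1])
            for name, vals in metrics.items()}
    return sorted(corr.items(), key=lambda kv: -abs(kv[1]))

# Toy panel (hypothetical): 4 stocks, event at column index 1.
toy = np.array([
    [100, 100, 90, 88, 87],
    [100, 100, 92, 90, 89],
    [100, 100, 99, 98, 90],
    [100, 100, 100, 99, 97],
])
fell_early = [1, 1, 0, 0]   # the predefined "fell early" group

ranking = rank_metrics(post_event_metrics(toy, event_t=1, k=2), fell_early)
```

With many candidate metrics and windows, some will correlate with the group by chance, so treat the ranking as exploratory.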

For example, you might find that having an inflexion point (after smoothing the curve with some parameters) on days t+0 to t+2 was highly correlated with being in the "fell early" group. You may also want to make sure that the measures you're testing (e.g. whether there was an inflexion point) represent significant features in the data rather than simply random variability. To that end, you could look at historical data to see how often features of that magnitude appear in ordinary periods.

analystic
  • 725
1

There is a significant literature on asset pricing anomalies (quant.stackexchange.com may provide some guidance as well). The type that you're most interested in would be event studies. For instance, a well-known anomaly is the post-earnings announcement drift (PEAD), where stock prices tend to keep drifting in the direction of an earnings surprise (upward after positive surprises, downward after negative ones). One benefit when investigating this is that groups of stocks will release earnings on the same day, but you always have a control group since earnings aren't all released on the same day.

Since you don't have a control group and want to investigate what characteristics are important, one approach is to rely on portfolio sorts. For instance, you sort the relevant stocks by beta, market capitalization, price-to-book ratio, momentum, or other common factors and create a portfolio that goes long the top 20% and short the bottom 20% (be sure to clearly describe your rebalancing rules). You can then analyze how the performance of these portfolios changed before and after your event. For instance, assuming the market went up over this period, I would anticipate the high beta stocks to outperform the low beta stocks.
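A bare-bones version of such a sort might look like the sketch below. It assumes equal weighting and a portfolio formed once with no rebalancing, and it uses simulated one-factor returns with made-up betas and market moves, so treat every number as hypothetical.

```python
import numpy as np

def long_short_spread(returns, characteristic, frac=0.2):
    """Daily return of an equal-weighted portfolio that is long the top
    `frac` of stocks by `characteristic` and short the bottom `frac`.
    The portfolio is formed once and never rebalanced (an assumption)."""
    returns = np.asarray(returns, dtype=float)
    order = np.argsort(characteristic)
    n_leg = max(1, int(round(frac * len(order))))
    short_leg = order[:n_leg]
    long_leg = order[-n_leg:]
    return returns[long_leg].mean(axis=0) - returns[short_leg].mean(axis=0)

# Simulated stand-in data: 50 stocks, 10 pre-event and 10 post-event days.
betas = np.linspace(0.5, 1.5, 50)
market = np.concatenate([np.full(10, 0.001), np.full(10, -0.01)])
returns = betas[:, None] * market[None, :]   # pure one-factor returns, no noise

spread = long_short_spread(returns, betas)
pre_mean = spread[:10].mean()    # market rising: high beta outperforms
post_mean = spread[10:].mean()   # market falling: high beta underperforms
```

Comparing `pre_mean` and `post_mean` for each sorting variable gives a first impression of which characteristics mattered around the event.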

John
  • 2,287
  • This is useful in general but I'm looking specifically at how to test the characteristics that lead to very quick falls after the event date. –  Sep 13 '12 at 23:44
  • It can't hurt to begin with the portfolio sorts to get a sense of the data. – John Sep 13 '12 at 23:46
  • I have already used an alternative methodology to answer this question. Beta is one of the important determinants for general large falls over the entire event window. –  Sep 13 '12 at 23:48