My question is very similar to this one, but with the Weibull distribution replacing the Poisson distribution.
Let's say I am analyzing the distribution of times between failures for an engine, with data over an entire calendar year. From past experience, I know that the Weibull distribution is appropriate here. I want to test whether or not the distribution in December is different from the distribution earlier in the year.
In other words, is the entire year's data better explained by a single set of Weibull parameters, or by two sets of parameters, one for January-November and another for December? How can I answer this question?
Option 1:
- Fit a Weibull distribution to the entire year's data, and find the likelihood. Call this Model 1.
- Fit one Weibull distribution to the data from January to November, and fit another one to the data from December. This is similar to a regression model with an indicator variable for "December". Call this Model 2.
- Compare the log-likelihood or AIC values of Model 1 and Model 2 (I am not sure how best to do this). From this post, it seems I could get the log-likelihood for Model 2 by adding the individual log-likelihoods of the two Weibull fits.
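To make Option 1 concrete, here is a minimal sketch in Python. It assumes SciPy's `weibull_min` with the location fixed at zero (a two-parameter Weibull) and uses simulated numbers in place of my real failure times. The helper name `weibull_loglik`, the simulated parameter values, and the chi-squared reference distribution with 2 degrees of freedom (Model 2 has two more parameters than Model 1) are my assumptions, not something I have verified is appropriate here.

```python
import numpy as np
from scipy import stats


def weibull_loglik(times):
    """Fit a two-parameter Weibull by maximum likelihood (location fixed at 0)
    and return the maximized log-likelihood."""
    shape, loc, scale = stats.weibull_min.fit(times, floc=0)
    return stats.weibull_min.logpdf(times, shape, loc=loc, scale=scale).sum()


# Simulated stand-in data (hypothetical values, not real engine data).
rng = np.random.default_rng(0)
jan_nov = 100.0 * rng.weibull(1.5, size=200)
december = 60.0 * rng.weibull(1.5, size=30)
full_year = np.concatenate([jan_nov, december])

# Model 1: one Weibull for the whole year (2 parameters).
ll_model1 = weibull_loglik(full_year)

# Model 2: separate Weibulls for Jan-Nov and December (4 parameters).
# The two data sets are disjoint, so the log-likelihoods add.
ll_model2 = weibull_loglik(jan_nov) + weibull_loglik(december)

# Likelihood ratio statistic, compared to chi-squared with 4 - 2 = 2 df
# (an asymptotic approximation).
lr_stat = 2.0 * (ll_model2 - ll_model1)
p_value = stats.chi2.sf(lr_stat, df=2)

# AIC = 2k - 2 * log-likelihood; lower is better.
aic_model1 = 2 * 2 - 2 * ll_model1
aic_model2 = 2 * 4 - 2 * ll_model2

print(f"LR statistic = {lr_stat:.3f}, p-value = {p_value:.4f}")
print(f"AIC Model 1 = {aic_model1:.2f}, AIC Model 2 = {aic_model2:.2f}")
```

Since Model 1 is the special case of Model 2 in which both periods share the same parameters, the summed log-likelihood of Model 2 should never be lower than that of Model 1, apart from optimizer tolerance.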
Option 2:
- Fit a Weibull distribution to January-November data.
- Do a likelihood ratio test comparing two fits to the December data. The first fit is unrestricted, while the second is restricted to the parameter values found in the first step.
- If the likelihood ratio test is not significant, conclude that there is no need for a separate fit for December data.
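A minimal sketch of Option 2, again assuming SciPy's `weibull_min` with the location fixed at zero and simulated data. The helper names `fit_weibull` and `loglik` are hypothetical, and the 2 degrees of freedom assume that both parameters are fully fixed in the restricted fit. Note that this treats the January-November estimates as known constants, so their sampling uncertainty is ignored.

```python
import numpy as np
from scipy import stats


def fit_weibull(times):
    """Return the (shape, scale) maximum likelihood estimates, location fixed at 0."""
    shape, _, scale = stats.weibull_min.fit(times, floc=0)
    return shape, scale


def loglik(times, shape, scale):
    """Weibull log-likelihood of the data at the given parameter values."""
    return stats.weibull_min.logpdf(times, shape, loc=0, scale=scale).sum()


# Simulated stand-in data (hypothetical values, not real engine data).
rng = np.random.default_rng(1)
jan_nov = 100.0 * rng.weibull(1.5, size=200)
december = 60.0 * rng.weibull(1.5, size=30)

# First step: fit January-November only.
shape_ref, scale_ref = fit_weibull(jan_nov)

# Unrestricted fit: December gets its own parameters.
shape_dec, scale_dec = fit_weibull(december)
ll_unrestricted = loglik(december, shape_dec, scale_dec)

# Restricted "fit": December evaluated at the January-November estimates.
ll_restricted = loglik(december, shape_ref, scale_ref)

# Both parameters are fixed under the restriction, hence 2 df.
lr_stat = 2.0 * (ll_unrestricted - ll_restricted)
p_value = stats.chi2.sf(lr_stat, df=2)

print(f"LR statistic = {lr_stat:.3f}, p-value = {p_value:.4f}")
```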
Would either of these approaches work?
I suspect that Option 1 is better, because it asks whether there is some common parameter vector such that the estimates in the two time periods differ solely due to random variation. Option 2, on the other hand, seems to ask, "Is the estimated parameter vector from the first time period also the best-fit parameter vector for the second time period?", which is a bit too specific.