Possible Duplicate:
passing a vector of variables into lm() formula

I am obviously new to R so sorry beforehand ;)

I am trying to implement an iterative QLR-type test, but because my code is going to be part of a paper, I really want to put it in a separate file as a function.

The goal is to iteratively F-test a model over all observations for structural breaks, until (let's say) five or so structural breaks are found and added to the model as binary dummy variables. Interactions will be tested and added manually.

I am sorry that there is no code yet, because I haven't really written the function yet. But here is how I imagine it to work:

  1. There is a wrapper function that receives the model and variables (I could also just access the global objects I guess?) as arguments.

  2. It then calls the QLR function, which gives back the index with the highest F-value and also the index of the first date AFTER that index that is QLR significant.

    The reason is that the F-test will give me a multicollinearity error if I test a dummy variable that is to be 1 during an interval where a previously found dummy is already 1. So if my initial dummy variable starts at a very low index, I will not find any breaks after it. If it starts at 500 and I set it to 1 for ALL dates after that, the next iteration can only find significant F-values for indices LESS than 500.

  3. The wrapper then calls a function that creates a dummy time series, which is 0 until the first index, then 1 and then 0 again when the second index is reached.

    Let's say the most significant F-value is at index 500, while another significant value is found at 1000, 1500 and 2000 etc. The dummy function would then set the dummy to be 1 on the interval of 500-999.

  4. The new model with the dummy variable D1 will be returned to the QLR function, which repeats the process for a new variable D2 and so on.
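
Step 3 above could be sketched roughly like this. It is only a minimal illustration, assuming plain integer indices rather than dates; `make_interval_dummy` and its arguments are hypothetical names, not something from an existing package.

```r
# Hypothetical helper: build a 0/1 dummy of length n that is 1 on the
# half-open interval [start, end) and 0 everywhere else.
make_interval_dummy <- function(n, start, end = n + 1L) {
  stopifnot(start >= 1, start < end, end <= n + 1)
  idx <- seq_len(n)
  as.integer(idx >= start & idx < end)
}

# Example from the text: most significant break at 500, next one at 1000,
# so the dummy is switched on for indices 500 to 999.
# (n = 2500 is an assumed series length, chosen only for illustration.)
d1 <- make_interval_dummy(n = 2500, start = 500, end = 1000)
range(which(d1 == 1))
```

Leaving `end` at its default would give a dummy that stays 1 until the end of the sample, which is the case that blocks later breaks as described in step 2.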

Now I realize this is a bit convoluted and it will give me on-off type dummies that give me the total influence on the model at a date, rather than the influence of an effect. But that's fine with the context and I don't really know how to get around the multicollinearity problem in the LHT function.

Anyway, what I need to implement this is the ability to generate and dynamically pass a model to the QLR function and also dynamically add the new dummy variables to the model.
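
One way this might look is to pass the fitted model and its data frame around together and extend the formula with `update()`. This is a sketch on made-up data; `add_regressor`, `dat`, `fit0` and `fit1` are hypothetical names, and the actual QLR step is not shown.

```r
# Hypothetical helper: refit `model` with one extra regressor that is
# already stored as a column of `data` under the name `var_name`.
add_regressor <- function(model, data, var_name) {
  stopifnot(var_name %in% names(data))
  # ". ~ . + D1" means: keep the old response and terms, append D1
  new_formula <- update(formula(model), paste(". ~ . +", var_name))
  lm(new_formula, data = data)
}

# Made-up data, only to show the mechanics
set.seed(1)
dat <- data.frame(x = rnorm(200))
dat$y <- 1 + 2 * dat$x + rnorm(200)
dat$D1 <- as.integer(seq_len(200) >= 50 & seq_len(200) < 100)

fit0 <- lm(y ~ x, data = dat)
fit1 <- add_regressor(fit0, dat, "D1")
names(coef(fit1))
```

Because the function takes the model and the data as arguments, nothing has to live in the global environment, and the returned model can be fed straight into the next iteration.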

I would love it to be not restricted to a certain number of dummies, since I don't know beforehand how many might test QLR significant. Because of this, I would also need the ability to add variables with dynamic names. I guess I could do this as a vector though, there's gotta be an add-row function, right?
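
For the dynamic names, a data frame column can be created from a string with `[[<-`, and the formula can be rebuilt from a character vector with `reformulate()`. The following is a sketch under assumed inputs: the break intervals in `breaks` are invented, and in the real procedure they would come from the QLR step.

```r
set.seed(42)
dat <- data.frame(y = rnorm(20), x = rnorm(20))

# Assumed output of the break search: one c(start, end) pair per break
breaks <- list(c(5, 10), c(10, 15))

dummy_names <- character(0)
idx <- seq_len(nrow(dat))
for (i in seq_along(breaks)) {
  name <- paste0("D", i)                      # "D1", "D2", ...
  dat[[name]] <- as.integer(idx >= breaks[[i]][1] & idx < breaks[[i]][2])
  dummy_names <- c(dummy_names, name)
}

# Build y ~ x + D1 + D2 from the names collected so far
f <- reformulate(c("x", dummy_names), response = "y")
fit <- lm(f, data = dat)
```

The number of dummies is not fixed anywhere; the loop simply runs over however many breaks were found, and `dat[[name]]` reads a column back by its string name just as it writes one.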

How would you go about passing model+variables dynamically to a function?

IMA
  • Welcome to SO! Your question has been asked many times in different forms; a solution is to use `paste` as shown here for example: http://stackoverflow.com/questions/9238038/passing-a-vector-of-variables-into-lm-formula. – flodel Sep 30 '12 at 11:13
  • @flodel: Thanks! Still, can you guys maybe tell me a good way to aggregate and access zoo time series in a dynamic fashion? I want to add to a dummy vector with each iteration and I want to be able to access that vector by using a variable. I can do cbind(D, D1), but then I can only access it via the $ symbol, so only if I know the variable name. I want the variable name to be dynamic though, since I do not know how many dummies I will use. Can I do a D – IMA Sep 30 '12 at 12:53

0 Answers