I am working with a model that can be described roughly as
$$
\left\{ \begin{array}{rcl} y^* & = & \beta_0 + x'\beta + \epsilon_{\{x,v\} }
\\ w^* & = & \gamma_0 + v'\gamma + \delta_{\{x,v\} }
\\ y & = & 1[y^* >0 ]
\\ w & = & 1[w^* >0 ]
\end{array} \right.
$$
although either $y$ or $w$ can be ordinal. (I am fitting this in generic terms with the Stata cmp package by Roodman (2011).)
When the sets of regressors $x$ and $v$ are empty, $y^*$ and $w^*$ are strongly correlated (and so are their discrete versions $y$ and $w$). It is this correlation that my model selection targets: I need to make it small, which I know is achievable because the explanatory variables $x$ and $v$ exert common influences on $y$ and $w$. (I keep $x$ and $v$ distinct in my notation, but in practice they will likely share many variables.)
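To pin down the quantity in question: assuming the jointly normal errors with unit variances that a bivariate (ordered) probit fit works with, the setup can be written as below. The symbol $\rho_{\{x,v\}}$ is notation introduced here for illustration, not something defined above.
$$
\begin{pmatrix} \epsilon_{\{x,v\}} \\ \delta_{\{x,v\}} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho_{\{x,v\}} \\ \rho_{\{x,v\}} & 1 \end{pmatrix} \right)
$$
With empty $x$ and $v$, $\rho_{\{\emptyset,\emptyset\}}$ is large in magnitude; the aim is to choose $x$ and $v$ so that $\rho_{\{x,v\}}$ is close to zero.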
My questions are:
- What are the ways to conduct model selection here? So far, I am using a greedy search algorithm that adds one variable at a time to $x$ and/or $v$, checks how the magnitude (or, better, significance) of the $\epsilon$-$\delta$ correlation is affected, and chooses the variable that provides the best improvement.
- Do I need to worry about the excluded variables needed for identification? (A somewhat longer, and more accurate, story is that the $y$ equation is a selection equation, and the $w$ equation is the response equation that only applies to cases with $y=1$, in a variation of the Heckman model. Whether the selection actually depends on $w$, as is implied in the labor supply model used by Heckman, is anybody's guess, but it does not need to in the context of my application.)
- What other questions should I be asking?
A somewhat similar problem, modifying your model until a certain test statistic becomes insignificant, is faced in structural equation modeling, and I don't think that literature has an answer that I would find satisfactory.
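For concreteness, the greedy search described in the first question can be sketched roughly as follows. This is a minimal illustration, not my actual Stata code: `fit_rho` is a hypothetical callback that would refit the two-equation model (for example by calling cmp) and return the estimated correlation between $\epsilon$ and $\delta$, and `toy_fit_rho`, the variable names, and the stopping thresholds `tol` and `min_gain` are made up for the example.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

# Hypothetical callback: given the regressor sets for the y and w equations,
# refit the model and return the estimated epsilon-delta correlation.
FitRho = Callable[[FrozenSet[str], FrozenSet[str]], float]
Step = Tuple[Optional[str], Optional[str], float]


def greedy_select(
    candidates: List[str],
    fit_rho: FitRho,
    tol: float = 0.05,       # assumed target for |rho|
    min_gain: float = 1e-6,  # assumed minimum improvement per step
) -> Tuple[FrozenSet[str], FrozenSet[str], List[Step]]:
    """Forward selection: add one variable at a time to x, v, or both."""
    x: FrozenSet[str] = frozenset()
    v: FrozenSet[str] = frozenset()
    current = abs(fit_rho(x, v))
    path: List[Step] = [(None, None, current)]  # (variable, equation, |rho|)

    while current > tol:
        best = None
        for var in candidates:
            for eq in ("x", "v", "both"):
                new_x = x | {var} if eq in ("x", "both") else x
                new_v = v | {var} if eq in ("v", "both") else v
                if new_x == x and new_v == v:
                    continue  # this move would not change the model
                score = abs(fit_rho(new_x, new_v))
                if best is None or score < best[0]:
                    best = (score, var, eq, new_x, new_v)
        # Stop when no move is left or the best move barely helps.
        if best is None or current - best[0] < min_gain:
            break
        current, var, eq, x, v = best
        path.append((var, eq, current))
    return x, v, path


def toy_fit_rho(x: FrozenSet[str], v: FrozenSet[str]) -> float:
    """Made-up stand-in for a real model fit, used only to run the sketch."""
    rho = 0.8
    if "age" in x:
        rho -= 0.2
    if "age" in v:
        rho -= 0.2
    if "educ" in x:
        rho -= 0.3
    return rho


x_sel, v_sel, path = greedy_select(["age", "educ", "noise"], toy_fit_rho)
print(sorted(x_sel), sorted(v_sel))
for step in path:
    print(step)
```

To target significance rather than magnitude, the callback could return the absolute test statistic for the correlation instead, with `tol` set to the corresponding critical value. Either way, each step refits the model once per candidate move, and the search reuses the same data for every comparison, which is exactly what makes me uneasy about the procedure.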