2

I am trying to find the best model for panel data using R. I have a project on corporate governance for which I collected data on various companies from 2009 to 2014. I found the best fit using backward elimination and forward selection based on t-values in R; the two approaches agree with each other (which I think is expected). However, after reading various articles from the academic community, I realize that t-values can give dubious results, and that to give my results any credibility it would be better to use the AIC or BIC criterion (by the looks of things that is not a big improvement in credibility, but it is still some improvement). As I understand it, the definition of AIC is as follows:

AIC = -2 log-likelihood + 2p (for BIC, the penalty term 2p is replaced by p log(n))
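As a minimal sketch of that definition, the snippet below computes AIC and BIC by hand from the maximised log-likelihood and compares them with R's built-in `AIC()` and `BIC()`. The `mtcars` data and the model formula are illustrative assumptions only; they are not the corporate governance panel from the project.

```r
# Minimal sketch: AIC and BIC computed by hand from the log-likelihood.
# mtcars and the formula are assumptions chosen only for illustration.
fit <- lm(mpg ~ wt + hp, data = mtcars)

ll <- logLik(fit)      # maximised log-likelihood
p  <- attr(ll, "df")   # estimated parameters (coefficients + error variance)
n  <- nobs(fit)        # number of observations

aic_manual <- -2 * as.numeric(ll) + 2 * p
bic_manual <- -2 * as.numeric(ll) + log(n) * p

# Compare with R's built-in extractors
c(manual = aic_manual, builtin = AIC(fit))
c(manual = bic_manual, builtin = BIC(fit))
```

Note that for a linear model R counts the error variance as one of the p parameters.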

For ordinary data with no time dimension, R has the function stepAIC (in the MASS package), which selects variables by AIC until the AIC value cannot decrease any further.
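For reference, a minimal sketch of that non-panel workflow might look like the following. It assumes the built-in `mtcars` data and an arbitrary starting formula, both chosen only for illustration; the BIC variant simply changes the penalty through the `k` argument.

```r
library(MASS)  # stepAIC lives in MASS, which ships with standard R installations

# Illustrative starting model; the data set and formula are assumptions
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Backward elimination by AIC (penalty k = 2 per parameter, the default)
sel_aic <- stepAIC(full, direction = "backward", trace = FALSE)

# Same search with the BIC penalty, k = log(n)
sel_bic <- stepAIC(full, direction = "backward",
                   k = log(nobs(full)), trace = FALSE)

formula(sel_aic)
formula(sel_bic)
```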

But for panel data I haven't found any way to compute the AIC value. I have searched for this, but in vain.

  • Stepwise regression should be avoided, for reasons that are well documented on this site. It is not only not good for prediction; it distorts all aspects of statistical inference. – Frank Harrell Apr 28 '19 at 12:09

2 Answers

1

Hi, I know this is an old question, but I had a similar problem:

I ended up using lmer from the lme4 package, and I was able to force the use of the conventional maximum likelihood (ML) function instead of the default REML (restricted maximum likelihood). In lmer this is done with REML = FALSE; the method = "ML" spelling belongs to lme in the nlme package. That way stepAIC from the MASS package could optimize by AIC.
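To illustrate the ML-versus-REML point, here is a minimal sketch using `lme` from the nlme package, where the switch is `method = "ML"`. The `Orthodont` data set (shipped with nlme) and the two formulas are assumptions for illustration only. The reason for the switch is that REML likelihoods are not comparable between models with different fixed effects, so AIC comparisons of that kind should be based on ML fits.

```r
library(nlme)  # recommended package, shipped with R

# REML is the default in lme(); method = "ML" requests full maximum likelihood.
# Orthodont and both formulas are illustrative assumptions.
m_small <- lme(distance ~ age, random = ~ 1 | Subject,
               data = Orthodont, method = "ML")
m_big   <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
               data = Orthodont, method = "ML")

# Both fits use ML, so their AIC values can be compared directly
AIC(m_small, m_big)
```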

  • What do you mean by force a ML function? – mdewey Aug 04 '16 at 15:58
  • mdewey - the rlme functions use the "restricted maximum likelihood" (REML) method by default, and I couldn't find a way to optimize by AIC with this REML default. But I was able to run the optimization by AIC after switching to ML, the standard maximum likelihood function. Please refer to: http://stats.stackexchange.com/questions/48671/what-is-restricted-maximum-likelihood-and-when-should-it-be-used – Alexandre Ludolf Aug 04 '16 at 17:49
0

I found this some years ago (I don't remember where):

```
aicbic_plm <- function(object, criterion) {

  # object is "plm", "panelmodel"
  # Assume the panel data has index: index = c("Country", "Time")

  sp = summary(object)

  if (class(object)[1] == "plm") {
    u.hat <- residuals(sp)                      # extract residuals
    df <- cbind(as.vector(u.hat), attr(u.hat, "index"))
    names(df) <- c("resid", "Country", "Time")
    c = length(levels(df$Country))              # extract country dimension
    t = length(levels(df$Time))                 # extract time dimension
    np = length(sp$coefficients[, 1])           # number of parameters
    n.N = nrow(sp$model)                        # number of observations
    s.sq <- log(sum(u.hat^2) / n.N)             # log of the residual variance (RSS / n)

    # effect = c("individual", "time", "twoways", "nested"),
    # model = c("within", "random", "ht", "between", "pooling", "fd")

    # I am making an example with only some of the versions:

    if (sp$args$model == "within" && sp$args$effect == "individual") {
      n = c
      np = np + n + 1   # update number of parameters
    }

    if (sp$args$model == "within" && sp$args$effect == "time") {
      T = t
      np = np + T + 1   # update number of parameters
    }

    if (sp$args$model == "within" && sp$args$effect == "twoways") {
      n = c
      T = t
      np = np + n + T   # update number of parameters
    }

    aic <- round(2 * np        + n.N * (log(2 * pi) + s.sq + 1), 1)
    bic <- round(log(n.N) * np + n.N * (log(2 * pi) + s.sq + 1), 1)

    if (criterion == "AIC") {
      names(aic) = "AIC"
      return(aic)
    }
    if (criterion == "BIC") {
      names(bic) = "BIC"
      return(bic)
    }
  }
}
```
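The formula inside that function is the Gaussian one: -2 log-likelihood equals n (log(2π) + log(RSS/n) + 1), to which the penalty 2·np (or log(n)·np) is added. A minimal sketch to check this with base R only is below. It assumes a simulated balanced panel (the names `panel`, `firm`, `year`, `x`, `y` are hypothetical) and uses a least-squares dummy-variable `lm` fit as a stand-in for a within model with individual effects, so it does not call plm itself.

```r
set.seed(1)

# Hypothetical balanced panel: 10 firms observed over 6 years
panel <- expand.grid(firm = factor(1:10), year = 2009:2014)
firm_effect <- rnorm(10)[as.integer(panel$firm)]
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 0.5 * panel$x + firm_effect + rnorm(nrow(panel))

# Least-squares dummy-variable fit (firm dummies play the role of individual effects)
lsdv <- lm(y ~ x + firm, data = panel)

n.N  <- nobs(lsdv)
s.sq <- log(sum(residuals(lsdv)^2) / n.N)  # log of the ML residual variance
np   <- length(coef(lsdv)) + 1             # 1 slope + 10 firm intercepts + error variance

aic_formula <- 2 * np        + n.N * (log(2 * pi) + s.sq + 1)
bic_formula <- log(n.N) * np + n.N * (log(2 * pi) + s.sq + 1)

c(formula = aic_formula, builtin = AIC(lsdv))
c(formula = bic_formula, builtin = BIC(lsdv))
```

Here np comes out as 12, which matches the `np + n + 1` update in the "within"/"individual" branch above (one slope, ten firm effects, one variance).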