Questions tagged [model-selection]
Model selection is the problem of judging which model from some candidate set performs best. Popular methods include $R^2$, the AIC and BIC criteria, test sets, and cross-validation. To some extent, feature selection is a subproblem of model selection.
1998 questions
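As a minimal sketch of how the criteria named above are computed in practice (synthetic data; `statsmodels` and `scikit-learn` are assumed to be available):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data: only the first two predictors actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=100)

# Compare a smaller and a larger candidate model on all four criteria.
for cols in ([0], [0, 1, 2]):
    ols = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
    cv_mse = -cross_val_score(LinearRegression(), X[:, cols], y, cv=5,
                              scoring="neg_mean_squared_error").mean()
    print(f"predictors {cols}: R2={ols.rsquared:.3f}  AIC={ols.aic:.1f}  "
          f"BIC={ols.bic:.1f}  CV-MSE={cv_mse:.3f}")
```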
24 votes · 7 answers
Measures of model complexity
How can we compare complexity of two models with the same number of parameters?
Edit 09/19: To clarify, model complexity is a measure of how hard it is to learn from limited data. When two models fit existing data equally well, a model with lower…
Yaroslav Bulatov · 6,199 · 2 gold · 28 silver · 42 bronze
7 votes · 1 answer
Hellwig's method of selection of variables
Hellwig's method is a method of variable selection in linear models. It is widely used in Poland, and probably only in Poland, because it is hard to find in any scientific paper written in English.
Description of the method:
$m_{k}$ - set of…
Tomek Tarczynski · 4,024
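The excerpt's description is cut off, but Hellwig's method is usually presented via "information capacity": for each nonempty subset $m_k$ of predictors, member $j$ gets an individual capacity $h_{kj} = r_{0j}^2 / \sum_{i \in m_k} |r_{ij}|$ (the sum includes $r_{jj} = 1$), the subset's integral capacity is $H_k = \sum_{j \in m_k} h_{kj}$, and the subset maximizing $H_k$ is chosen. A hedged sketch of that description:

```python
import itertools
import numpy as np

def hellwig(r0, R, names):
    """Hellwig's information-capacity method, as usually described.
    r0[j] is the correlation of predictor j with the response and
    R[i, j] the correlation between predictors i and j.  For each
    nonempty subset m_k, the individual capacity of member j is
    h_kj = r0[j]**2 / sum_{i in m_k} |R[i, j]|; the integral capacity
    H_k is the sum of the h_kj.  Returns the subset maximizing H_k."""
    p = len(r0)
    best_subset, best_H = None, -np.inf
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            H = sum(r0[j] ** 2 / sum(abs(R[i, j]) for i in subset)
                    for j in subset)
            if H > best_H:
                best_subset, best_H = subset, H
    return [names[j] for j in best_subset], best_H
```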
6 votes · 2 answers
Is it possible to compare the parsimony of models with the same number of parameters and explanatory variables?
Parsimony is often defined as the minimisation of unnecessary parameters or explanatory variables in a model. But models also have structure - functional forms that can change. Between two models that have the same number of parameters, is it…
naught101 · 5,453
4 votes · 2 answers
Relationship between MDL and "difficulty of learning from data"
While trying to make sense of MDL and stochastic complexity, I found this previous question: Measures of model complexity, in which Yaroslav Bulatov defines model complexity as "how hard it is to learn from limited data."
It is not clear to me how…
charles.y.zheng · 7,936
4 votes · 1 answer
Model selection aimed at making "misfit" statistically insignificant
I am working with a model that can be described roughly as
$$
\left\{ \begin{array}{lcl}
y^* & = & \beta_0 + x'\beta + \epsilon_{\{x,v\}} \\
w^* & = & \gamma_0 + v'\gamma + \delta_{\{x,v\}} \\
y & = & 1[y^* > 0] \\
w & = & 1[w^* > 0]…
StasK · 31,547 · 2 gold · 92 silver · 179 bronze
3 votes · 1 answer
Does the Akaike information criterion penalize model complexity more than is necessary to avoid overfitting?
The AIC penalizes complex models. Obviously some penalty for complexity is necessary to avoid overfitting of statistical models: otherwise we would favour a model that is simply a copy of the data itself, and that would tell us…
Sideshow Bob · 1,485
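For reference, the criterion under discussion is
$$\mathrm{AIC} = 2k - 2\ln\hat{L},$$
where $k$ is the number of estimated parameters and $\hat{L}$ the maximized likelihood; the $2k$ term is the complexity penalty, and the model with the smallest AIC is preferred.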
3 votes · 3 answers
Partial F-test vs Model Selection
I'm a first-year statistics graduate student taking a course in regression. In the previous chapter, we discussed partial F-tests for deciding whether to include a predictor variable. In the current chapter (which we just finished), we…
Biomath · 291
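For readers meeting it for the first time: the partial F-test compares a full model with $p$ predictors against a reduced model that drops $q$ of them,
$$F = \frac{(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}})/q}{\mathrm{RSS}_{\text{full}}/(n - p - 1)},$$
which under the null hypothesis that the dropped coefficients are zero follows an $F_{q,\,n-p-1}$ distribution.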
3 votes · 1 answer
Effective search space vs guided search space
In ISLR (An Introduction to Statistical Learning with Applications in R, by James, Witten, Hastie, and Tibshirani), the footnote in the section on forward selection on page 208 states:
Though forward stepwise selection considers $p(p+1)/2 + 1$ models, it performs a guided search over the…
user650654 · 393
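To make the footnote's count concrete: forward stepwise fits the null model and then $p - k$ candidates at step $k$, for $1 + \sum_{k=0}^{p-1}(p-k) = 1 + p(p+1)/2$ fits in total, versus the $2^p$ models of exhaustive best subset selection. A quick sketch:

```python
# Models examined by forward stepwise selection (per the ISLR footnote)
# versus exhaustive best subset selection, for a few values of p.
for p in (5, 10, 20):
    forward = 1 + p * (p + 1) // 2
    exhaustive = 2 ** p
    print(f"p={p:2d}: forward stepwise {forward:4d}   best subset {exhaustive}")
```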
3 votes · 1 answer
Model selection and assumed parameters in models
Suppose there are four models:
Model 1: $y = ax$
Model 2: $y = ax^2$
Model 3: $y = a\sqrt{x}$
Model 4: $y = ax^\theta$
Model 4 is the most complex model with two parameters (the others have one parameter). If we do model selection (e.g., based on…
quibble · 1,436
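A minimal sketch of one way to run the comparison the question describes, assuming Gaussian errors and illustrative synthetic data (the AIC here is the least-squares version, $n\ln(\mathrm{RSS}/n) + 2k$ up to an additive constant):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic data standing in for the real application.
rng = np.random.default_rng(0)
x = np.linspace(0.1, 10, 50)
y = 2.0 * np.sqrt(x) + rng.normal(scale=0.3, size=x.size)

models = {  # (function, number of free parameters)
    "Model 1: y = a*x":       (lambda x, a: a * x, 1),
    "Model 2: y = a*x^2":     (lambda x, a: a * x**2, 1),
    "Model 3: y = a*sqrt(x)": (lambda x, a: a * np.sqrt(x), 1),
    "Model 4: y = a*x^theta": (lambda x, a, t: a * x**t, 2),
}

n = x.size
for name, (f, k) in models.items():
    popt, _ = curve_fit(f, x, y, p0=np.ones(k))
    rss = np.sum((y - f(x, *popt)) ** 2)
    aic = n * np.log(rss / n) + 2 * k   # Gaussian AIC up to a constant
    print(f"{name:24s} k={k}  AIC={aic:8.2f}")
```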
2 votes · 1 answer
Is there a good specification error test against a generalized alternative?
Suppose I believe a sample is drawn from a population that is distributed according to some specified distributional family. I intend to estimate the parameters of the distribution using some appropriate method. However, somewhere along the way I…
andrewH · 3,117
2 votes · 1 answer
Comparing different probabilistic models using the log likelihood of held out data?
I'm reading a paper that compares different probabilistic models using the log likelihood of held-out data. This is just... wrong, correct? There's no meaningful way to compare the LL between two different models? If I'm correct, what is the right…
jds · 1,694
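For what it's worth, a minimal sketch (with synthetic data) of the comparison such papers usually make: fit each candidate family on training data only, then score both on the same held-out points, which does make the numbers directly comparable.

```python
import numpy as np
from scipy import stats

# Synthetic data split; in practice use the study's own train/test split.
rng = np.random.default_rng(1)
data = rng.gamma(shape=2.0, scale=1.5, size=500)
train, test = data[:400], data[400:]

# Fit two candidate families on the training half only.
gamma_params = stats.gamma.fit(train, floc=0)
lognorm_params = stats.lognorm.fit(train, floc=0)

# Average held-out log-likelihood: both models are scored on the
# identical held-out points, so higher is better for either family.
ll_gamma = stats.gamma.logpdf(test, *gamma_params).mean()
ll_lognorm = stats.lognorm.logpdf(test, *lognorm_params).mean()
print(f"held-out mean log-lik: gamma {ll_gamma:.3f}, lognormal {ll_lognorm:.3f}")
```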
2 votes · 0 answers
What is wrong with this model selection procedure?
I have a set of ~400 observations and ~20 covariates. Some covariates are logged, square-rooted, or truncated versions of others, so there is a lot of dependence in my model matrix.
My response is a proportion. I would like to find the best quasi-binomial model with…
JTH · 1,033
2 votes · 1 answer
Model selection across multiple criteria (qualitative and quantitative)
I have two linear regression models on the same data, but where the response variable has been transformed using, respectively, the Box-Cox transformation and the logit transformation. Therefore, I cannot use AIC or other maximum-likelihood-based…
pir · 5,056
2 votes · 1 answer
Why do we build the final model on the full data instead of just the training set?
I am going through the ISLR book (by James et al.). In the chapter on model validation, the authors suggest that we build the final model on the full data instead of the training data only. The way I always understood the process is that we build…
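A minimal sketch of the workflow ISLR describes, using scikit-learn and synthetic data (the ridge penalty here is just a stand-in for whatever choice the validation step is making):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic data; X, y stand in for the reader's own dataset.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)

# Step 1: use cross-validation only to *choose* the model (here, alpha).
alphas = [0.01, 0.1, 1.0, 10.0]
scores = [cross_val_score(Ridge(alpha=a), X, y, cv=5).mean() for a in alphas]
best_alpha = alphas[int(np.argmax(scores))]

# Step 2: refit the chosen model on all of the data; the CV scores already
# estimate its performance, so no data needs to be held back at this point.
final_model = Ridge(alpha=best_alpha).fit(X, y)
print(best_alpha, final_model.coef_[:2])
```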
1 vote · 0 answers
Can statistics be used to measure which model is the most accurate?
I found three websites that list the times of sunrise and sunset in the place where I live. But since those times occasionally differ by some minutes, I would like to know if there is a method to find out which one is the most correct prediction. I…
guest · 73
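Assuming the actually observed times can be recorded (or taken from an authoritative ephemeris), a minimal sketch of the comparison: score each site by mean absolute error against observation, with all times as minutes after midnight. The numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical sunrise times in minutes after midnight: three sites'
# predictions against the observed times for the same five days.
observed = np.array([365, 363, 361, 359, 357])
site_a   = np.array([366, 364, 362, 360, 358])
site_b   = np.array([363, 362, 359, 358, 355])
site_c   = np.array([365, 363, 362, 359, 357])

# Mean absolute error against observation; the lowest MAE identifies
# the most accurate predictor on this sample.
for name, pred in [("A", site_a), ("B", site_b), ("C", site_c)]:
    print(name, np.abs(pred - observed).mean())
```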