Highest Voted Questions - Statistical Analysis Stack Exchange

42

votes

4 answers

(Why) do overfitted models tend to have large coefficients?

I imagine that the larger a coefficient on a variable is, the more ability the model has to "swing" in that dimension, providing an increased opportunity to fit noise. Although I think I've got a reasonable sense of the relationship between the…

asked Jul 13 '13 at 01:30

David Marx

7,127

42

votes

5 answers

How to translate the results from lm() to an equation?

We can use lm() to predict a value, but in some cases we still need the equation of the fitted model, for example to add it to a plot.

asked Jul 08 '13 at 03:20

user27736

429
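
As a rough illustration of what the question above is after, here is a minimal Python sketch (not R's lm() itself) that fits a one-predictor least-squares line and formats the result as an equation string that could be used as a plot label. The names `fit_line` and `equation_label` and the sample data are hypothetical, chosen only for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def equation_label(intercept, slope, digits=2):
    """Format the fitted line as text, handling a negative slope cleanly."""
    sign = "+" if slope >= 0 else "-"
    return f"y = {intercept:.{digits}f} {sign} {abs(slope):.{digits}f} * x"


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
label = equation_label(intercept, slope)
print(label)
```

The same idea carries over to R: read the estimated coefficients off the fitted model and paste them into a formatted string.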

42

votes

2 answers

How to find a good fit for semi-sinusoidal model in R?

I want to assume that the sea surface temperature of the Baltic Sea is the same year after year, and then describe that with a function / linear model. The idea I had was to just input year as a decimal number (or num_months/12) and get out what the…

asked May 31 '13 at 06:17

GaRyu

523

42

votes

3 answers

ANOVA on binomial data

I am analyzing an experimental data set. The data consists of a paired vector of treatment type and a binomial outcome:

Treatment  Outcome
A          1
B          0
C          0
D          1
A          0
...

In the outcome column, 1…

asked Jan 03 '11 at 22:04

speciousfool

563

42

votes

3 answers

Why is it that my colleagues and I learned opposite definitions for test and validation sets?

In my master's program I learned that when building a ML model you:

1. train the model on the training set
2. compare the performance of this against the validation set
3. tweak the settings and repeat steps 1-2
4. when you are satisfied, compare the final…

asked May 24 '21 at 13:59

Jacob Myer

595

42

votes

6 answers

Inclusion of lagged dependent variable in regression

I'm very confused about whether it's legitimate to include a lagged dependent variable in a regression model. Basically I think if this model focuses on the relationship between the change in Y and other independent variables, then adding a lagged…

asked Mar 17 '13 at 07:19

user22109

451

42

votes

4 answers

Explanation of finite population correction factor?

I understand that when sampling from a finite population and our sample size is more than 5% of the population, we need to make a correction on the sample's mean and standard error using this formula: $FPC=\sqrt{\frac{N-n}{N-1}}$ Where…

asked Dec 05 '10 at 09:40

Sara

1,487
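
To make the formula in the question above concrete, here is a small Python sketch that evaluates $FPC=\sqrt{\frac{N-n}{N-1}}$ and applies it to an uncorrected standard error of the mean. The function names `fpc` and `corrected_standard_error` are hypothetical, and scaling sd / sqrt(n) by the factor is the usual textbook use, assumed here rather than taken from the question.

```python
import math


def fpc(population_size, sample_size):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    if population_size < 2 or not 1 <= sample_size <= population_size:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return math.sqrt((population_size - sample_size) / (population_size - 1))


def corrected_standard_error(sd, population_size, sample_size):
    """Scale the usual sd / sqrt(n) by the correction factor (assumed usage)."""
    return sd / math.sqrt(sample_size) * fpc(population_size, sample_size)


# The factor shrinks toward 0 as the sample approaches the whole population.
for n in (1, 10, 50, 100):
    print(n, round(fpc(100, n), 4))
```

With n = 1 the factor is exactly 1 (no correction), and with n = N it is 0, since a full census leaves no sampling error in the mean.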

42

votes

2 answers

Which statistical model is being used in the Pfizer study design for vaccine efficacy?

I know there's a similar question here: How to calculate 95% CI of vaccine with 90% efficacy? but it doesn't have an answer at the moment. Also, my question is different: the other question asks how to compute VE, using functions from an R package.…

asked Nov 17 '20 at 10:14

DeltaIV

17,954

42

votes

2 answers

Purpose of the link function in generalized linear model

What is the purpose of the link function as a component of the generalized linear model? Why do we need it? Wikipedia states: "It can be convenient to match the domain of the link function to the range of the distribution function's mean." What's the…

asked Jan 26 '13 at 17:03

Chris

1,339

42

votes

1 answer

Quantile regression: Which standard errors?

The summary.rq function from the quantreg vignette provides a multitude of choices for standard error estimates of quantile regression coefficients. What are the special scenarios where each of these becomes optimal/desirable? "rank" which produces…

asked Dec 22 '12 at 11:19

Jase

2,246

42

votes

2 answers

Which search range for determining SVM optimal C and gamma parameters?

I am using SVM for classification and I am trying to determine the optimal parameters for linear and RBF kernels. For the linear kernel I use cross-validated parameter selection to determine C and for the RBF kernel I use grid search to determine C…

asked Nov 19 '12 at 16:33

Kywia

421

42

votes

2 answers

Pooling vs. stride for downsampling

Pooling and stride can both be used to downsample an image. Let's say we have a 4x4 image, like below, and a 2x2 filter. How do we then decide whether to use 2x2 pooling or a stride of 2?

deep-learning

asked Jan 16 '19 at 07:53

JungIn Choi

541

42

votes

6 answers

Why do we use loss functions to estimate a model instead of evaluation metrics like accuracy?

When building a learning algorithm we are looking to maximize a given evaluation metric (say accuracy), but the algorithm will try to optimize a different loss function during learning (say MSE/entropy). Why are the evaluation metrics not used as…

asked Nov 28 '18 at 19:10

Jesús Ros

548

42

votes

15 answers

The Monty Hall Problem - where does our intuition fail us?

From Wikipedia: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another…

asked Jul 21 '10 at 04:30

Rizwan Kassim

765
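
For readers whose intuition resists the puzzle above, a quick simulation can help. The Python sketch below assumes the standard rules, which the truncated excerpt does not fully spell out: the host always opens a door that hides a goat and is not the contestant's pick. The function names `play` and `win_rate` are hypothetical.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed rule: the host opens a goat door the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=20000, seed=0):
    """Fraction of rounds won under a fixed stay-or-switch strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Under these rules, staying should win about one third of the time and switching about two thirds.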

42

votes

3 answers

How to prove that the radial basis function is a kernel?

How to prove that the radial basis function $k(x, y) = \exp\left(-\frac{\|x-y\|^2}{2\sigma^2}\right)$ is a kernel? As far as I understand, in order to prove this we have to prove either of the following: For any set of vectors $x_1, x_2, ..., x_n$ matrix…

asked Sep 03 '12 at 21:19

Leo

2,634
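
A numerical check is not a proof, but it can illustrate the property the question above asks about. The Python sketch below builds the Gram matrix of the radial basis function for a handful of random points and evaluates the quadratic form $c^\top K c$ for random vectors $c$. For a valid kernel this should never be negative, up to rounding error. The helper names, the point set, and the tolerance are all hypothetical choices made for this example.

```python
import math
import random


def rbf_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def gram_matrix(points, sigma=1.0):
    """Matrix K with K[i][j] = k(points[i], points[j])."""
    return [[rbf_kernel(p, q, sigma) for q in points] for p in points]


def quadratic_form(K, c):
    """Compute c^T K c for a square matrix K stored as nested lists."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(8)]
K = gram_matrix(points)

# Evaluate the quadratic form for many random coefficient vectors.
forms = [quadratic_form(K, [rng.gauss(0, 1) for _ in points]) for _ in range(200)]
print("smallest quadratic form:", min(forms))
```

An actual proof has to cover every finite point set and every coefficient vector, which random sampling like this cannot do.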
