Most Popular

1500 questions
42
votes
4 answers

(Why) do overfitted models tend to have large coefficients?

I imagine that the larger a coefficient on a variable is, the more ability the model has to "swing" in that dimension, providing an increased opportunity to fit noise. Although I think I've got a reasonable sense of the relationship between the…
David Marx
  • 7,127
42
votes
5 answers

How to translate the results from lm() to an equation?

We can use lm() to predict a value, but in some cases we still need the equation of the fitted model, for example to add the equation to a plot.
user27736
  • 429
42
votes
2 answers

How to find a good fit for semi-sinusoidal model in R?

I want to assume that the sea surface temperature of the Baltic Sea is the same year after year, and then describe that with a function / linear model. The idea I had was to just input year as a decimal number (or num_months/12) and get out what the…
GaRyu
  • 523
42
votes
3 answers

ANOVA on binomial data

I am analyzing an experimental data set. The data consists of a paired vector of treatment type and a binomial outcome:

Treatment   Outcome
A           1
B           0
C           0
D           1
A           0
...

In the outcome column, 1…
42
votes
3 answers

Why is it that my colleagues and I learned opposite definitions for test and validation sets?

In my master's program I learned that when building a ML model you:
1. train the model on the training set
2. compare the performance of this against the validation set
3. tweak the settings and repeat steps 1-2
4. when you are satisfied, compare the final…
Jacob Myer
  • 595
42
votes
6 answers

Inclusion of lagged dependent variable in regression

I'm very confused about whether it's legitimate to include a lagged dependent variable in a regression model. Basically I think if this model focuses on the relationship between the change in Y and other independent variables, then adding a lagged…
user22109
  • 451
42
votes
4 answers

Explanation of finite population correction factor?

I understand that when sampling from a finite population and our sample size is more than 5% of the population, we need to make a correction on the sample's mean and standard error using this formula: $FPC=\sqrt{\frac{N-n}{N-1}}$ where…
Sara
  • 1,487
42
votes
2 answers

Which statistical model is being used in the Pfizer study design for vaccine efficacy?

I know there's a similar question here: How to calculate 95% CI of vaccine with 90% efficacy? but it doesn't have an answer at the moment. Also, my question is different: the other question asks how to compute VE, using functions from an R package.…
DeltaIV
  • 17,954
42
votes
2 answers

Purpose of the link function in generalized linear model

What is the purpose of the link function as a component of the generalized linear model? Why do we need it? Wikipedia states: "It can be convenient to match the domain of the link function to the range of the distribution function's mean." What's the…
Chris
  • 1,339
42
votes
1 answer

Quantile regression: Which standard errors?

The summary.rq function from the quantreg vignette provides a multitude of choices for standard error estimates of quantile regression coefficients. What are the special scenarios where each of these becomes optimal/desirable? "rank" which produces…
Jase
  • 2,246
42
votes
2 answers

Which search range for determining SVM optimal C and gamma parameters?

I am using SVM for classification and I am trying to determine the optimal parameters for linear and RBF kernels. For the linear kernel I use cross-validated parameter selection to determine C and for the RBF kernel I use grid search to determine C…
Kywia
  • 421
42
votes
2 answers

Pooling vs. stride for downsampling

Pooling and stride both can be used to downsample the image. Let's say we have an image of 4x4, like below and a filter of 2x2. Then how do we decide whether to use (2x2 pooling) vs. (stride of 2)?
42
votes
6 answers

Why do we use loss functions to estimate a model instead of evaluation metrics like accuracy?

When building a learning algorithm we are looking to maximize a given evaluation metric (say accuracy), but the algorithm will try to optimize a different loss function during learning (say MSE/entropy). Why are the evaluation metrics not used as…
42
votes
15 answers

The Monty Hall Problem - where does our intuition fail us?

From Wikipedia: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another…
42
votes
3 answers

How to prove that the radial basis function is a kernel?

How to prove that the radial basis function $k(x, y) = \exp\left(-\frac{\|x-y\|^2}{2\sigma^2}\right)$ is a kernel? As far as I understand, in order to prove this we have to prove either of the following: For any set of vectors $x_1, x_2, \dots, x_n$ matrix…
Leo
  • 2,634