Most Popular

1500 questions
51
votes
5 answers

Why do we need multivariate regression (as opposed to a bunch of univariate regressions)?

I just browsed through this wonderful book: Applied Multivariate Statistical Analysis by Johnson and Wichern. The irony is, I am still not able to understand the motivation for using multivariate (regression) models instead of separate univariate…
KarthikS
  • 1,166
51
votes
3 answers

Are CDFs more fundamental than PDFs?

My stat prof basically said, if given one of the following three, you can find the other two: the cumulative distribution function, the moment generating function, and the probability density function. But my econometrics professor said CDFs are more fundamental…
51
votes
6 answers

Choosing variables to include in a multiple linear regression model

I am currently working to build a model using a multiple linear regression. After fiddling around with my model, I am unsure how to best determine which variables to keep and which to remove. My model started with 10 predictors for the DV. When…
cryptic_star
  • 1,147
51
votes
1 answer

What is the difference between Metropolis-Hastings, Gibbs, Importance, and Rejection sampling?

I have been trying to learn MCMC methods and have come across Metropolis-Hastings, Gibbs, Importance, and Rejection sampling. While some of these differences are obvious, i.e., how Gibbs is a special case of Metropolis-Hastings when we have the full…
user1398057
  • 2,365
51
votes
6 answers

What algorithm is used in linear regression?

I usually hear about "ordinary least squares". Is that the most widely used algorithm for linear regression? Are there reasons to use a different one?
Belmont
  • 1,373
51
votes
7 answers

Survival Analysis tools in Python

I am wondering if there are any packages for Python that are capable of performing survival analysis. I have been using the survival package in R but would like to port my work to Python.
MarkSAlen
  • 2,927
51
votes
1 answer

How to interpret and report eta squared / partial eta squared in statistically significant and non-significant analyses?

I have data that has eta squared values and partial eta squared values calculated as a measure of effect size for group mean differences. What is the difference between eta squared and partial eta squared? Can they both be interpreted using the…
51
votes
1 answer

Why is the sampling distribution of variance a chi-squared distribution?

The statement: "The sampling distribution of the sample variance is a chi-squared distribution with degrees of freedom equal to $n-1$, where $n$ is the sample size (given that the random variable of interest is normally distributed)." (Source) My…
Remi.b
  • 5,112
50
votes
4 answers

How to interpret coefficients from a polynomial model fit?

I'm trying to create a second order polynomial fit to some data I have. Let's say I plot this fit with ggplot(): ggplot(data, aes(foo, bar)) + geom_point() + geom_smooth(method="lm", formula=y~poly(x, 2)) I get: [plot not shown] So, a second order fit…
user13907
  • 697
50
votes
3 answers

AIC,BIC,CIC,DIC,EIC,FIC,GIC,HIC,IIC --- Can I use them interchangeably?

On p. 34 of his PRNN Brian Ripley comments that "The AIC was named by Akaike (1974) as 'An Information Criterion' although it seems commonly believed that the A stands for Akaike". Indeed, when introducing the AIC statistic, Akaike (1974, p.719)…
Hibernating
  • 3,943
50
votes
4 answers

How are kernels applied to feature maps to produce other feature maps?

I am trying to understand the convolution part of convolutional neural networks. Looking at the following figure [figure not shown], I have no problems understanding the first convolution layer where we have 4 different kernels (of size $k \times k$), which we…
utdiscant
  • 1,570
50
votes
8 answers

Simple examples of uncorrelated but not independent $X$ and $Y$

Any hard-working student is a counterexample to "all students are lazy". What are some simple counterexamples to "if random variables $X$ and $Y$ are uncorrelated then they are independent"?
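One standard counterexample of the kind this question asks for can be checked exactly. The sketch below assumes $X$ is uniform on $\{-1, 0, 1\}$ and $Y = X^2$; this choice of distribution and the helper name `expect` are illustrative assumptions, not taken from the question or its answers.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1}, and Y = X^2 (fully determined by X).
support = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X


def expect(f):
    """Exact expectation of f(X) under the assumed uniform distribution."""
    return sum(p * f(x) for x in support)


e_x = expect(lambda x: x)           # E[X]
e_y = expect(lambda x: x * x)       # E[Y] = E[X^2]
e_xy = expect(lambda x: x * x * x)  # E[XY] = E[X^3]
cov_xy = e_xy - e_x * e_y           # Cov(X, Y)

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = sum(p for x in support if x == 0 and x * x == 0)
p_x0 = sum(p for x in support if x == 0)
p_y0 = sum(p for x in support if x * x == 0)

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "vs P(X=0) * P(Y=0) =", p_x0 * p_y0)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so under these assumptions $X$ and $Y$ are uncorrelated but dependent.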
50
votes
1 answer

Negative values for AIC in General Mixed Model

I'm trying to select the best model by the AIC in the General Mixed Model test. The best model is the model with the lowest AIC, but all my AICs are negative! So is the biggest negative AIC the lowest value? Or is the smallest negative AIC the…
50
votes
3 answers

What is the relationship between the mean squared error and the residual sum of squares function?

Looking at the Wikipedia definitions of Mean Squared Error (MSE) and Residual Sum of Squares (RSS), it looks to me that $$\text{MSE} = \frac{1}{N} \text{RSS} = \frac{1}{N} \sum (f_i - y_i)^2$$ where $N$ is the number of samples and $f_i$ is our…
Josh
  • 4,448
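The relationship written in the excerpt above, $\text{MSE} = \frac{1}{N}\text{RSS}$, can be illustrated numerically. This is a minimal sketch that takes the excerpt's formula at face value; the values in `f` (predictions) and `y` (observations) are made up for illustration.

```python
# Hypothetical predictions f_i and observations y_i (illustrative values only).
f = [2.5, 0.0, 2.0, 8.0]
y = [3.0, -0.5, 2.0, 7.0]

n = len(y)

# Residual sum of squares: sum of squared differences (f_i - y_i)^2.
rss = sum((fi - yi) ** 2 for fi, yi in zip(f, y))

# Mean squared error as written in the excerpt: RSS divided by N.
mse = rss / n

print("RSS =", rss)
print("MSE =", mse)
```

Under this definition the two quantities differ only by the constant factor $1/N$, so minimizing one over the model parameters minimizes the other.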
50
votes
2 answers

Gamma vs. lognormal distributions

I have an experimentally observed distribution that looks very similar to a gamma or lognormal distribution. I've read that the lognormal distribution is the maximum entropy probability distribution for a random variate $X$ for which the mean and…
OSE
  • 1,217