Most Popular
1500 questions
143
votes
4 answers
Is it possible to have a pair of Gaussian random variables for which the joint distribution is not Gaussian?
Somebody asked me this question in a job interview and I replied that their joint distribution is always Gaussian. I thought that I could always write a bivariate Gaussian using their means, variances, and covariance. I am wondering if there can be a…
MarkSAlen
- 2,927
142
votes
9 answers
Is Facebook coming to an end?
Recently, this paper has received a lot of attention (e.g. from WSJ). Basically, the authors conclude that Facebook will lose 80% of its members by 2017.
They base their claims on an extrapolation of the SIR model, a compartmental model frequently…
LessFaceMoreBook
- 1,033
142
votes
8 answers
Why does the Cauchy distribution have no mean?
From the density function we could identify a mean (= 0) for the Cauchy distribution, just as the graph below shows. But why do we say the Cauchy distribution has no mean?
Flying pig
- 6,239
141
votes
9 answers
Obtaining knowledge from a random forest
Random forests are considered to be black boxes, but recently I was wondering what knowledge can be obtained from a random forest.
The most obvious thing is the importance of the variables; in the simplest variant it can be done just by calculating…
Tomek Tarczynski
- 4,024
140
votes
3 answers
What if residuals are normally distributed, but y is not?
I've got a weird question. Assume that you have a small sample where the dependent variable that you're going to analyze with a simple linear model is highly left-skewed. Thus you assume that $u$ is not normally distributed, because this would…
MarkDollar
- 5,955
140
votes
4 answers
What is the difference between convolutional neural networks, restricted Boltzmann machines, and auto-encoders?
Recently I have been reading about deep learning and I am confused about the terms (or, say, technologies). What is the difference between
Convolutional neural networks (CNN),
Restricted Boltzmann machines (RBM) and
Auto-encoders?
RockTheStar
- 12,907
139
votes
10 answers
Bias and variance in leave-one-out vs K-fold cross validation
How do different cross-validation methods compare in terms of model variance and bias?
My question is partly motivated by this thread: "Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?" The answer…
Amelio Vazquez-Reina
- 19,346
139
votes
3 answers
What is the difference between linear regression and logistic regression?
What is the difference between linear regression and logistic regression?
When would you use each?
B Seven
- 2,913
139
votes
6 answers
How is it possible that validation loss is increasing while validation accuracy is increasing as well?
I am training a simple neural network on the CIFAR10 dataset. After some time, validation loss started to increase while validation accuracy kept increasing as well. The test loss and test accuracy continue to improve.
How is this possible? It seems…
Konstantin Solomatov
- 1,635
139
votes
8 answers
How to choose between a t-test and a non-parametric test (e.g. Wilcoxon) in small samples
Certain hypotheses can be tested using Student's t-test (maybe using Welch's correction for unequal variances in the two-sample case), or by a non-parametric test like the Wilcoxon paired signed rank test, the Wilcoxon-Mann-Whitney U test, or the…
Silverfish
- 23,353
138
votes
8 answers
Is it necessary to scale the target value in addition to scaling features for regression analysis?
I'm building regression models. As a preprocessing step, I scale my feature values to have mean 0 and standard deviation 1. Is it also necessary to normalize the target values?
user2806363
- 2,723
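For readers unfamiliar with the preprocessing step mentioned in the question above, here is a minimal sketch of scaling one feature to mean 0 and standard deviation 1. The function name `standardize` and the sample values are hypothetical, chosen only for illustration; the question does not say which tool the asker used, and the sketch assumes the population standard deviation.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Scale a list of numbers to mean 0 and (population) std 1.

    Hypothetical helper for illustration only.
    """
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in column], mu, sigma
    return [(x - mu) / sigma for x in column], mu, sigma

# Made-up feature values.
feature = [1.0, 2.0, 3.0, 4.0]
scaled, mu, sigma = standardize(feature)
```

Returning `mu` and `sigma` alongside the scaled values lets the same shift and scale be reapplied to new data later. Whether the target should receive the same treatment is the open question the thread asks.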
137
votes
4 answers
Nested cross validation for model selection
How can one use nested cross validation for model selection?
From what I read online, nested CV works as follows:
There is the inner CV loop, where we may conduct a grid search (e.g. running K-fold for every available model, e.g. combination of…
Amelio Vazquez-Reina
- 19,346
136
votes
14 answers
What's wrong with XKCD's Frequentists vs. Bayesians comic?
This xkcd comic (Frequentists vs. Bayesians) makes fun of a frequentist statistician who derives an obviously wrong result.
However, it seems to me that his reasoning is actually correct in the sense that it follows the standard frequentist…
repied2
- 1,667
136
votes
7 answers
Is there an intuitive interpretation of $A^TA$ for a data matrix $A$?
For a given data matrix $A$ (with variables in columns and data points in rows), it seems like $A^TA$ plays an important role in statistics. For example, it is an important part of the analytical solution of ordinary least squares. Or, for PCA, its…
Alec
- 2,385
134
votes
5 answers
How does a Support Vector Machine (SVM) work?
How does a Support Vector Machine (SVM) work, and what differentiates it from other linear classifiers, such as the Linear Perceptron, Linear Discriminant Analysis, or Logistic Regression? *
(* I'm thinking in terms of the underlying motivations for…
tdc
- 7,569