Highest Voted Questions - Statistical Analysis Stack Exchange

143

votes

4 answers

Is it possible to have a pair of Gaussian random variables for which the joint distribution is not Gaussian?

Somebody asked me this question in a job interview and I replied that their joint distribution is always Gaussian. I thought that I could always write a bivariate Gaussian using their means, variances, and covariance. I am wondering if there can be a…

asked Jun 09 '12 at 22:31

MarkSAlen

2,927
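
A minimal sketch of one well-known construction relevant to the question above, not taken from the thread itself: assume $X \sim N(0,1)$ and $Y = SX$, where $S$ is an independent random sign. Both marginals are standard normal, yet $X + Y$ equals exactly 0 about half the time, which cannot happen for a jointly Gaussian pair. The function names `sample_pair` and `fraction_sum_zero` are hypothetical, chosen for illustration.

```python
import random


def sample_pair(rng):
    """Draw (X, Y) with X ~ N(0, 1) and Y = S * X for an independent random sign S."""
    x = rng.gauss(0.0, 1.0)
    s = rng.choice((-1.0, 1.0))  # assumed construction: fair coin for the sign
    return x, s * x


def fraction_sum_zero(n, seed=0):
    """Estimate P(X + Y == 0); any linear combination of a jointly Gaussian pair
    is Gaussian, so it could not have a point mass like this one."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n):
        x, y = sample_pair(rng)
        if x + y == 0.0:  # happens exactly when S == -1
            zeros += 1
    return zeros / n


if __name__ == "__main__":
    print(fraction_sum_zero(10_000))  # close to 0.5
```

By symmetry of the normal density, $Y$ has the same $N(0,1)$ distribution as $X$, so the point mass in $X + Y$ comes from the dependence between them rather than from either marginal.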

142

votes

9 answers

Is Facebook coming to an end?

Recently, this paper has received a lot of attention (e.g. from WSJ). Basically, the authors conclude that Facebook will lose 80% of its members by 2017. They base their claims on an extrapolation of the SIR model, a compartmental model frequently…

asked Jan 23 '14 at 16:58

LessFaceMoreBook

1,033

142

votes

8 answers

Why does the Cauchy distribution have no mean?

From the density function we could identify a mean (= 0) for the Cauchy distribution, just as the graph below shows. So why do we say the Cauchy distribution has no mean?

asked Sep 10 '12 at 15:28

Flying pig

6,239
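
A short worked calculation for the Cauchy question above, offered as a sketch: assume the standard Cauchy density $f(x) = \frac{1}{\pi(1+x^2)}$ (the excerpt does not state a location or scale). A mean exists only if $E|X|$ is finite, and here it is not:

$$E|X| = \int_{-\infty}^{\infty} \frac{|x|}{\pi(1+x^2)}\,dx = \frac{2}{\pi}\int_{0}^{\infty} \frac{x}{1+x^2}\,dx = \frac{1}{\pi}\lim_{b\to\infty}\ln(1+b^2) = \infty.$$

The value 0 appears only when the limits are taken symmetrically. With asymmetric limits the answer changes, for example

$$\int_{-a}^{2a} \frac{x}{\pi(1+x^2)}\,dx = \frac{1}{2\pi}\ln\frac{1+4a^2}{1+a^2} \;\longrightarrow\; \frac{\ln 2}{\pi} \quad (a\to\infty),$$

so the integral defining the mean has no single well-defined value.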

141

votes

9 answers

Obtaining knowledge from a random forest

Random forests are considered to be black boxes, but recently I have been wondering what knowledge can be obtained from a random forest. The most obvious thing is the importance of the variables; in the simplest variant it can be done just by calculating…

asked Jan 16 '12 at 11:09

Tomek Tarczynski

4,024

140

votes

3 answers

What if residuals are normally distributed, but y is not?

I've got a weird question. Assume that you have a small sample where the dependent variable that you're going to analyze with a simple linear model is highly left skewed. Thus you assume that $u$ is not normally distributed, because this would…

asked Jun 23 '11 at 06:00

MarkDollar

5,955

140

votes

4 answers

What is the difference between convolutional neural networks, restricted Boltzmann machines, and auto-encoders?

Recently I have been reading about deep learning and I am confused about the terms (or, say, technologies). What is the difference between convolutional neural networks (CNNs), restricted Boltzmann machines (RBMs), and auto-encoders?

asked Sep 04 '14 at 20:52

RockTheStar

12,907

139

votes

10 answers

Bias and variance in leave-one-out vs K-fold cross validation

How do different cross-validation methods compare in terms of model variance and bias? My question is partly motivated by this thread: "Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?" The answer…

asked Jun 14 '13 at 20:14

Amelio Vazquez-Reina

19,346

139

votes

3 answers

What is the difference between linear regression and logistic regression?

What is the difference between linear regression and logistic regression? When would you use each?

asked May 28 '12 at 18:17

B Seven

2,913

139

votes

6 answers

How is it possible that validation loss is increasing while validation accuracy is increasing as well?

I am training a simple neural network on the CIFAR10 dataset. After some time, validation loss started to increase while validation accuracy kept increasing as well. The test loss and test accuracy continue to improve. How is this possible? It seems…

asked May 28 '17 at 14:13

Konstantin Solomatov

1,635

139

votes

8 answers

How to choose between a t-test and a non-parametric test (e.g. Wilcoxon) in small samples

Certain hypotheses can be tested using Student's t-test (maybe using Welch's correction for unequal variances in the two-sample case), or by a non-parametric test like the Wilcoxon paired signed rank test, the Wilcoxon-Mann-Whitney U test, or the…

asked Oct 29 '14 at 03:02

Silverfish

23,353

138

votes

8 answers

Is it necessary to scale the target value in addition to scaling features for regression analysis?

I'm building regression models. As a preprocessing step, I scale my feature values to have mean 0 and standard deviation 1. Is it necessary to normalize the target values also?

asked Aug 11 '14 at 14:44

user2806363

2,723
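
The preprocessing step described in the question above (scaling a feature to mean 0 and standard deviation 1) can be sketched as follows. This is an illustrative sketch only: the helper name `standardize` and the sample data are hypothetical, and it assumes the population standard deviation, which the question does not specify.

```python
import statistics


def standardize(values):
    """Return values shifted to mean 0 and rescaled to standard deviation 1.

    Assumption: uses the population standard deviation (pstdev).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / std for v in values]


# Hypothetical feature column, for illustration only.
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(feature)

if __name__ == "__main__":
    print(scaled)
```

The same helper could be applied to a target column; whether doing so is necessary is exactly what the question asks, and the sketch takes no position on it.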

137

votes

4 answers

Nested cross validation for model selection

How can one use nested cross validation for model selection? From what I read online, nested CV works as follows: There is the inner CV loop, where we may conduct a grid search (e.g. running K-fold for every available model, e.g. combination of…

asked Jul 22 '13 at 15:53

Amelio Vazquez-Reina

19,346

136

votes

14 answers

What's wrong with XKCD's Frequentists vs. Bayesians comic?

This xkcd comic (Frequentists vs. Bayesians) makes fun of a frequentist statistician who derives an obviously wrong result. However it seems to me that his reasoning is actually correct in the sense that it follows the standard frequentist…

asked Nov 11 '12 at 15:56

repied2

1,667

136

votes

7 answers

Is there an intuitive interpretation of $A^TA$ for a data matrix $A$?

For a given data matrix $A$ (with variables in columns and data points in rows), it seems like $A^TA$ plays an important role in statistics. For example, it is an important part of the analytical solution of ordinary least squares. Or, for PCA, its…

asked Feb 09 '12 at 08:05

Alec

2,385

134

votes

5 answers

How does a Support Vector Machine (SVM) work?

How does a Support Vector Machine (SVM) work, and what differentiates it from other linear classifiers, such as the Linear Perceptron, Linear Discriminant Analysis, or Logistic Regression? * (* I'm thinking in terms of the underlying motivations for…

asked Feb 16 '12 at 13:25

tdc

7,569
