Highest Voted Questions - Statistical Analysis Stack Exchange

68

votes

1 answer

Why is the square root transformation recommended for count data?

It is often recommended to take the square root when you have count data. (For some examples on CV, see @HarveyMotulsky's answer here, or @whuber's answer here.) On the other hand, when fitting a generalized linear model with a response variable…

asked Dec 22 '12 at 03:11

gung - Reinstate Monica

145,122

68

votes

4 answers

Why is expectation the same as the arithmetic mean?

Today I came across a new topic called the Mathematical Expectation. The book I am following says that expectation is the arithmetic mean of a random variable coming from any probability distribution. But it defines expectation as the sum of product of…

asked Jun 13 '12 at 11:07

pranphy

961
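
A short worked equation may help connect the two definitions mentioned in the excerpt above. This is an illustrative sketch for a discrete random variable, not taken from the question or its answers; the symbols $x_i$, $p_i$ and $n$ are assumptions introduced here. With values $x_1, \dots, x_n$ taken with probabilities $p_1, \dots, p_n$, the expectation is the probability-weighted sum $$ E[X] = \sum_{i=1}^{n} x_i \, p_i $$ and in the special case where every value is equally likely, $p_i = 1/n$, it reduces to the ordinary arithmetic mean: $$ E[X] = \sum_{i=1}^{n} x_i \cdot \frac{1}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i $$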

68

votes

6 answers

What is the difference between estimation and prediction?

For example, I have historical loss data and I am calculating extreme quantiles (Value-at-Risk or Probable Maximum Loss). Are the results obtained for estimating the loss or for predicting it? Where can one draw the line? I am confused.

asked Oct 31 '11 at 21:00

melon

681

68

votes

4 answers

Won't highly-correlated variables in random forest distort accuracy and feature-selection?

In my understanding, highly correlated variables won't cause multicollinearity issues in a random forest model (please correct me if I'm wrong). On the other hand, if I have too many variables containing similar information, will the model…

asked Mar 13 '15 at 14:46

Yoki

929

68

votes

6 answers

Is the "hybrid" between Fisher and Neyman-Pearson approaches to statistical testing really an "incoherent mishmash"?

There exists a certain school of thought according to which the most widespread approach to statistical testing is a "hybrid" between two approaches: that of Fisher and that of Neyman-Pearson; these two approaches, the claim goes, are "incompatible"…

asked Aug 21 '14 at 12:54

amoeba

104,745

67

votes

4 answers

Comparing SVM and logistic regression

Can someone please give me some intuition as to when to choose either SVM or LR? I want to understand the difference between the optimization criteria the two use to learn the hyperplane, where the respective aims are…

asked Apr 26 '14 at 23:01

user41799

721

67

votes

2 answers

Why only three partitions? (training, validation, test)

When you are trying to fit models to a large dataset, the common advice is to partition the data into three parts: the training, validation, and test dataset. This is because the models usually have three "levels" of parameters: the first…

asked Apr 08 '11 at 14:45

charles.y.zheng

7,936

67

votes

6 answers

Efficient online linear regression

I'm analysing some data where I would like to perform ordinary linear regression. However, this is not possible, as I am dealing with an on-line setting with a continuous stream of input data (which will quickly get too large for memory) and need to…

asked Feb 05 '11 at 18:25

mikera

1,005
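
As an illustration of the streaming setting described in the question above, here is a minimal sketch that fits a one-predictor least-squares line from running sums, so no past observations need to be kept in memory. It is not taken from the question or its answers: the class name `OnlineSimpleRegression` and its methods are hypothetical, and it assumes a single predictor with plain (unweighted) least squares.

```python
class OnlineSimpleRegression:
    """Streaming least-squares fit of y = intercept + slope * x.

    Hypothetical helper for illustration; keeps only five running
    statistics, updated with a Welford-style recurrence.
    """

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.sxx = 0.0  # running sum of squared deviations of x
        self.sxy = 0.0  # running sum of cross-deviations of x and y

    def update(self, x, y):
        """Absorb one (x, y) observation in constant time and memory."""
        self.n += 1
        dx = x - self.mean_x                 # deviation from the old mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.sxx += dx * (x - self.mean_x)   # old deviation times new deviation
        self.sxy += dx * (y - self.mean_y)   # old x deviation times new y deviation

    def coefficients(self):
        """Return (slope, intercept) for the data seen so far."""
        if self.n < 2 or self.sxx == 0.0:
            raise ValueError("need at least two observations with distinct x values")
        slope = self.sxy / self.sxx
        intercept = self.mean_y - slope * self.mean_x
        return slope, intercept


if __name__ == "__main__":
    model = OnlineSimpleRegression()
    for x in range(5):
        model.update(float(x), 2.0 * x + 1.0)  # points on the line y = 2x + 1
    print(model.coefficients())
```

With several predictors, the same idea would need running cross-product matrices rather than scalar sums; that extension is not shown here.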

67

votes

14 answers

What is the most surprising characterization of the Gaussian (normal) distribution?

A standardized Gaussian distribution on $\mathbb{R}$ can be defined by giving explicitly its density: $$ \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$$ or its characteristic function. As recalled in this question it is also the only distribution for which the…

asked Nov 09 '10 at 20:19

robin girard

6,705

67

votes

5 answers

How does one interpret SVM feature weights?

I am trying to interpret the variable weights given by fitting a linear SVM. (I'm using scikit-learn):

    from sklearn import svm
    svm = svm.SVC(kernel='linear')
    svm.fit(features, labels)
    svm.coef_

I cannot find anything in the documentation that…

asked Oct 11 '12 at 20:48

Austin Richardson

1,008

67

votes

5 answers

What should I do when my neural network doesn't generalize well?

I'm training a neural network and the training loss decreases, but the validation loss doesn't, or it decreases much less than what I would expect, based on references or experiments with very similar architectures and data. How can I fix this? As…

asked Sep 07 '18 at 09:12

DeltaIV

17,954

67

votes

4 answers

Regression for an outcome (ratio or fraction) between 0 and 1

I am thinking of building a model predicting a ratio $a/b$, where $a \le b$ and $a > 0$ and $b > 0$. So, the ratio would be between $0$ and $1$. I could use linear regression, although it doesn't naturally limit to 0..1. I have no reason to believe…

asked May 23 '12 at 22:13

dfrankow

3,376

67

votes

10 answers

What is the difference between prediction and inference?

I'm reading through "An Introduction to Statistical Learning". In chapter 2, they discuss the reason for estimating a function $f$. 2.1.1 Why Estimate $f$? There are two main reasons we may wish to estimate $f$: prediction and inference. We discuss…

asked Nov 03 '16 at 14:47

user1592380

771

67

votes

2 answers

Are mean normalization and feature scaling needed for k-means clustering?

What are the best (recommended) pre-processing steps before performing k-means?

asked Jan 17 '12 at 09:55

pedrosaurio

1,353

67

votes

7 answers

Where did the frequentist-Bayesian debate go?

The world of statistics was divided between frequentists and Bayesians. These days it seems everyone does a bit of both. How can this be? If the different approaches are suitable for different problems, why did the founding fathers of statistics did…

asked Jan 03 '12 at 20:08

JohnRos

5,684
