Highest Voted Questions - Statistical Analysis Stack Exchange

85

votes

2 answers

What is global max pooling layer and what is its advantage over maxpooling layer?

Can somebody explain what a global max pooling layer is, and why and when we use it for training a neural network? Does it have any advantage over an ordinary max pooling layer?

asked Jan 20 '17 at 16:55

Eka

2,251

85

votes

2 answers

What is a "kernel" in plain English?

There are several distinct usages: kernel density estimation, kernel trick, kernel smoothing. Please explain what the "kernel" in them means, in plain English, in your own words.

asked Sep 09 '10 at 00:15

Neil McGuigan

9,872

85

votes

6 answers

Why is it that natural log changes are percentage changes? What is it about logs that makes this so?

Can somebody explain how the properties of logs make it so you can do log linear regressions where the coefficients are interpreted as percentage changes?

asked Nov 04 '16 at 15:07

thewhitetie

1,057

85

votes

6 answers

Variable selection for predictive modeling really needed in 2016?

This question was asked on CV some years ago; it seems worth a repost in light of 1) order-of-magnitude better computing technology (e.g. parallel computing, HPC, etc.) and 2) newer techniques, e.g. [3]. First, some context. Let's assume the goal…

asked May 28 '16 at 20:13

horaceT

3,352

85

votes

14 answers

When (if ever) is a frequentist approach substantively better than a Bayesian?

Background: I do not have any formal training in Bayesian statistics (though I am very interested in learning more), but I know enough--I think--to get the gist of why many feel as though they are preferable to Frequentist statistics. Even the…

asked Feb 04 '16 at 16:27

jsakaluk

5,514

84

votes

5 answers

What are good RMSE values?

Suppose I have some dataset. I perform some regression on it. I have a separate test dataset. I test the regression on this set and find the RMSE on the test data. How should I conclude that my learning algorithm has done well? I mean, what properties…

asked Apr 16 '13 at 21:03

Shishir Pandey

1,101

84

votes

1 answer

Help me understand Support Vector Machines

I understand the basics of what a Support Vector Machine's aim is in terms of classifying an input set into several different classes, but what I don't understand is some of the nitty-gritty details. For starters, I'm a bit confused by the use of…

asked Oct 24 '10 at 15:11

rohanbk

1,257

84

votes

10 answers

What are the major philosophical, methodological, and terminological differences between econometrics and other statistical fields?

Econometrics has substantial overlap with traditional statistics, but often uses its own jargon about a variety of topics ("identification," "exogenous," etc.). I once heard an applied statistics professor in another field comment that frequently…

asked Oct 14 '11 at 03:45

Ari B. Friedman

3,591

84

votes

9 answers

Probability of a single real-life future event: What does it mean when they say that "Hillary has a 75% chance of winning"?

As the election is a one-time event, it is not an experiment that can be repeated. So what exactly does the statement "Hillary has a 75% chance of winning" technically mean? I am seeking a statistically correct definition, not an intuitive or…

asked Jul 24 '16 at 13:59

pitosalas

963

84

votes

6 answers

Does no correlation imply no causality?

I know that correlation does not imply causality, but does an absence of correlation imply an absence of causality?

asked Jul 03 '16 at 14:54

user2088176

945

84

votes

6 answers

What is an intuitive explanation for how PCA turns from a geometric problem (with distances) to a linear algebra problem (with eigenvectors)?

I've read a lot about PCA, including various tutorials and questions (such as this one, this one, this one, and this one). The geometric problem that PCA is trying to optimize is clear to me: PCA tries to find the first principal component by…

asked Jun 08 '16 at 22:20

stackoverflowuser2010

3,550

83

votes

2 answers

XKCD's modified Bayes theorem: actually kinda reasonable?

I know this is from a comic famous for taking advantage of certain analytical tendencies, but it actually looks kind of reasonable after a few minutes of staring. Can anyone outline for me what this "modified Bayes theorem" is doing?

asked Oct 16 '18 at 00:05

eric_kernfeld

5,209

83

votes

6 answers

Choosing a clustering method

When using cluster analysis on a data set to group similar cases, one needs to choose among a large number of clustering methods and measures of distance. Sometimes, one choice might influence the other, but there are many possible combinations of…

asked Oct 18 '10 at 15:58

Brett

6,194

83

votes

28 answers

Examples for teaching: Correlation does not mean causation

There is an old saying: "Correlation does not mean causation". When I teach, I tend to use the following standard examples to illustrate this point: number of storks and birth rate in Denmark; number of priests in America and alcoholism; in the…

asked Jul 19 '10 at 19:31

csgillespie

13,029

83

votes

3 answers

Why do neural network researchers care about epochs?

An epoch in stochastic gradient descent is defined as a single pass through the data. For each SGD minibatch, $k$ samples are drawn, the gradient is computed, and the parameters are updated. In the epoch setting, the samples are drawn without…

asked Oct 24 '16 at 02:44

Sycorax

90,934
