Highest Voted Questions - Statistical Analysis Stack Exchange

133

votes

29 answers

Free statistical textbooks

Are there any free statistical textbooks available?

asked Jul 19 '10 at 23:29

csgillespie

13,029

132

votes

4 answers

What does a "closed-form solution" mean?

I have come across the term "closed-form solution" quite often. What does a closed-form solution mean? How does one determine if a closed-form solution exists for a given problem? Searching online, I found some information, but nothing in the context…

asked Sep 23 '13 at 23:31

arjsgh21

2,633

132

votes

14 answers

Maximum Likelihood Estimation (MLE) in layman terms

Could anyone explain maximum likelihood estimation (MLE) to me in detail, in layman's terms? I would like to know the underlying concept before going into the mathematical derivation or equations.

asked Aug 19 '14 at 12:46

StatsUser

1,809

131

votes

4 answers

When to use gamma GLMs?

The gamma distribution can take on a pretty wide range of shapes, and given the link between the mean and the variance through its two parameters, it seems suited to dealing with heteroskedasticity in non-negative data, in a way that log-transformed…

asked Aug 16 '13 at 08:13

generic_user

13,339

131

votes

5 answers

What are the main differences between K-means and K-nearest neighbours?

I know that k-means is unsupervised and is used for clustering, etc., and that k-NN is supervised. But I wanted to know the concrete differences between the two.

asked Apr 18 '13 at 17:15

nsc010

1,657

131

votes

4 answers

Softmax vs Sigmoid function in Logistic classifier?

What decides the choice of function (Softmax vs Sigmoid) in a Logistic classifier? Suppose there are 4 output classes. Each of the above functions gives the probabilities of each class being the correct output. So which one to take for a…

asked Sep 06 '16 at 15:46

mach

1,805

131

votes

4 answers

PCA and proportion of variance explained

In general, what is meant by saying that the fraction $x$ of the variance in an analysis like PCA is explained by the first principal component? Can someone explain this intuitively but also give a precise mathematical definition of what "variance…

asked Feb 10 '12 at 05:36

user9097

3,263

131

votes

4 answers

Differences between cross validation and bootstrapping to estimate the prediction error

I would like your thoughts about the differences between cross validation and bootstrapping to estimate the prediction error. Does one work better for small dataset sizes or large datasets?

asked Nov 14 '11 at 14:57

grant

1,531

131

votes

6 answers

How would you explain the difference between correlation and covariance?

Following up on the question "How would you explain covariance to someone who understands only the mean?", which addresses the issue of explaining covariance to a lay person, a similar question came to mind. How would one explain to a…

asked Nov 08 '11 at 16:52

pmgjones

5,773

131

votes

5 answers

Using k-fold cross-validation for time-series model selection

Question: I want to be sure of something: is the use of k-fold cross-validation with time series straightforward, or does one need to pay special attention before using it? Background: I'm modeling a time series of 6 years (with semi-Markov…

asked Aug 10 '11 at 17:20

Mickaël S

1,468

130

votes

9 answers

Numerical example to understand Expectation-Maximization

I am trying to get a good grasp on the EM algorithm, to be able to implement and use it. I spent a full day reading the theory and a paper where EM is used to track an aircraft using the position information coming from a radar. Honestly, I don't…

asked Oct 14 '13 at 22:37

arjsgh21

2,633

130

votes

6 answers

Difference between confidence intervals and prediction intervals

For a prediction interval in linear regression you still use $\hat{E}[Y|x] = \hat{\beta}_0+\hat{\beta}_1 x$ to generate the interval. You also use this to generate a confidence interval of $E[Y|x_0]$. What's the difference between the two?

asked Oct 04 '11 at 18:35

question

1,485

128

votes

22 answers

Most interesting statistical paradoxes

Because I find them fascinating, I'd like to hear what folks in this community find as the most interesting statistical paradox and why.

paradox

asked Feb 28 '12 at 04:08

Nick

3,537

128

votes

6 answers

What loss function for multi-class, multi-label classification tasks in neural networks?

I'm training a neural network to classify a set of objects into n-classes. Each object can belong to multiple classes at the same time (multi-class, multi-label). I read that for multi-class problems it is generally recommended to use softmax and…

asked Apr 17 '16 at 14:28

aKzenT

1,381

127

votes

5 answers

Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?

TL;DR: See title. Motivation: I am hoping for a canonical answer along the lines of "(1) No, (2) Not applicable, because (1)", which we can use to close many wrong questions about unbalanced datasets and oversampling. I would be quite as happy to be…

asked Jul 16 '18 at 21:22

Stephan Kolassa

123,354
