Most Popular

1500 questions
133
votes
29 answers

Free statistical textbooks

Are there any free statistical textbooks available?
csgillespie
  • 13,029
132
votes
4 answers

What does a "closed-form solution" mean?

I have come across the term "closed-form solution" quite often. What does a closed-form solution mean? How does one determine if a closed-form solution exists for a given problem? Searching online, I found some information, but nothing in the context…
arjsgh21
  • 2,633
132
votes
14 answers

Maximum Likelihood Estimation (MLE) in layman terms

Could anyone explain maximum likelihood estimation (MLE) in detail, in layman's terms? I would like to understand the underlying concept before going into the mathematical derivation or equations.
StatsUser
  • 1,809
131
votes
4 answers

When to use gamma GLMs?

The gamma distribution can take on a pretty wide range of shapes, and given the link between the mean and the variance through its two parameters, it seems suited to dealing with heteroskedasticity in non-negative data, in a way that log-transformed…
generic_user
  • 13,339
131
votes
5 answers

What are the main differences between K-means and K-nearest neighbours?

I know that k-means is unsupervised and is used for clustering, etc., and that k-NN is supervised. But what are the concrete differences between the two?
nsc010
  • 1,657
131
votes
4 answers

Softmax vs Sigmoid function in Logistic classifier?

What decides the choice of function (Softmax vs Sigmoid) in a Logistic classifier? Suppose there are 4 output classes. Each of the above functions gives the probabilities of each class being the correct output. So which one to take for a…
mach
  • 1,805
131
votes
4 answers

PCA and proportion of variance explained

In general, what is meant by saying that the fraction $x$ of the variance in an analysis like PCA is explained by the first principal component? Can someone explain this intuitively but also give a precise mathematical definition of what "variance…
user9097
  • 3,263
131
votes
4 answers

Differences between cross validation and bootstrapping to estimate the prediction error

I would like your thoughts about the differences between cross validation and bootstrapping to estimate the prediction error. Does one work better for small dataset sizes or large datasets?
grant
  • 1,531
131
votes
6 answers

How would you explain the difference between correlation and covariance?

This question, "How would you explain covariance to someone who understands only the mean?", which addresses the issue of explaining covariance to a lay person, brought up a similar question in my mind. How would one explain to a…
pmgjones
  • 5,773
131
votes
5 answers

Using k-fold cross-validation for time-series model selection

Question: I want to be sure of something: is the use of k-fold cross-validation with time series straightforward, or does one need to pay special attention before using it? Background: I'm modeling a time series of 6 years (with semi-Markov…
Mickaël S
  • 1,468
130
votes
9 answers

Numerical example to understand Expectation-Maximization

I am trying to get a good grasp on the EM algorithm, to be able to implement and use it. I spent a full day reading the theory and a paper where EM is used to track an aircraft using the position information coming from a radar. Honestly, I don't…
arjsgh21
  • 2,633
130
votes
6 answers

Difference between confidence intervals and prediction intervals

For a prediction interval in linear regression you still use $\hat{E}[Y|x] = \hat{\beta}_0+\hat{\beta}_1 x$ to generate the interval. You also use this to generate a confidence interval of $E[Y|x_0]$. What's the difference between the two?
question
  • 1,485
128
votes
22 answers

Most interesting statistical paradoxes

Because I find them fascinating, I'd like to hear what folks in this community find as the most interesting statistical paradox and why.
Nick
  • 3,537
128
votes
6 answers

What loss function for multi-class, multi-label classification tasks in neural networks?

I'm training a neural network to classify a set of objects into n classes. Each object can belong to multiple classes at the same time (multi-class, multi-label). I read that for multi-class problems it is generally recommended to use softmax and…
aKzenT
  • 1,381
127
votes
5 answers

Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?

TL;DR: See title. Motivation: I am hoping for a canonical answer along the lines of "(1) No, (2) Not applicable, because (1)", which we can use to close many wrong questions about unbalanced datasets and oversampling. I would be quite as happy to be…
Stephan Kolassa
  • 123,354