Most Popular
1500 questions
289
votes
13 answers
Is there any reason to prefer the AIC or BIC over the other?
The AIC and BIC are both methods of assessing model fit penalized for the number of estimated parameters. As I understand it, BIC penalizes models more for free parameters than does AIC. Beyond a preference based on the stringency of the criteria,…
russellpierce
- 18,599
289
votes
7 answers
What does AUC stand for and what is it?
I have searched high and low and have not been able to find out what AUC, as it relates to prediction, stands for or means.
josh
- 3,249
288
votes
6 answers
Is $R^2$ useful or dangerous?
I was skimming through some lecture notes by Cosma Shalizi (in particular, section 2.1.1 of the second lecture), and was reminded that you can get very low $R^2$ even when you have a completely linear model.
To paraphrase Shalizi's example: suppose…
raegtin
- 9,930
287
votes
3 answers
How to know that your machine learning problem is hopeless?
Imagine a standard machine-learning scenario:
You are confronted with a large multivariate dataset and you have a
pretty blurry understanding of it. What you need to do is to make
predictions about some variable based on what you have. As…
Tim
- 138,066
282
votes
153 answers
Famous statistical quotations
What is your favorite statistical quote?
This is community wiki, so please one quote per answer.
robin girard
- 6,705
282
votes
11 answers
How would you explain covariance to someone who understands only the mean?
...assuming that I'm able to augment their knowledge about variance in an intuitive fashion (Understanding "variance" intuitively) or by saying: It's the average distance of the data values from the 'mean' - and since variance is in square units,…
PhD
- 14,627
280
votes
2 answers
Interpretation of R's lm() output
The help pages in R assume I know what those numbers mean, but I don't.
I'm trying to really intuitively understand every number here. I will just post the output and comment on what I found out. There might (will) be mistakes, as I'll just write…
Alexander Engelhardt
- 4,291
277
votes
12 answers
How would you explain Markov Chain Monte Carlo (MCMC) to a layperson?
Maybe the concept, why it's used, and an example.
Neil McGuigan
- 9,872
265
votes
15 answers
What are the differences between Factor Analysis and Principal Component Analysis?
It seems that a number of the statistical packages that I use wrap these two concepts together. However, I'm wondering if there are different assumptions or data 'formalities' that must be true to use one over the other. A real example would be…
Brandon Bertelsen
- 7,232
260
votes
6 answers
ROC vs precision-and-recall curves
I understand the formal differences between them; what I want to know is when it is more relevant to use one vs. the other.
Do they always provide complementary insight about the performance of a given classification/detection system?
When is it…
Amelio Vazquez-Reina
- 19,346
256
votes
46 answers
What are common statistical sins?
I'm a grad student in psychology, and as I pursue more and more independent studies in statistics, I am increasingly amazed by the inadequacy of my formal training. Both personal and second-hand experience suggest that the paucity of statistical…
Mike Lawrence
- 13,793
254
votes
8 answers
Algorithms for automatic model selection
I would like to implement an algorithm for automatic model selection.
I am thinking of doing stepwise regression but anything will do (it has to be based on linear regressions though).
My problem is that I am unable to find a methodology, or an…
S4M
- 2,708
251
votes
9 answers
Why is Newton's method not widely used in machine learning?
This is something that has been bugging me for a while, and I couldn't find any satisfactory answers online, so here goes:
After reviewing a set of lectures on convex optimization, Newton's method seems to be a far superior algorithm to gradient…
Fei Yang
- 2,501
247
votes
38 answers
What is the best introductory Bayesian statistics textbook?
Which is the best introductory textbook for Bayesian statistics?
One book per answer, please.
Shane
- 12,461
245
votes
4 answers
How to interpret a QQ plot?
I am working with a small dataset (21 observations) and have the following normal QQ plot in R:
Seeing that the plot does not support normality, what could I infer about the underlying distribution? It seems to me that a distribution more skewed…
JohnK
- 20,366