Most Popular

1500 questions
115
votes
21 answers

What's a real-world example of "overfitting"?

I kind of understand what "overfitting" means, but I need help coming up with a real-world example of it.
115
votes
11 answers

"Best" series of colors to use for differentiating series in publication-quality plots

Has any study been done on the best set of colors to use for showing multiple series on the same plot? I've just been using the defaults in matplotlib, and they look a little childish since they're all bright, primary colors.
114
votes
4 answers

How does the correlation coefficient differ from regression slope?

I would have expected the correlation coefficient to be the same as a regression slope (beta); however, having just compared the two, I found that they are different. How do they differ, and what different information do they give?
luciano
  • 14,269
114
votes
2 answers

What is an embedding layer in a neural network?

In many neural network libraries, there are 'embedding layers', like in Keras or Lasagne. I am not sure I understand their function, despite reading the documentation. For example, in the Keras documentation it says: Turn positive integers (indexes)…
Francesco
  • 1,243
114
votes
7 answers

What is the difference between a multiclass and a multilabel problem?

What is the difference between a multiclass problem and a multilabel problem?
Learner
  • 4,457
113
votes
4 answers

What is the difference between Cross-entropy and KL divergence?

Both the cross-entropy and the KL divergence are tools to measure the distance between two probability distributions, but what is the difference between them? $$ H(P,Q) = -\sum_x P(x)\log Q(x) $$ $$ KL(P \| Q) = \sum_{x} P(x)\log {\frac{P(x)}{Q(x)}}…
maso
  • 1,359
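
A minimal numeric sketch of the two formulas quoted in the excerpt above, assuming discrete distributions given as plain lists of probabilities. The function names and the example distributions `p` and `q` are illustrative assumptions, not taken from the question. It checks the standard identity that the cross-entropy exceeds the entropy of P by exactly the KL divergence.

```python
import math

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) log Q(x); terms with P(x) = 0 contribute nothing."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    """KL(P || Q) = sum_x P(x) log(P(x) / Q(x)); terms with P(x) = 0 contribute nothing."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def entropy(p):
    """H(P) is the cross-entropy of P with itself."""
    return cross_entropy(p, p)

# Hypothetical example distributions over three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# The gap between cross-entropy and entropy should equal the KL divergence.
gap = cross_entropy(p, q) - entropy(p)
print(gap, kl_divergence(p, q))
```

Because H(P) does not depend on Q, minimizing the cross-entropy over Q is equivalent to minimizing the KL divergence over Q when P is held fixed.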
111
votes
5 answers

Diagnostic plots for count regression

What diagnostic plots (and perhaps formal tests) do you find most informative for regressions where the outcome is a count variable? I'm especially interested in Poisson and negative binomial models, as well as zero-inflated and hurdle counterparts…
half-pass
  • 3,740
111
votes
3 answers

Feature selection and cross-validation

I have recently been reading a lot on this site (@Aniko, @Dikran Marsupial, @Erik) and elsewhere about the problem of overfitting occurring with cross-validation (Smialowski et al. 2010, Bioinformatics; Hastie, Elements of Statistical Learning). The…
BGreene
  • 3,283
111
votes
6 answers

On the importance of the i.i.d. assumption in statistical learning

In statistical learning, implicitly or explicitly, one always assumes that the training set $\mathcal{D} = \{ \bf {X}, \bf{y} \}$ is composed of $N$ input/response tuples $({\bf{X}}_i,y_i)$ that are independently drawn from the same joint…
Quantuple
  • 1,546
111
votes
4 answers

Relationship between Poisson and exponential distribution

The waiting times for a Poisson distribution follow an exponential distribution with parameter lambda. But I don't understand it. Poisson models the number of arrivals per unit of time, for example. How is this related to the exponential distribution? Let's say…
user862
  • 2,749
110
votes
3 answers

Subscript notation in expectations

What is the exact meaning of the subscript notation $\mathbb{E}_X[f(X)]$ in conditional expectations in the framework of measure theory? These subscripts do not appear in the definition of conditional expectation, but we may see for example in this…
Emile
  • 3,460
110
votes
2 answers

What is covariance in plain language?

What is covariance in plain language and how is it linked to the terms dependence, correlation and variance-covariance structure with respect to repeated-measures designs?
abc
  • 1,811
110
votes
11 answers

What is the best way to remember the difference between sensitivity, specificity, precision, accuracy, and recall?

Despite having seen these terms 502847894789 times, I cannot for the life of me remember the difference between sensitivity, specificity, precision, accuracy, and recall. They're pretty simple concepts, but the names are highly unintuitive to me,…
Jessica
  • 2,091
109
votes
7 answers

Detecting a given face in a database of facial images

I'm working on a little project involving the faces of Twitter users via their profile pictures. A problem I've encountered is that after I filter out all but the images that are clear portrait photos, a small but significant percentage of Twitter…
ʞɔıu
  • 1,117
109
votes
10 answers

What is meant by a "random variable"?

What do they mean when they say "random variable"?
Baltimark
  • 2,268