Most Popular

1500 questions
170
votes
9 answers

Objective function, cost function, loss function: are they the same thing?

In machine learning, people talk about objective functions, cost functions, and loss functions. Are they just different names for the same thing? When should each be used? If they do not always refer to the same thing, what are the differences?
Bin
  • 1,829
170
votes
4 answers

Percentile vs quantile vs quartile

What is the difference between the following three terms: percentile, quantile, and quartile?
luciano
  • 14,269
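A minimal sketch of how the three terms relate, assuming Python's standard `statistics` module and a made-up sample (the integers 1 to 100, chosen only for illustration). Quantiles are cut points that split sorted data into equal-sized groups; quartiles are the special case of 4 groups and percentiles the special case of 100 groups. The `"inclusive"` interpolation method is one convention among several, so other software may return slightly different values.

```python
import statistics

# Hypothetical sample used only for illustration.
data = list(range(1, 101))

# Quartiles: the 3 cut points that split the data into 4 equal groups.
quartiles = statistics.quantiles(data, n=4, method="inclusive")

# Percentiles: the 99 cut points that split the data into 100 equal groups.
percentiles = statistics.quantiles(data, n=100, method="inclusive")

# The 25th, 50th and 75th percentiles coincide with the three quartiles.
matching_percentiles = [percentiles[24], percentiles[49], percentiles[74]]

print(quartiles)
print(matching_percentiles)
```

Both are produced by the same function with a different `n`, which is the sense in which "quantile" is the general term.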
170
votes
4 answers

When is R squared negative?

My understanding is that $R^2$ cannot be negative, as it is the square of R. However, I ran a simple linear regression in SPSS with a single independent variable and a dependent variable. My SPSS output gives me a negative value for $R^2$. If I was to…
Anne
  • 2,227
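One way a negative value can arise, sketched under the assumption that $R^2$ is computed as $1 - SS_{res}/SS_{tot}$ rather than as a squared correlation. With that definition, any set of predictions that fits worse than simply predicting the mean gives a negative result. The data and the helper `r_squared` below are hypothetical, and this is not a claim about what SPSS did in the asker's case.

```python
def r_squared(y, y_pred):
    """Coefficient of determination as 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_pred))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot

# Hypothetical observations and three sets of predictions.
y = [1.0, 2.0, 3.0, 4.0]
good_fit = [1.1, 1.9, 3.2, 3.8]   # close to the data
mean_fit = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean
bad_fit = [4.0, 3.0, 2.0, 1.0]    # worse than predicting the mean

print(r_squared(y, good_fit))
print(r_squared(y, mean_fit))
print(r_squared(y, bad_fit))
```

An ordinary least-squares fit with an intercept, evaluated on its own training data, cannot do worse than the mean, so negative values typically point to a model without an intercept, an adjusted $R^2$, or evaluation on different data.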
169
votes
21 answers

Does Julia have any hope of sticking in the statistical community?

I recently read a post from R-Bloggers that linked to this blog post from John Myles White about a new language called Julia. Julia takes advantage of a just-in-time compiler that gives it wicked fast run times and puts it on the same order of…
169
votes
3 answers

Gradient Boosting Tree vs Random Forest

Gradient tree boosting as proposed by Friedman uses decision trees as base learners. I'm wondering if we should make the base decision tree as complex as possible (fully grown) or simpler? Is there any explanation for the choice? Random Forest is…
FihopZz
  • 2,013
167
votes
34 answers

The Sleeping Beauty Paradox

The situation: Some researchers would like to put you to sleep. Depending on the secret toss of a fair coin, they will briefly awaken you either once (Heads) or twice (Tails). After each waking, they will put you back to sleep with a drug that…
whuber
  • 322,774
165
votes
5 answers

What's the difference between principal component analysis and multidimensional scaling?

How are PCA and classical MDS different? How about MDS versus nonmetric MDS? Is there a time when you would prefer one over the other? How do the interpretations differ?
165
votes
5 answers

Why are p-values uniformly distributed under the null hypothesis?

Recently, I have found in a paper by Klammer, et al. a statement that p-values should be uniformly distributed. I believe the authors, but cannot understand why it is so. Klammer, A. A., Park, C. Y., and Stafford Noble, W. (2009) Statistical…
golobor
  • 1,683
162
votes
9 answers

Does causation imply correlation?

Correlation does not imply causation, as there could be many explanations for the correlation. But does causation imply correlation? Intuitively, I would think that the presence of causation means there is necessarily some correlation. But my…
Matthew
  • 1,723
161
votes
6 answers

Where should I place dropout layers in a neural network?

Are there any general guidelines on where to place dropout layers in a neural network?
Franck Dernoncourt
  • 46,817
160
votes
5 answers

How to choose between Pearson and Spearman correlation?

How do I know when to choose between Spearman's $\rho$ and Pearson's $r$? My variable includes satisfaction and the scores were interpreted using the sum of the scores. However, these scores could also be ranked.
user3636
157
votes
2 answers

KL divergence between two univariate Gaussians

I need to determine the KL-divergence between two Gaussians. I am comparing my results to these, but I can't reproduce their result. My result is obviously wrong, because the KL is not 0 for KL(p, p). I wonder where I am making a mistake and ask if…
bayerj
  • 13,773
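For reference, a short sketch of the standard closed form for two univariate Gaussians, $KL(p \,\|\, q) = \log(\sigma_2/\sigma_1) + \frac{\sigma_1^2 + (\mu_1 - \mu_2)^2}{2\sigma_2^2} - \frac{1}{2}$, with $p = N(\mu_1, \sigma_1^2)$ and $q = N(\mu_2, \sigma_2^2)$. The function name `kl_gaussians` and the parameter values are illustrative assumptions, not taken from the question. The sanity check mentioned in the excerpt, that KL(p, p) should be 0, is included.

```python
import math

def kl_gaussians(mu1, sigma1, mu2, sigma2):
    """KL(p || q) for p = N(mu1, sigma1^2) and q = N(mu2, sigma2^2).

    sigma1 and sigma2 are standard deviations and must be positive.
    """
    return (
        math.log(sigma2 / sigma1)
        + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
        - 0.5
    )

# Sanity check: the divergence of a distribution from itself is zero.
kl_same = kl_gaussians(0.0, 2.0, 0.0, 2.0)

# Unit-variance Gaussians whose means differ by 1.
kl_shifted = kl_gaussians(0.0, 1.0, 1.0, 1.0)

# KL is not symmetric: swapping the arguments changes the value in general.
kl_forward = kl_gaussians(0.0, 1.0, 0.0, 2.0)
kl_reverse = kl_gaussians(0.0, 2.0, 0.0, 1.0)

print(kl_same, kl_shifted, kl_forward, kl_reverse)
```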
156
votes
5 answers

Why normalize images by subtracting dataset's image mean, instead of the current image mean in deep learning?

There are some variations on how to normalize the images, but most seem to use these two methods: (1) Subtract the mean per channel calculated over all images (e.g. VGG_ILSVRC_16_layers); (2) Subtract by pixel/channel calculated over all images (e.g. CNN_S,…
Max Gordon
  • 5,926
155
votes
8 answers

Pearson's or Spearman's correlation with non-normal data

I get this question frequently enough in my statistics consulting work that I thought I'd post it here. I have an answer, which is posted below, but I was keen to hear what others have to say. Question: If you have two variables that are not…
Jeromy Anglim
  • 44,984
155
votes
25 answers

R vs SAS, why is SAS preferred by private companies?

I learned R but it seems that companies are much more interested in SAS experience. What are the advantages of SAS over R?