Most Popular
1500 questions
390 votes, 12 answers
Difference between logit and probit models
What is the difference between the logit and probit models?
I'm more interested here in knowing when to use logistic regression and when to use probit.
If there is any literature which defines it using R, that would be helpful as well.
Beta
386 votes, 5 answers
What is the trade-off between batch size and number of iterations to train a neural network?
When training a neural network, what difference does it make to set:
batch size to $a$ and number of iterations to $b$
vs. batch size to $c$ and number of iterations to $d$
where $ ab = cd $?
To put it otherwise, assuming that we train the neural…
Franck Dernoncourt
382 votes, 84 answers
What is your favorite "data analysis" cartoon?
Data analysis cartoons can be useful for many reasons: they help communicate; they show that quantitative people have a sense of humor too; they can instigate good teaching moments; and they can help us remember important principles and…
Shane
378 votes, 26 answers
Python as a statistics workbench
Lots of people use a main tool like Excel or another spreadsheet, SPSS, Stata, or R for their statistics needs. They might turn to some specific package for very special needs, but a lot of things can be done with a simple spreadsheet or a general…
Fabian Fagerholm
376 votes, 15 answers
Is normality testing 'essentially useless'?
A former colleague once argued to me as follows:
We usually apply normality tests to the results of processes that,
under the null, generate random variables that are only
asymptotically or nearly normal (with the 'asymptotically' part…
shabbychef
361 votes, 9 answers
What should I do when my neural network doesn't learn?
I'm training a neural network but the training loss doesn't decrease. How can I fix this?
I'm not asking about overfitting or regularization. I'm asking about how to solve the problem where my network's performance doesn't improve on the training…
Sycorax
355 votes, 8 answers
Why is Euclidean distance not a good metric in high dimensions?
I read that 'Euclidean distance is not a good distance in high dimensions'. I guess this statement has something to do with the curse of dimensionality, but what exactly? Besides, what is 'high dimensions'? I have been applying hierarchical…
teaLeef
341 votes, 13 answers
How to understand degrees of freedom?
From Wikipedia, there are three interpretations of the degrees of freedom of a statistic:
In statistics, the number of degrees of freedom is the number of
values in the final calculation of a statistic that are free to vary.
Estimates of…
Tim
323 votes, 10 answers
What's the difference between a confidence interval and a credible interval?
Joris and Srikant's exchange here got me wondering (again) if my internal explanations for the difference between confidence intervals and credible intervals were the correct ones. How would you explain the difference?
Matt Parker
304 votes, 15 answers
Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean?
It seems that through various related questions here, there is consensus that the "95%" part of what we call a "95% confidence interval" refers to the fact that if we were to exactly replicate our sampling and CI-computation procedures many times,…
Mike Lawrence
302 votes, 6 answers
What is batch size in neural network?
I'm using the Python Keras package for neural networks. This is the link. Is batch_size equal to the number of test samples? From Wikipedia we have this information:
However, in other cases, evaluating the sum-gradient may require
expensive evaluations…
user2991243
301 votes, 8 answers
Bagging, boosting and stacking in machine learning
What are the similarities and differences between these 3 methods:
Bagging,
Boosting,
Stacking?
Which is the best one? And why?
Can you give me an example for each?
Bucsa Lucian
295 votes, 15 answers
What is the meaning of p values and t values in statistical tests?
After taking a statistics course and then trying to help fellow students, I noticed one subject that inspires much head-desk banging is interpreting the results of statistical hypothesis tests. It seems that students easily learn how to perform the…
Sharpie
292 votes, 8 answers
How to choose a predictive model after k-fold cross-validation?
I am wondering how to choose a predictive model after doing K-fold cross-validation.
This may be awkwardly phrased, so let me explain in more detail: whenever I run K-fold cross-validation, I use K subsets of the training data, and end up with K…
Berk U.
292 votes, 11 answers
What exactly are keys, queries, and values in attention mechanisms?
How should one understand the keys, queries, and values that are often mentioned in attention mechanisms?
I've tried searching online, but all the resources I find only speak of them as if the reader already knows what they are.
Judging by the paper…
Sean