Most Popular
1500 questions
99
votes
9 answers
How to compute precision/recall for multiclass-multilabel classification?
I'm wondering how to calculate precision and recall measures for multiclass multilabel classification, i.e. classification where there are more than two labels, and where each instance can have multiple labels?
Vam
- 1,325
99
votes
15 answers
What do you call an average that does not include outliers?
What do you call an average that does not include outliers?
For example if you have a set:
{90,89,92,91,5} avg = 73.4
but excluding the outlier (5) we have
{90,89,92,91(,5)} avg = 90.5
How do you describe this average in statistics?
Tawani
- 1,093
98
votes
10 answers
Is there a minimum sample size required for the t-test to be valid?
I'm currently working on a quasi-experimental research paper. My sample size is only 15, because the population in the chosen area is small and only 15 fit my criteria. Is 15 the minimum sample size for computing a t-test and an F-test? If so, where…
98
votes
11 answers
How to obtain the p-value (check significance) of an effect in a lme4 mixed model?
I use lme4 in R to fit the mixed model
lmer(value~status+(1|experiment))
where value is continuous, status and experiment are factors, and I get
Linear mixed model fit by REML
Formula: value ~ status + (1 | experiment)
AIC BIC logLik…
ECII
- 2,171
98
votes
9 answers
How should outliers be dealt with in linear regression analysis?
Oftentimes a statistical analyst is handed a dataset and asked to fit a model using a technique such as linear regression. Very frequently the dataset is accompanied by a disclaimer similar to "Oh yeah, we messed up collecting some of these…
Sharpie
- 4,374
98
votes
3 answers
Can someone explain Gibbs sampling in very simple words?
I'm doing some reading on topic modeling (with Latent Dirichlet Allocation), which makes use of Gibbs sampling. As a newbie in statistics (well, I know things like binomials, multinomials, priors, etc.), I find it difficult to grasp how Gibbs sampling…
Thea
- 983
97
votes
2 answers
When to use regularization methods for regression?
In what circumstances should one consider using regularization methods (ridge, lasso or least angle regression) instead of OLS?
In case this helps steer the discussion, my main interest is improving predictive accuracy.
NPE
- 5,581
97
votes
7 answers
Why optimize the log probability instead of the probability?
In most machine learning tasks where you can formulate some probability $p$ which should be maximised, we would actually optimize the log probability $\log p$ instead of the probability for some parameters $\theta$. E.g. in maximum likelihood…
Albert
- 1,245
97
votes
12 answers
Who Are The Bayesians?
As one becomes interested in statistics, the dichotomy "Frequentist" vs. "Bayesian" soon becomes commonplace (and who hasn't read Nate Silver's The Signal and the Noise, anyway?). In talks and introductory courses, the point of view is…
Antoni Parellada
- 26,280
96
votes
6 answers
Why does k-means clustering algorithm use only Euclidean distance metric?
Is there a specific reason, in terms of efficiency or functionality, why the k-means algorithm does not use, for example, cosine (dis)similarity as a distance metric, but can only use the Euclidean norm? In general, will the K-means method comply and be…
curious
- 1,101
96
votes
3 answers
What is "restricted maximum likelihood" and when should it be used?
I have read in the abstract of this paper that:
"The maximum likelihood (ML) procedure of Hartley and Rao is modified by adapting a transformation from Patterson and Thompson which partitions the likelihood under normality into two parts, one…
Joe King
- 3,805
96
votes
5 answers
When to use Fisher versus Neyman-Pearson framework?
I've been reading a lot lately about the differences between Fisher's method of hypothesis testing and the Neyman-Pearson school of thought.
My question is, ignoring philosophical objections, when should we use Fisher's approach to data testing and…
Stijn
- 1,880
96
votes
2 answers
Given the power of computers these days, is there ever a reason to do a chi-squared test rather than Fisher's exact test?
Given that software can do the Fisher's exact test calculation so easily nowadays, is there any circumstance where, theoretically or practically, the chi-squared test is actually preferable to Fisher's exact test?
Advantages of the Fisher's exact…
pmgjones
- 5,773
96
votes
1 answer
When to use an offset in a Poisson regression?
Does anybody know why an offset is used in a Poisson regression? What do you achieve by this?
MarkDollar
- 5,955
96
votes
5 answers
How to interpret an inverse covariance or precision matrix?
I was wondering whether anyone could point me to some references that discuss the interpretation of the elements of the inverse covariance matrix, also known as the concentration matrix or the precision matrix.
I have access to Cox and Wermuth's…
Vinh Nguyen
- 1,131