Highest Voted Questions - Statistical Analysis Stack Exchange

99

votes

9 answers

How to compute precision/recall for multiclass-multilabel classification?

I'm wondering how to calculate precision and recall measures for multiclass multilabel classification, i.e. classification where there are more than two labels, and where each instance can have multiple labels?

asked Jan 23 '12 at 12:54

Vam

1,325

99

votes

15 answers

What do you call an average that does not include outliers?

What do you call an average that does not include outliers? For example, if you have a set {90,89,92,91,5}, avg = 73.4, but excluding the outlier (5) we have {90,89,92,91}, avg = 90.5. How do you describe this average in statistics?

asked Feb 02 '09 at 14:21

Tawani

1,093
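
The arithmetic in the question above can be checked with a short sketch. The helper name `mean_without_extremes` and its parameters are hypothetical, and dropping a fixed number of the smallest values is only one possible reading of "excluding outliers", not the asker's stated method.

```python
from statistics import mean

# The set from the question; 5 is the value treated as an outlier.
values = [90, 89, 92, 91, 5]


def mean_without_extremes(data, n_drop_low=0, n_drop_high=0):
    """Mean after dropping the n smallest and n largest values (hypothetical helper)."""
    ordered = sorted(data)
    kept = ordered[n_drop_low:len(ordered) - n_drop_high]
    if not kept:
        raise ValueError("no values left after dropping extremes")
    return mean(kept)


full_mean = mean(values)                                # mean of all five values
trimmed = mean_without_extremes(values, n_drop_low=1)   # drop only the smallest value

print(full_mean, trimmed)
```

Dropping a fixed count from each end of the sorted data is the idea behind a trimmed (truncated) mean; whether that matches the asker's intent is an assumption.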

98

votes

10 answers

Is there a minimum sample size required for the t-test to be valid?

I'm currently working on a quasi-experimental research paper. I only have a sample size of 15 because the population in the chosen area is small and only 15 fit my criteria. Is 15 the minimum sample size for computing a t-test and F-test? If so, where…

asked Sep 25 '12 at 23:42

Czarina Francoise

981

98

votes

11 answers

How to obtain the p-value (check significance) of an effect in a lme4 mixed model?

I use lme4 in R to fit the mixed model lmer(value~status+(1|experiment)) where value is continuous, status and experiment are factors, and I get Linear mixed model fit by REML Formula: value ~ status + (1 | experiment) AIC BIC logLik…

asked Feb 16 '12 at 19:02

ECII

2,171

98

votes

9 answers

How should outliers be dealt with in linear regression analysis?

Oftentimes a statistical analyst is handed a dataset and asked to fit a model using a technique such as linear regression. Very frequently the dataset is accompanied by a disclaimer similar to "Oh yeah, we messed up collecting some of these…

asked Jul 19 '10 at 23:39

Sharpie

4,374

98

votes

3 answers

Can someone explain Gibbs sampling in very simple words?

I'm doing some reading on topic modeling (with Latent Dirichlet Allocation) which makes use of Gibbs sampling. As a newbie in statistics (well, I know things like binomials, multinomials, priors, etc.), I find it difficult to grasp how Gibbs sampling…

asked May 01 '11 at 19:37

Thea

983

97

votes

2 answers

When to use regularization methods for regression?

In what circumstances should one consider using regularization methods (ridge, lasso or least angle regression) instead of OLS? In case this helps steer the discussion, my main interest is improving predictive accuracy.

asked Nov 06 '10 at 17:53

NPE

5,581

97

votes

7 answers

Why optimize the max log probability instead of the probability?

In most machine learning tasks where you can formulate some probability $p$ which should be maximised, we would actually optimize the log probability $\log p$ instead of the probability for some parameters $\theta$. E.g. in maximum likelihood…

asked Sep 28 '15 at 08:37

Albert

1,245
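
One commonly cited numerical reason for working with $\log p$ can be illustrated with a small sketch. The per-observation probabilities below are made-up values, and the example shows only floating-point underflow, not the full set of reasons discussed in the answers.

```python
import math

# Hypothetical per-observation likelihoods: 200 independent terms of 0.01 each.
probs = [0.01] * 200

# Multiplying them directly gives 1e-400, which is below the smallest
# representable double, so the product underflows to exactly 0.0.
product = math.prod(probs)

# Summing the logs keeps the same information in a representable range.
log_sum = sum(math.log(p) for p in probs)

print(product)   # underflowed product
print(log_sum)   # log-likelihood, equal to 200 * log(0.01)
```

Because the logarithm is monotonically increasing, the parameters $\theta$ that maximise $\log p$ also maximise $p$, so the optimum is unchanged.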

97

votes

12 answers

Who Are The Bayesians?

As one becomes interested in statistics, the dichotomy "Frequentist" vs. "Bayesian" soon becomes commonplace (and who hasn't read Nate Silver's The Signal and the Noise, anyway?). In talks and introductory courses, the point of view is…

asked Aug 13 '15 at 18:11

Antoni Parellada

26,280

96

votes

6 answers

Why does k-means clustering algorithm use only Euclidean distance metric?

Is there a specific reason, in terms of efficiency or functionality, why the k-means algorithm does not use, for example, cosine (dis)similarity as a distance metric, but can only use the Euclidean norm? In general, will K-means method comply and be…

asked Jan 07 '14 at 11:53

curious

1,101

96

votes

3 answers

What is "restricted maximum likelihood" and when should it be used?

I have read in the abstract of this paper that: "The maximum likelihood (ML) procedure of Hartley and Rao is modified by adapting a transformation from Patterson and Thompson which partitions the likelihood under normality into two parts, one…

asked Jan 28 '13 at 08:05

Joe King

3,805

96

votes

5 answers

When to use Fisher versus Neyman-Pearson framework?

I've been reading a lot lately about the differences between Fisher's method of hypothesis testing and the Neyman-Pearson school of thought. My question is, ignoring philosophical objections, when should we use Fisher's approach to data testing and…

asked Feb 20 '12 at 11:02

Stijn

1,880

96

votes

2 answers

Given the power of computers these days, is there ever a reason to do a chi-squared test rather than Fisher's exact test?

Given that software can do the Fisher's exact test calculation so easily nowadays, is there any circumstance where, theoretically or practically, the chi-squared test is actually preferable to Fisher's exact test? Advantages of the Fisher's exact…

asked Aug 13 '11 at 18:55

pmgjones

5,773

96

votes

1 answer

When to use an offset in a Poisson regression?

Does anybody know why an offset is used in a Poisson regression? What do you achieve by this?

asked May 24 '11 at 08:12

MarkDollar

5,955

96

votes

5 answers

How to interpret an inverse covariance or precision matrix?

I was wondering whether anyone could point me to some references that discuss the interpretation of the elements of the inverse covariance matrix, also known as the concentration matrix or the precision matrix. I have access to Cox and Wermuth's…

asked May 14 '11 at 01:13

Vinh Nguyen

1,131
