Most Popular

1500 questions
64
votes
5 answers

How to calculate pseudo-$R^2$ from R's logistic regression?

Christopher Manning's writeup on logistic regression in R shows a logistic regression in R as follows: ced.logr <- glm(ced.del ~ cat + follows + factor(class), family=binomial) Some output: > summary(ced.logr) Call: glm(formula = ced.del ~ cat +…
dfrankow
  • 3,376
64
votes
13 answers

Mean absolute deviation vs. standard deviation

In the textbook "New Comprehensive Mathematics for O Level" by Greer (1983), I see averaged deviation calculated like this: Sum up absolute differences between single values and the mean. Then get its average. Throughout the chapter the term mean…
itsols
  • 809
64
votes
3 answers

Clustering with K-Means and EM: how are they related?

I have studied algorithms for clustering data (unsupervised learning): EM and k-means. I keep reading the following: k-means is a variant of EM, with the assumption that clusters are spherical. Can somebody explain the above sentence? I do…
Myna
  • 793
64
votes
3 answers

What is the difference between posterior and posterior predictive distribution?

I understand what a posterior is, but I'm not sure what the latter means. How are the two different? Kevin P. Murphy indicated in his textbook, Machine Learning: A Probabilistic Perspective, that it is "an internal belief state". What does that really…
A.D
  • 2,494
64
votes
2 answers

Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?

Computing power considerations aside, are there any reasons to believe that increasing the number of folds in cross-validation leads to better model selection/validation (i.e. that the higher the number of folds the better)? Taking the argument to…
64
votes
7 answers

Which permutation test implementation in R to use instead of t-tests (paired and non-paired)?

I have data from an experiment that I analyzed using t-tests. The dependent variable is interval scaled and the data are either unpaired (i.e., 2 groups) or paired (i.e., within-subjects). E.g. (within subjects): x1 <- c(99, 99.5, 65, 100, 99,…
Henrik
  • 14,198
64
votes
4 answers

How are regression, the t-test, and the ANOVA all versions of the general linear model?

How are they all versions of the same basic statistical method?
Amahabirsingh
  • 731
64
votes
8 answers

Does it ever make sense to treat categorical data as continuous?

In answering this question on discrete and continuous data I glibly asserted that it rarely makes sense to treat categorical data as continuous. On the face of it that seems self-evident, but intuition is often a poor guide for statistics, or at…
walkytalky
  • 1,898
64
votes
47 answers

Most famous statisticians

Who are the most important statisticians, and what is it that made them famous? (Reply with just one scientist per answer, please.)
64
votes
1 answer

Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?

I'm trying to predict a binary outcome using 50 continuous explanatory variables (the range of most of the variables is $-\infty$ to $\infty$). My data set has almost 24,000 rows. When I run glm in R, I get: Warning messages: 1: glm.fit: algorithm…
Dcook
  • 773
64
votes
11 answers

Examples of Bayesian and frequentist approach giving different answers

Note: I am aware of philosophical differences between Bayesian and frequentist statistics. For example "what is the probability that the coin on the table is heads" doesn't make sense in frequentist statistics, since it has either already landed…
user541686
  • 1,185
64
votes
4 answers

Are all values within a 95% confidence interval equally likely?

I have found discordant information on the question: "If one constructs a 95% confidence interval (CI) of a difference in means or a difference in proportions, are all values within the CI equally likely? Or, is the point estimate the most likely,…
pmgjones
  • 5,773
64
votes
8 answers

Are Bayesians slaves of the likelihood function?

In his book "All of Statistics", Prof. Larry Wasserman presents the following Example (11.10, page 188). Suppose that we have a density $f$ such that $f(x)=c\,g(x)$, where $g$ is a known (nonnegative, integrable) function, and the normalization…
Zen
  • 24,121
64
votes
4 answers

Does the optimal number of trees in a random forest depend on the number of predictors?

Can someone explain why we need a large number of trees in a random forest when the number of predictors is large? How can we determine the optimal number of trees?
Z Khan
  • 643
64
votes
13 answers

Two-tailed tests... I'm just not convinced. What's the point?

The following excerpt is from the entry, What are the differences between one-tailed and two-tailed tests?, on UCLA's statistics help site. ... consider the consequences of missing an effect in the other direction. Imagine you have developed a new…