Most Popular
1500 questions
64
votes
5 answers
How to calculate pseudo-$R^2$ from R's logistic regression?
Christopher Manning's writeup on logistic regression in R shows a logistic regression in R as follows:
ced.logr <- glm(ced.del ~ cat + follows + factor(class),
family=binomial)
Some output:
> summary(ced.logr)
Call:
glm(formula = ced.del ~ cat +…
dfrankow
- 3,376
64
votes
13 answers
Mean absolute deviation vs. standard deviation
In the textbook "New Comprehensive Mathematics for O Level" by Greer (1983), I see averaged deviation calculated like this:
Sum up absolute differences between single values and the mean. Then
get its average. Throughout the chapter the term mean…
itsols
- 809
64
votes
3 answers
Clustering with K-Means and EM: how are they related?
I have studied algorithms for clustering data (unsupervised learning): EM, and k-means.
I keep reading the following:
k-means is a variant of EM, with the assumptions that clusters are
spherical.
Can somebody explain the above sentence? I do…
Myna
- 793
64
votes
3 answers
What is the difference between posterior and posterior predictive distribution?
I understand what a Posterior is, but I'm not sure what the latter means?
How are the 2 different?
Kevin P Murphy indicated in his textbook, Machine Learning: a Probabilistic Perspective, that it is "an internal belief state". What does that really…
A.D
- 2,494
64
votes
2 answers
Optimal number of folds in $K$-fold cross-validation: is leave-one-out CV always the best choice?
Computing power considerations aside, are there any reasons to believe that increasing the number of folds in cross-validation leads to better model selection/validation (i.e. that the higher the number of folds the better)?
Taking the argument to…
Amelio Vazquez-Reina
- 19,346
64
votes
7 answers
Which permutation test implementation in R to use instead of t-tests (paired and non-paired)?
I have data from an experiment that I analyzed using t-tests. The dependent variable is interval scaled and the data are either unpaired (i.e., 2 groups) or paired (i.e., within-subjects).
E.g. (within subjects):
x1 <- c(99, 99.5, 65, 100, 99,…
Henrik
- 14,198
64
votes
4 answers
How are regression, the t-test, and the ANOVA all versions of the general linear model?
How are they all versions of the same basic statistical method?
Amahabirsingh
- 731
64
votes
8 answers
Does it ever make sense to treat categorical data as continuous?
In answering this question on discrete and continuous data I glibly asserted that it rarely makes sense to treat categorical data as continuous.
On the face of it that seems self-evident, but intuition is often a poor guide for statistics, or at…
walkytalky
- 1,898
64
votes
47 answers
Most famous statisticians
What are the most important statisticians, and what is it that made them famous?
(Reply just one scientist per answer please.)
mariana soffer
- 1,101
64
votes
1 answer
Logistic regression in R resulted in perfect separation (Hauck-Donner phenomenon). Now what?
I'm trying to predict a binary outcome using 50 continuous explanatory variables (the range of most of the variables is $-\infty$ to $\infty$). My data set has almost 24,000 rows. When I run glm in R, I get:
Warning messages:
1: glm.fit: algorithm…
Dcook
- 773
64
votes
11 answers
Examples of Bayesian and frequentist approach giving different answers
Note: I am aware of philosophical differences between Bayesian and frequentist statistics.
For example "what is the probability that the coin on the table is heads" doesn't make sense in frequentist statistics, since it has either already landed…
user541686
- 1,185
64
votes
4 answers
Are all values within a 95% confidence interval equally likely?
I have found discordant information on the question: "If one constructs a 95% confidence interval (CI) of a difference in means or a difference in proportions, are all values within the CI equally likely? Or, is the point estimate the most likely,…
pmgjones
- 5,773
64
votes
8 answers
Are Bayesians slaves of the likelihood function?
In his book "All of Statistics", Prof. Larry Wasserman presents the following Example (11.10, page 188). Suppose that we have a density $f$ such that $f(x)=c\,g(x)$, where $g$ is a known (nonnegative, integrable) function, and the normalization…
Zen
- 24,121
64
votes
4 answers
Does the optimal number of trees in a random forest depend on the number of predictors?
Can someone explain why we need a large number of trees in random forest when the number of predictors is large? How can we determine the optimal number of trees?
Z Khan
- 643
64
votes
13 answers
Two-tailed tests... I'm just not convinced. What's the point?
The following excerpt is from the entry, What are the differences between one-tailed and two-tailed tests?, on UCLA's statistics help site.
... consider the consequences of missing an effect in the other direction. Imagine you have developed a new…
FromTheAshes
- 773