Highest Voted Questions - Statistical Analysis Stack Exchange

41

votes

12 answers

Open Source statistical textbooks?

There have been a few questions about statistical textbooks, such as the question Free statistical textbooks. However, I am looking for textbooks that are Open Source, for example, having a Creative Commons license. The reason is that in course…

asked Jul 25 '10 at 14:53

Egon Willighagen

176

41

votes

6 answers

If we know A is independent of B, why isn't P(A|B,C) = P(A|C) necessarily true?

Let's say we know that A is independent of B, or mathematically: $$P(A|B) = P(A)$$ Then how come we can't say the following is necessarily true: $$P(A|B,C) = P(A|C)$$ If the outcome of B doesn't have an effect on the outcome of $A$, then why would…

asked Oct 08 '22 at 03:41

Nova

699
5
6
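A minimal counterexample sketch for the question above, under an assumed setup that is not from the original post: A and B are independent fair coin flips and C = A XOR B. The helper `prob` and all variable names are hypothetical. Enumerating the four equally likely outcomes shows that B tells us nothing about A on its own, yet B and C together determine A.

```python
from itertools import product

# Assumed setup (illustrative only): A, B are independent fair coins, C = A XOR B.
# Each tuple is (a, b, c); all four outcomes have probability 1/4.
outcomes = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

def prob(event, given=lambda o: True):
    """P(event | given) under the uniform distribution over `outcomes`."""
    conditioned = [o for o in outcomes if given(o)]
    return sum(1 for o in conditioned if event(o)) / len(conditioned)

p_a = prob(lambda o: o[0] == 1)                                  # P(A=1)
p_a_given_b = prob(lambda o: o[0] == 1, lambda o: o[1] == 1)     # P(A=1 | B=1)
p_a_given_c = prob(lambda o: o[0] == 1, lambda o: o[2] == 1)     # P(A=1 | C=1)
p_a_given_bc = prob(lambda o: o[0] == 1,
                    lambda o: o[1] == 1 and o[2] == 1)           # P(A=1 | B=1, C=1)

print(p_a, p_a_given_b, p_a_given_c, p_a_given_bc)
```

Here P(A|B) equals P(A), so A and B are independent, but P(A|B,C) differs from P(A|C), because conditioning on C couples A and B.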

41

votes

5 answers

Why is the exponential family so important in statistics?

Why is the exponential family so important in statistics? I was recently reading about the exponential family within statistics. As far as I understand, the exponential family refers to any probability distribution function that can be written in…

asked Nov 17 '21 at 01:23

stats_noob

1
3
32
105

41

votes

3 answers

What is the Wine/Water Paradox in Bayesian statistics, and what is its resolution?

I have just heard about the Wine/Water Paradox in Bayesian statistics, but didn't understand it very well (see Mikkelson 2004 for an introduction). Can you explain in simple terms what the paradox is (and why is it a paradox), why it matters for…

asked Mar 18 '21 at 22:14

user314217

41

votes

3 answers

Can AIC compare across different types of model?

I'm using AIC (Akaike's Information Criterion) to compare non-linear models in R. Is it valid to compare the AICs of different types of model? Specifically, I'm comparing a model fitted by glm versus a model with a random effect term fitted by glmer…

asked Nov 29 '10 at 16:08

Thomas K

513

41

votes

4 answers

Why are log probabilities useful?

Probabilities of a random variable's observations are in the range $[0,1]$, whereas log probabilities transform them to the log scale. What then is the corresponding range of log probabilities, i.e. what does a probability of 0 become, and is it the…

asked Aug 20 '20 at 14:18

develarist

3,917
1
21
52
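One commonly given reason for working with log probabilities is numerical stability. The sketch below assumes a hypothetical list of 1000 per-observation probabilities of 0.01 each (not data from the question) and compares multiplying them directly with summing their logs.

```python
import math

# Hypothetical data: 1000 independent observations, each with probability 0.01.
probs = [0.01] * 1000

# Multiplying directly: the true value 1e-2000 is far below the smallest
# positive double, so the running product underflows to exactly 0.0.
naive_product = 1.0
for p in probs:
    naive_product *= p

# Summing logs keeps the same information as an ordinary finite number.
log_sum = sum(math.log(p) for p in probs)

# Probabilities in (0, 1] map to log probabilities in (-inf, 0].
log_of_one = math.log(1.0)

print(naive_product, log_sum, log_of_one)
```

Note that `math.log(0.0)` raises `ValueError` in Python; a probability of exactly 0 corresponds to a log probability of negative infinity in the limit.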

41

votes

9 answers

What is the difference between an estimator and a statistic?

I learned that a statistic is an attribute you can obtain from samples. Taking many samples of the same size, calculating this attribute for all of them and plotting the pdf, we get the distribution of the corresponding attribute or the distribution of…

asked Jan 15 '13 at 02:30

gutto

479

41

votes

6 answers

Least-angle regression vs. lasso

Least-angle regression and the lasso tend to produce very similar regularization paths (identical except when a coefficient crosses zero.) They both can be efficiently fit by virtually identical algorithms. Is there ever any practical reason to…

asked Nov 18 '10 at 07:28

NPE

5,581
6
37
45

41

votes

4 answers

Why squared residuals instead of absolute residuals in OLS estimation?

Why are we using the squared residuals instead of the absolute residuals in OLS estimation? My idea was that we use the square of the error values, so that residuals below the fitted line (which are then negative), would still have to be able to be…

asked Dec 16 '12 at 12:17

PascalVKooten

2,369

41

votes

3 answers

Kernel logistic regression vs SVM

As is well known, SVM can use the kernel method to project data points into higher-dimensional spaces so that the points can be separated by a linear boundary. But we can also use logistic regression to choose this boundary in the kernel space, so what are the advantages…

asked Nov 20 '12 at 02:31

FindBoat

811

41

votes

11 answers

Are there any good popular science books about statistics or machine learning?

There are a bunch of really good popular science books around, that deal with real science, as well as the history and reasons behind current theories, while remaining extremely enjoyable to read. For example, "Chaos" by James Gleick (chaos, fractals,…

asked Nov 02 '12 at 04:12

naught101

5,453

41

votes

5 answers

How to Handle Many Time Series Simultaneously?

I have a data set including the demand of several products (1200 products) for 25 periods and I need to predict the demand of each product for the next period. At first, I wanted to use ARIMA and train a model for each product, but because of the…

asked Jun 10 '19 at 07:44

Amin

673
1
7
13

41

votes

3 answers

Does statistical independence mean lack of causation?

Two random variables A and B are statistically independent. That means that in the DAG of the process: $(A {\perp\!\!\!\perp} B)$ and of course $P(A|B)=P(A)$. But does that also mean that there's no front-door from B to A? Because then we should get…

asked Jul 15 '18 at 14:39

user1834069

633

41

votes

3 answers

What exactly is a seed in a random number generator?

I tried some usual google search etc. but most of the answers I find are either somewhat ambiguous or language/library specific such as Python or C++ stdlib.h etc. I am looking for a language agnostic, mathematical answer, not the specifics of a…

random-generation

asked Jul 04 '18 at 03:50

Della

533
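As a language-agnostic illustration of the seed question above, here is a minimal sketch of a linear congruential generator. It assumes the recurrence x_{k+1} = (a x_k + c) mod m with one commonly cited set of constants; the function name `lcg` and its parameters are hypothetical, and real libraries generally use other algorithms. The seed is simply the initial state x_0, so the whole output sequence is a deterministic function of it.

```python
def lcg(seed, n, a=1664525, c=1013904223, m=2**32):
    """Return n pseudo-random floats in [0, 1) from a linear congruential generator.

    The seed is the initial internal state; every later value follows from it
    through the recurrence state = (a * state + c) mod m.
    """
    state = seed
    values = []
    for _ in range(n):
        state = (a * state + c) % m   # deterministic state update
        values.append(state / m)      # scale the integer state to [0, 1)
    return values

same_seed_1 = lcg(42, 5)
same_seed_2 = lcg(42, 5)
other_seed = lcg(43, 5)

print(same_seed_1 == same_seed_2)  # identical seeds reproduce the same sequence
print(same_seed_1 == other_seed)   # a different seed starts a different sequence
```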

41

votes

3 answers

Brain-teaser: What is the expected length of an iid sequence that is monotonically increasing when drawn from a uniform [0,1] distribution?

This is an interview question for a quantitative analyst position, reported here. Suppose we are drawing from a uniform $[0,1]$ distribution and the draws are iid, what is the expected length of a monotonically increasing sequence? I.e., we…

asked Jun 12 '18 at 00:42

Amazonian

1,534
