Most Popular

1500 questions
53
votes
8 answers

When conducting a t-test why would one prefer to assume (or test for) equal variances rather than always use a Welch approximation of the df?

It seems that when the assumption of homogeneity of variance is met, the results from a Welch-adjusted t-test and a standard t-test are approximately the same. Why not simply always use the Welch-adjusted t?
russellpierce
  • 18,599
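
As a rough illustration of the comparison raised in this question, here is a minimal sketch that computes the pooled-variance (Student) and Welch t statistics with their degrees of freedom. The function names and the two small samples are made up for this example, and only the statistics and df are computed (no p-values).

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch t statistic with the Welch-Satterthwaite df approximation."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical samples with equal sizes and similar spread.
a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
b = [4.6, 5.0, 4.4, 5.2, 4.9, 4.7]

t_student, df_student = student_t(a, b)
t_welch, df_welch = welch_t(a, b)
print(t_student, df_student)
print(t_welch, df_welch)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; the Welch df never exceeds the pooled df.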
53
votes
7 answers

Why would someone use a Bayesian approach with a 'noninformative' improper prior instead of the classical approach?

If the interest is merely in estimating the parameters of a model (pointwise and/or interval estimation) and the prior information is unreliable or weak (I know this is a bit vague, but I am trying to establish a scenario where the choice of a prior…
user10525
53
votes
2 answers

How do you do bootstrapping with time series data?

I recently learned about using bootstrapping techniques to calculate standard errors and confidence intervals for estimators. What I learned was that if the data is IID, you can treat the sample data as the population, and do sampling with…
statnub
  • 841
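
For reference, the IID procedure described in this excerpt (treat the sample as the population and resample with replacement) might look like the sketch below. The function name, the data, and the number of replicates are assumptions for illustration. Because it resamples individual observations, it ignores any serial dependence, which is the difficulty the question raises for time series.

```python
import random
from statistics import mean, stdev

def bootstrap_se(data, stat=mean, n_boot=2000, seed=0):
    """Bootstrap standard error of `stat`, assuming IID observations."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Each replicate resamples n observations with replacement.
    replicates = [stat(rng.choices(data, k=n)) for _ in range(n_boot)]
    return stdev(replicates)

# Hypothetical sample.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
se = bootstrap_se(data)
print(se)
```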
53
votes
6 answers

What book is recommendable to start learning statistics using R at the same time?

Books to learn statistics using R. What exactly is the book I'm looking for? I am looking for a book that teaches statistics while using R for hands-on experience, so that you end up learning R at the same time. I've seen on Amazon…
53
votes
5 answers

What references should be cited to support using 30 as a large enough sample size?

I have read/heard many times that a sample size of at least 30 units is considered a "large sample" (the normality assumption for means usually holds approximately due to the CLT, ...). Therefore, in my experiments, I usually generate samples of 30…
Lan
  • 1,409
53
votes
7 answers

Can a deep neural network approximate multiplication function?

Let's say we want to do regression for a simple f = x * y using a standard fully-connected deep neural network. The network takes x and y as input, and should learn to output x * y. I remember that there is research proving that an NN with one hidden layer…
Brans Ds
  • 1,478
53
votes
16 answers

Most confusing statistical terms

We statisticians use many words in ways that are slightly different from the way everyone else uses them. This causes lots of problems when we teach or explain what we are doing. I'll start a list (and now I'll add some definitions, per…
Peter Flom
  • 119,535
53
votes
5 answers

How to make a time series stationary?

Besides taking differences, what are other techniques for making a non-stationary time series stationary? Ordinarily one refers to a series as "integrated of order p" if it can be made stationary through a lag operator $(1-L)^p X_t$.
Shane
  • 12,461
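
To make the operator $(1-L)^p$ concrete, here is a small sketch that applies first differences p times. The function name and the toy series are assumptions for illustration; the series is a deterministic quadratic trend, chosen only to show the mechanics of the operator, not a stochastic integrated process.

```python
def difference(series, order=1):
    """Apply the difference operator (1 - L) `order` times."""
    out = list(series)
    for _ in range(order):
        # (1 - L) x_t = x_t - x_{t-1}; each pass shortens the series by one.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

# Toy series x_t = t^2 for t = 0..7.
x = [t ** 2 for t in range(8)]
d1 = difference(x, order=1)  # still trending: 1, 3, 5, ...
d2 = difference(x, order=2)  # constant after two differences
print(d1)
print(d2)
```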
53
votes
5 answers

How does rectilinear activation function solve the vanishing gradient problem in neural networks?

I found the rectified linear unit (ReLU) praised in several places as a solution to the vanishing gradient problem for neural networks. That is, one uses max(0,x) as the activation function. When the activation is positive, it is obvious that this is better…
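
A simplified sketch of the contrast this question refers to: the derivative of max(0,x) is 1 for positive inputs, whereas the sigmoid derivative is at most 0.25, so a product of such factors across layers shrinks quickly. The depth of 10 and the function names are assumptions, and weights are ignored entirely, so this shows only the activation-derivative factor of backpropagation.

```python
import math

def sigmoid_grad(x):
    """Derivative of the logistic sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of max(0, x): 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

depth = 10  # hypothetical number of layers
# Best case for the sigmoid: every unit sits at its maximum slope.
sigmoid_chain = sigmoid_grad(0.0) ** depth
# ReLU with every unit active: the factor stays at 1.
relu_chain = relu_grad(1.0) ** depth
print(sigmoid_chain, relu_chain)
```

Note that `relu_grad` returns 0 for negative inputs, which is the case the question goes on to ask about.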
53
votes
9 answers

Are all models useless? Is any exact model possible -- or useful?

This question has been festering in my mind for over a month. The February 2015 issue of Amstat News contains an article by Berkeley Professor Mark van der Laan that scolds people for using inexact models. He states that by using models, statistics…
Russ Lenth
  • 20,271
53
votes
4 answers

Why does inversion of a covariance matrix yield partial correlations between random variables?

I heard that partial correlations between random variables can be found by inverting the covariance matrix and taking the appropriate cells from the resulting precision matrix (this fact is mentioned in http://en.wikipedia.org/wiki/Partial_correlation,…
michal
  • 1,288
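
The fact mentioned in this question can be checked numerically. The sketch below inverts a 3×3 correlation matrix and compares $-P_{ij}/\sqrt{P_{ii}P_{jj}}$ with the standard first-order partial correlation formula. The helper names and the three correlation values are assumptions for illustration, and the small Gauss-Jordan inverse is only meant for tiny, well-conditioned matrices.

```python
import math

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    # Augment with the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_corr(cov, i, j):
    """Partial correlation of variables i and j given all the others."""
    prec = invert(cov)  # precision matrix
    return -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])

# Hypothetical pairwise correlations among three variables.
r12, r13, r23 = 0.6, 0.5, 0.4
corr = [[1.0, r12, r13],
        [r12, 1.0, r23],
        [r13, r23, 1.0]]

from_precision = partial_corr(corr, 0, 1)
from_formula = (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
print(from_precision, from_formula)
```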
53
votes
6 answers

Is a time series the same as a stochastic process?

A stochastic process is a process that evolves over time, so is it really a fancier way of saying "time series"?
Victor
  • 6,565
53
votes
3 answers

Data APIs/feeds available as packages in R

EDIT: The Web Technologies and Services CRAN task view contains a much more comprehensive list of data sources and APIs available in R. You can submit a pull request on github if you wish to add a package to the task view. I'm making a list of the…
Zach
  • 23,766
53
votes
8 answers

How can I test if given samples are taken from a Poisson distribution?

I know of normality tests, but how do I test for "Poisson-ness"? I have a sample of ~1000 non-negative integers, which I suspect are taken from a Poisson distribution, and I would like to test that.
David B
  • 1,321
53
votes
6 answers

Eliciting priors from experts

How should I elicit prior distributions from experts when fitting a Bayesian model?
csgillespie
  • 13,029