Highest Voted Questions - Statistical Analysis Stack Exchange

36

votes

1 answer

"Frequency" value for seconds/minutes intervals data in R

I'm using R (3.1.1) and ARIMA models for forecasting. I would like to know what the "frequency" parameter, which is assigned in the ts() function, should be if I'm using time series data which is: separated by minutes and is spread over 180 days…

asked Oct 20 '14 at 19:18

Apython

655

36

votes

3 answers

Generating data with a given sample covariance matrix

Given a covariance matrix $\boldsymbol \Sigma_s$, how to generate data such that it would have the sample covariance matrix $\hat{\boldsymbol \Sigma} = \boldsymbol \Sigma_s$? More generally: we are often interested in generating data from a density…

asked Oct 15 '14 at 17:35

Kees Mulder

1,674

36

votes

6 answers

How can I analytically prove that randomly dividing an amount results in an exponential distribution (of e.g. income and wealth)?

In this current article in SCIENCE the following is being proposed: Suppose you randomly divide 500 million in income among 10,000 people. There's only one way to give everyone an equal, 50,000 share. So if you're doling out earnings randomly,…

asked Sep 03 '14 at 07:12

vonjd

6,146

36

votes

8 answers

In Naive Bayes, why bother with Laplace smoothing when we have unknown words in the test set?

I was reading over Naive Bayes Classification today. I read, under the heading of Parameter Estimation with add 1 smoothing: Let $c$ refer to a class (such as Positive or Negative), and let $w$ refer to a token or word. The maximum likelihood…

asked Jul 22 '14 at 04:29

tumultous_rooster

1,195

36

votes

1 answer

Relation between variational Bayes and EM

I read somewhere that the Variational Bayes method is a generalization of the EM algorithm. Indeed, the iterative parts of the algorithms are very similar. In order to test whether the EM algorithm is a special version of Variational Bayes, I tried…

asked Jul 03 '14 at 12:44

Ufuk Can Bicici

2,298

36

votes

3 answers

Asymptotic distribution of sample variance of non-normal sample

This is a more general treatment of the issue posed by this question. After deriving the asymptotic distribution of the sample variance, we can apply the Delta method to arrive at the corresponding distribution for the standard deviation. Let a…

asked Jul 01 '14 at 00:38

Alecos Papadopoulos

58,953

35

votes

6 answers

Difference between Bayes network, neural network, decision tree and Petri nets

What is the difference between neural networks, Bayesian networks, decision trees and Petri nets, given that they are all graphical models and visually depict cause-effect relationships?

asked Apr 21 '14 at 04:16

Ria George

1,465

35

votes

7 answers

Birthday paradox with a (huge) twist: Probability of sharing exact same date of birth with partner?

I share the same birthdate as my boyfriend: same date but also same year; our births are separated by merely 5 hours or so. I know that the chances of meeting someone who was born on the same date as me are fairly high and I know a few people with…

asked Mar 11 '14 at 17:26

curious

517

35

votes

4 answers

Intuitive reasoning behind biased maximum likelihood estimators

I am confused about biased maximum likelihood (ML) estimators. The mathematics of the whole concept is pretty clear to me but I cannot figure out the intuitive reasoning behind it. Given a certain dataset which has samples from a distribution,…

asked Mar 04 '14 at 22:51

ssah

451

35

votes

4 answers

Can ANOVA be significant when none of the pairwise t-tests is?

Is it possible for one-way (with $N>2$ groups, or "levels") ANOVA to report a significant difference when none of the $N(N-1)/2$ pairwise t-tests does? In this answer @whuber wrote: It is well known that a global ANOVA F test can detect a…

asked Jan 22 '14 at 16:25

amoeba

104,745

35

votes

6 answers

Algorithm to dynamically monitor quantiles

I want to estimate the quantiles of some data. The data are so large that they cannot fit in memory, and they are not static: new data keep coming. Does anyone know any algorithm to monitor the quantiles of the data observed so far…

asked Mar 07 '11 at 15:53

sinoTrinity

481

35

votes

4 answers

Why are Jeffreys priors considered noninformative?

Consider a Jeffreys prior where $p(\theta) \propto \sqrt{|i(\theta)|}$, where $i$ is the Fisher information. I keep seeing this prior being mentioned as an uninformative prior, but I never saw an argument for why it is uninformative. After all, it is not…

asked Feb 22 '11 at 23:01

bayesian

869

35

votes

4 answers

Why not report the mean of a bootstrap distribution?

When one bootstraps a parameter to get the standard error, one gets a distribution of the parameter. Why don't we use the mean of that distribution as a result or estimate for the parameter we are trying to get? Shouldn't the distribution approximate…

asked Sep 28 '13 at 22:32

Guillermo Perez

451

35

votes

4 answers

Generating random variables from a mixture of Normal distributions

How can I sample from a mixture distribution, and in particular a mixture of Normal distributions in R? For example, if I wanted to sample from: $$ 0.3\!\times\mathcal{N}(0,1)\; + \;0.5\!\times\mathcal{N}(10,1)\; +…

asked Sep 24 '13 at 01:09

user30490

35

votes

5 answers

Data "exploration" vs data "snooping"/"torturing"?

Many times I have come across informal warnings against "data snooping" (here's one amusing example), and I think I have an intuitive idea of roughly what that means, and why it may be a problem. On the other hand, "exploratory data analysis" seems…

asked Sep 16 '13 at 15:36

kjo

1,967
