Most Popular
1500 questions
36
votes
1 answer
"Frequency" value for seconds/minutes intervals data in R
I'm using R (3.1.1) and ARIMA models for forecasting.
I would like to know what the "frequency" parameter, which is assigned in the ts() function, should be if I'm using time series data which is:
separated by minutes and is spread over 180 days…
Apython
- 655
36
votes
3 answers
Generating data with a given sample covariance matrix
Given a covariance matrix $\boldsymbol \Sigma_s$, how to generate data such that it would have the sample covariance matrix $\hat{\boldsymbol \Sigma} = \boldsymbol \Sigma_s$?
More generally: we are often interested in generating data from a density…
Kees Mulder
- 1,674
36
votes
6 answers
How can I analytically prove that randomly dividing an amount results in an exponential distribution (of e.g. income and wealth)?
In this current article in SCIENCE the following is being proposed:
Suppose you randomly divide 500 million in income among 10,000
people. There's only one way to give everyone an equal, 50,000 share.
So if you're doling out earnings randomly,…
vonjd
- 6,146
36
votes
8 answers
In Naive Bayes, why bother with Laplace smoothing when we have unknown words in the test set?
I was reading over Naive Bayes Classification today. I read, under the heading of Parameter Estimation with add 1 smoothing:
Let $c$ refer to a class (such as Positive or Negative), and let $w$ refer to a token or word.
The maximum likelihood…
tumultous_rooster
- 1,195
36
votes
1 answer
Relation between variational Bayes and EM
I read somewhere that the Variational Bayes method is a generalization of the EM algorithm. Indeed, the iterative parts of the algorithms are very similar. In order to test whether the EM algorithm is a special version of Variational Bayes, I tried…
Ufuk Can Bicici
- 2,298
36
votes
3 answers
Asymptotic distribution of sample variance of non-normal sample
This is a more general treatment of the issue posed by this question.
After deriving the asymptotic distribution of the sample variance, we can apply the Delta method to arrive at the corresponding distribution for the standard deviation.
Let a…
Alecos Papadopoulos
- 58,953
35
votes
6 answers
Difference between Bayes network, neural network, decision tree and Petri nets
What is the difference between a neural network, a Bayesian network, a decision tree and Petri nets, given that they are all graphical models and visually depict cause-effect relationships?
Ria George
- 1,465
35
votes
7 answers
Birthday paradox with a (huge) twist: Probability of sharing exact same date of birth with partner?
I share the same birthdate as my boyfriend: the same date and also the same year. Our births are separated by merely 5 hours or so.
I know that the chances of meeting someone who was born on the same date as me are fairly high and I know a few people with…
curious
- 517
35
votes
4 answers
Intuitive reasoning behind biased maximum likelihood estimators
I am confused about biased maximum likelihood (ML) estimators. The mathematics of the whole concept is pretty clear to me, but I cannot figure out the intuitive reasoning behind it.
Given a certain dataset which has samples from a distribution,…
ssah
- 451
35
votes
4 answers
Can ANOVA be significant when none of the pairwise t-tests is?
Is it possible for one-way (with $N>2$ groups, or "levels") ANOVA to report a significant difference when none of the $N(N-1)/2$ pairwise t-tests does?
In this answer @whuber wrote:
It is well known that a global ANOVA F test can detect a…
amoeba
- 104,745
35
votes
6 answers
Algorithm to dynamically monitor quantiles
I want to estimate the quantile of some data. The data are so large that they cannot fit in memory, and they are not static: new data keep coming. Does anyone know any algorithm to monitor the quantiles of the data observed so far…
sinoTrinity
- 481
35
votes
4 answers
Why are Jeffreys priors considered noninformative?
Consider a Jeffreys prior where $p(\theta) \propto \sqrt{|i(\theta)|}$, where $i$ is the Fisher information.
I keep seeing this prior being mentioned as an uninformative prior, but I never saw an argument why it is uninformative. After all, it is not…
bayesian
- 869
35
votes
4 answers
Why not report the mean of a bootstrap distribution?
When we bootstrap a parameter to get the standard error, we get a distribution of the parameter. Why don't we use the mean of that distribution as the estimate of the parameter we are trying to get? Shouldn't the distribution approximate…
Guillermo Perez
- 451
35
votes
4 answers
Generating random variables from a mixture of Normal distributions
How can I sample from a mixture distribution, and in particular a mixture of Normal distributions in R? For example, if I wanted to sample from:
$$
0.3\!\times\mathcal{N}(0,1)\; + \;0.5\!\times\mathcal{N}(10,1)\; +…
user30490
35
votes
5 answers
Data "exploration" vs data "snooping"/"torturing"?
Many times I have come across informal warnings against "data snooping" (here's one amusing example), and I think I have an intuitive idea of roughly what that means, and why it may be a problem.
On the other hand, "exploratory data analysis" seems…
kjo
- 1,967