Most Popular
1500 questions
55 votes, 1 answer
Alternatives to one-way ANOVA for heteroskedastic data
I have data from 3 groups of algae biomass ($A$, $B$, $C$) with unequal sample sizes ($n_A=15$, $n_B=13$, $n_C=12$), and I would like to compare whether these groups are from the same population.
One-way ANOVA would definitely be the way to go,…
Rick L.
- 551
55 votes, 3 answers
Online vs offline learning?
What is the difference between offline and online learning? Is it just a matter of learning over the entire dataset (offline) vs. learning incrementally (one instance at a time)? What are examples of algorithms used in both?
griffin
- 905
55 votes, 4 answers
Does the sign of scores or of loadings in PCA or FA have a meaning? May I reverse the sign?
I performed principal component analysis (PCA) with R using two different functions (prcomp and princomp) and observed that the PCA scores differed in sign. How can it be?
Consider this:
set.seed(999)
prcomp(data.frame(1:10,rnorm(10)))$x
…
user1320502
- 1,007
55 votes, 5 answers
Correct spelling (capitalization, italicization, hyphenation) of "p-value"?
I realize this is pedantic and trite, but as a researcher in a field outside of statistics, with limited formal education in statistics, I always wonder if I'm writing "p-value" correctly. Specifically:
Is the "p" supposed to be capitalized?
Is the…
gotgenes
- 943
55 votes, 5 answers
Using deep learning for time series prediction
I'm new to the area of deep learning, and my first step was to read interesting articles from the deeplearning.net site. In papers about deep learning, Hinton and others mostly talk about applying it to image problems. Can someone tell me whether it…
Vedran
- 651
55 votes, 8 answers
Why is Entropy maximised when the probability distribution is uniform?
I know that entropy is a measure of the randomness of a process/variable and that it can be defined as follows. For a random variable $X$ taking values in a set $A$: $H(X) = -\sum_{x_i \in A} p(x_i) \log p(x_i)$. In the book on Entropy and Information Theory by…
user76170
- 789
55 votes, 4 answers
Difference between forecast and prediction?
I was wondering what the difference and relation are between forecast and prediction, especially in time series and regression.
For example, am I correct that:
In time series, forecasting seems to mean estimating future values given past values of a…
Tim
- 19,445
55 votes, 3 answers
How can I calculate $\int^{\infty}_{-\infty}\Phi\left(\frac{w-a}{b}\right)\phi(w)\,\mathrm dw$
Suppose $\phi(\cdot)$ and $\Phi(\cdot)$ are the density function and distribution function of the standard normal distribution.
How can one calculate the integral:
$$\int^{\infty}_{-\infty}\Phi\left(\frac{w-a}{b}\right)\phi(w)\,\mathrm dw$$
hadisanji
- 885
55 votes, 4 answers
What is difference-in-differences?
Difference-in-differences has long been popular as a non-experimental tool, especially in economics. Can somebody please provide a clear and non-technical answer to the following questions about difference-in-differences?
What is a…
Graham Cookson
- 8,061
55 votes, 4 answers
How do we decide when a small sample is statistically significant or not?
Sorry if the title isn't clear, I'm not a statistician, and am not sure how to phrase this.
I was looking at the global coronavirus statistics on worldometers, and sorted the table by cases per million population to get an idea of how different…
Avrohom Yisroel
- 885
55 votes, 4 answers
Fast linear regression robust to outliers
I am dealing with linear data with outliers, some of which are more than 5 standard deviations away from the estimated regression line. I'm looking for a linear regression technique that reduces the influence of these points.
So far what I did is…
Matteo Fasiolo
- 3,254
55 votes, 3 answers
What is a latent space?
In the context of machine learning, I often hear the term latent space, sometimes qualified with the word "high dimensional" or "low dimensional" latent space.
I am a bit puzzled by this term (as it is almost never defined rigorously).
Can someone…
Fraïssé
- 1,540
55 votes, 8 answers
Is sampling relevant in the time of 'big data'?
Or more so "will it be"? Big Data makes statistics and relevant knowledge all the more important but seems to underplay Sampling Theory.
I've seen this hype around 'Big Data' and can't help but wonder why I would want to analyze everything?…
PhD
- 14,627
55 votes, 2 answers
What is it meant with the $\sigma$-algebra generated by a random variable?
Often, in the course of my (self-)study of statistics, I've met the terminology "$\sigma$-algebra generated by a random variable". I don't understand the definition on Wikipedia, but most importantly I don't get the intuition behind it. Why/when do…
DeltaIV
- 17,954
55 votes, 5 answers
What is difference between “in-sample” and “out-of-sample” forecasts?
I don't understand what exactly the difference is between "in-sample" and "out-of-sample" prediction.
An in-sample forecast utilizes a subset of the available data to forecast values outside of the estimation period. An out of sample forecast…
Engin YILMAZ
- 685