Most Popular
1500 questions
91
votes
6 answers
If I have a 58% chance of winning a point, what's the chance of me winning a ping pong game to 21, win by 2?
I have a bet with a co-worker that out of 50 ping pong games (first to win 21 points, win by 2), I will win all 50. So far we've played 15 games and on average I win 58% of the points, plus I've won all the games so far. So we're wondering if I have…
richard
- 881
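The arithmetic behind this question can be sketched in a few lines. This is a minimal sketch, not an answer taken from the thread: it assumes every point is won independently with the same probability (0.58 here), which the excerpt does not state, and the function name `p_win_game` is made up for illustration.

```python
from math import comb


def p_win_game(p: float, target: int = 21) -> float:
    """Chance of winning one game to `target`, win by 2.

    Assumption (not from the question): points are independent and each
    is won with the same probability p.
    """
    q = 1.0 - p
    # Win target-to-k with k <= target - 2: the final point is a win, and the
    # opponent's k points fall anywhere among the first target - 1 + k points.
    straight = sum(
        comb(target - 1 + k, k) * p**target * q**k for k in range(target - 1)
    )
    # Otherwise the score reaches (target - 1)-all and play continues until
    # someone leads by two; from a tie, that chance is p^2 / (p^2 + q^2).
    reach_tie = comb(2 * (target - 1), target - 1) * (p * q) ** (target - 1)
    win_from_tie = p * p / (p * p + q * q)
    return straight + reach_tie * win_from_tie


single_game = p_win_game(0.58)
# Winning all 50 games, additionally assuming the games are independent.
all_fifty = single_game**50
print(f"one game: {single_game:.4f}, all 50 games: {all_fifty:.6f}")
```

Under these assumptions a modest edge per point becomes a much larger edge per game, but raising the per-game chance to the 50th power shrinks it again, which is the tension the bet turns on.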
91
votes
11 answers
Why should I be Bayesian when my model is wrong?
Edits: I have added a simple example: inference of the mean of the $X_i$. I have also slightly clarified why the credible intervals not matching confidence intervals is bad.
I, a fairly devout Bayesian, am in the middle of a crisis of faith of…
Guillaume Dehaene
- 2,262
91
votes
2 answers
How to normalize data between -1 and 1?
I have seen the min-max normalization formula but that normalizes values between 0 and 1. How would I normalize my data between -1 and 1? I have both negative and positive values in my data matrix.
covfefe
- 1,249
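The min-max formula mentioned in the question extends to any target interval by stretching and shifting the 0-to-1 result. The sketch below is a plain-Python illustration under assumptions of my own: `rescale` is a hypothetical helper, and it uses a single minimum and maximum over all values passed in, whereas a data matrix would often be rescaled column by column.

```python
def rescale(values, lo=-1.0, hi=1.0):
    """Linearly map `values` so the minimum lands on `lo` and the maximum on `hi`.

    Hypothetical helper for illustration; uses one global min and max.
    """
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant input has no spread to stretch onto [lo, hi].
        raise ValueError("cannot rescale constant data")
    # (v - vmin) / (vmax - vmin) is the usual 0-to-1 min-max value;
    # multiplying by (hi - lo) and adding lo moves it onto [lo, hi].
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


print(rescale([-5, 0, 5]))
print(rescale([2, 4, 6, 10]))
```

Negative and positive inputs are handled the same way, since only each value's position between the minimum and maximum matters.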
91
votes
3 answers
What is the intuition behind SVD?
I have read about singular value decomposition (SVD). In almost all textbooks it is mentioned that it factorizes the matrix into three matrices with given specification.
But what is the intuition behind splitting the matrix in such form? PCA and…
SHASHANK GUPTA
- 1,309
91
votes
6 answers
Why is the L2 regularization equivalent to Gaussian prior?
I keep reading this, and intuitively I can see it, but how does one go analytically from L2 regularization to a Gaussian prior? The same goes for saying that L1 is equivalent to a Laplace prior.
Any further references would be great.
Anonymous
- 1,249
91
votes
1 answer
How to apply Neural Network to time series forecasting?
I'm new to machine learning, and I have been trying to figure out how to apply a neural network to time series forecasting. I have found resources related to my query, but I still seem to be a bit lost. I think a basic explanation without too much…
solartic
- 1,045
90
votes
3 answers
An example: LASSO regression using glmnet for binary outcome
I am starting to dabble with the use of glmnet with LASSO Regression where my outcome of interest is dichotomous. I have created a small mock data frame below:
age <- c(4, 8, 7, 12, 6, 9, 10, 14, 7)
gender <- c(1, 0, 1, 1, 1, 0, 1, 0, 0)
bmi_p…
Matt Reichenbach
- 3,624
90
votes
26 answers
What is the single most influential book every statistician should read?
If you could go back in time and tell yourself to read a specific book at the beginning of your career as a statistician, which book would it be?
Neil McGuigan
- 9,872
90
votes
2 answers
ImageNet: what is top-1 and top-5 error rate?
In ImageNet classification papers, top-1 and top-5 error rates are important measures of how well a solution performs, but what are those error rates?
In ImageNet Classification with Deep Convolutional Neural Networks by Krizhevsky et al.…
daniel451
- 2,915
90
votes
5 answers
What are modern, easily used alternatives to stepwise regression?
I have a dataset with around 30 independent variables and would like to construct a generalized linear model (GLM) to explore the relationship between them and the dependent variable.
I am aware that the method I was taught for this situation,…
fmark
- 4,977
90
votes
4 answers
How to use Pearson correlation correctly with time series
I have 2 time-series (both smooth) that I would like to cross-correlate to see how correlated they are.
I intend to use the Pearson correlation coefficient. Is this appropriate?
My second question is that I can choose to sample the 2 time-series as…
user1551817
- 1,203
90
votes
4 answers
What're the differences between PCA and autoencoder?
Both PCA and an autoencoder can do dimension reduction, so what is the difference between them? In what situations should I use one over the other?
RockTheStar
- 12,907
90
votes
6 answers
How to tell if data is "clustered" enough for clustering algorithms to produce meaningful results?
How would you know if your (high-dimensional) data exhibits enough clustering that results from k-means or another clustering algorithm are actually meaningful?
For the k-means algorithm in particular, how much of a reduction in within-cluster variance…
xuexue
- 2,188
89
votes
11 answers
What are disadvantages of using the lasso for variable selection for regression?
From what I know, using lasso for variable selection handles the problem of correlated inputs. Also, since it is equivalent to Least Angle Regression, it is not slow computationally. However, many people (for example people I know doing…
xuexue
- 2,188
89
votes
2 answers
Basic question about Fisher Information matrix and relationship to Hessian and standard errors
OK, this is quite a basic question, but I am a little bit confused. In my thesis I write:
The standard errors can be found by calculating the inverse of the square root of the diagonal elements of the (observed) Fisher Information…
Jen Bohold
- 1,570