Highest Voted Questions - Statistical Analysis Stack Exchange

40

votes

4 answers

Independent variable = Random variable?

I'm slightly confused about whether an independent variable (also called a predictor or feature) in a statistical model, for example the $X$ in the linear regression $Y=\beta_0+\beta_1 X$, is a random variable.

asked Nov 15 '16 at 12:19

l7ll7

1,275

40

votes

1 answer

How to interpret variance and correlation of random effects in a mixed-effects model?

I hope you all don't mind this question, but I need help interpreting the output of a linear mixed-effects model that I've been learning to fit in R. I am new to longitudinal data analysis and linear mixed-effects regression. I have a model I…

asked Mar 10 '12 at 21:30

Zeda

501
1
5
3

40

votes

6 answers

Effect size as the hypothesis for significance testing

Today, at the Cross Validated Journal Club (why weren't you there?), @mbq asked: Do you think we (modern data scientists) know what significance means? And how it relates to our confidence in our results? @Michelle replied as some (including me)…

asked Feb 24 '12 at 16:46

Carlos Accioly

5,025
4
28
29

40

votes

2 answers

Is Tikhonov regularization the same as Ridge Regression?

Tikhonov regularization and ridge regression are terms often used as if they were identical. Is it possible to specify exactly what the difference is?

asked Sep 10 '16 at 04:44

Carl

13,084
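
One common way to relate the two terms is that Tikhonov regularization penalizes $\|\Gamma\beta\|^2$ for a general matrix $\Gamma$, while ridge regression is the special case $\Gamma=\sqrt{\lambda}I$. The sketch below checks this numerically on synthetic data; the function names, the data, and the value of `lam` are assumptions made for illustration, not part of the question.

```python
import numpy as np

def tikhonov(X, y, Gamma):
    """Minimize ||X b - y||^2 + ||Gamma b||^2 via the normal equations."""
    return np.linalg.solve(X.T @ X + Gamma.T @ Gamma, X.T @ y)

def ridge(X, y, lam):
    """Minimize ||X b - y||^2 + lam * ||b||^2 via the normal equations."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

lam = 0.7
Gamma = np.sqrt(lam) * np.eye(3)  # ridge corresponds to a scaled identity

beta_tikhonov = tikhonov(X, y, Gamma)
beta_ridge = ridge(X, y, lam)
print(np.allclose(beta_tikhonov, beta_ridge))
```

With any other choice of `Gamma` (for example a difference operator) the two estimates generally differ, which is where the distinction between the terms shows up.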

40

votes

2 answers

Bootstrap prediction interval

Is there any bootstrap technique available to compute prediction intervals for point predictions obtained, e.g., from linear regression or another regression method (k-nearest neighbours, regression trees, etc.)? Somehow I feel that the sometimes proposed…

asked Jul 31 '16 at 16:09

Michael M

11,815
5
33
50

40

votes

4 answers

Why use colormap viridis over jet?

As announced in https://www.youtube.com/watch?v=xAoljeRJ3lU, Matplotlib changes the default colormap from jet to viridis. However, I don't understand the change very well. Maybe because I'm color blind? The original colormap jet looks very strong, I can…

data-visualization

asked Jul 12 '16 at 10:57

ZK Zhao

1,275

40

votes

5 answers

Difference between feedback RNN and LSTM/GRU

I am trying to understand different Recurrent Neural Network (RNN) architectures to be applied to time series data and I am getting a bit confused with the different names that are frequently used when describing RNNs. Is the structure of Long…

asked Jul 07 '16 at 12:53

Josie

503

40

votes

3 answers

Meaning (and proof) of "RNN can approximate any algorithm"

Recently I read that a recurrent neural network can approximate any algorithm. So my question is: what exactly does this mean, and can you give me a reference where it is proved?

asked Jun 27 '16 at 19:34

user3726947

503
1
5
6

40

votes

1 answer

XGBoost Loss function Approximation With Taylor Expansion

As an example, take the objective function of the XGBoost model on the $t$'th iteration: $$\mathcal{L}^{(t)}=\sum_{i=1}^n\ell(y_i,\hat{y}_i^{(t-1)}+f_t(\mathbf{x}_i))+\Omega(f_t)$$ where $\ell$ is the loss function, $f_t$ is the $t$'th tree output…

asked Mar 21 '16 at 19:04

Alex R.

13,897
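
For reference, the second-order Taylor expansion usually applied to this objective (as in the XGBoost paper by Chen and Guestrin) can be sketched as follows. The symbols $g_i$ and $h_i$ do not appear in the excerpt above; they are introduced here for the first and second derivatives of the loss with respect to the previous prediction.

$$\mathcal{L}^{(t)}\approx\sum_{i=1}^n\left[\ell(y_i,\hat{y}_i^{(t-1)})+g_i f_t(\mathbf{x}_i)+\tfrac{1}{2}h_i f_t^2(\mathbf{x}_i)\right]+\Omega(f_t)$$

$$g_i=\frac{\partial\,\ell(y_i,\hat{y}_i^{(t-1)})}{\partial\hat{y}_i^{(t-1)}},\qquad h_i=\frac{\partial^2\ell(y_i,\hat{y}_i^{(t-1)})}{\partial\big(\hat{y}_i^{(t-1)}\big)^2}$$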

40

votes

3 answers

What does entropy tell us?

I am reading about entropy and am having a hard time conceptualizing what it means in the continuous case. The wiki page states the following: The probability distribution of the events, coupled with the information amount of every event, forms…

entropy

asked Feb 05 '16 at 18:52

RustyStatistician

1,989

40

votes

1 answer

Why do we need to normalize the images before we put them into CNN?

I am not clear on why we normalise the image for a CNN by (image - mean_image). Thanks!

asked Dec 09 '15 at 06:54

Zhi Lu

737
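
The operation the question refers to, (image - mean_image), can be sketched in NumPy as below. This is a minimal illustration assuming a batch of shape (N, H, W, C) and a per-pixel mean computed over the batch; the function name and the random data are hypothetical.

```python
import numpy as np

def mean_center(images):
    """Subtract the per-pixel mean image from every image in the batch."""
    images = np.asarray(images, dtype=np.float64)
    mean_image = images.mean(axis=0)  # shape (H, W, C), averaged over the batch
    return images - mean_image, mean_image

# Random stand-in for a batch of 8 RGB images of size 4x4 (illustrative only).
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 4, 4, 3))

centered, mean_image = mean_center(batch)
print(centered.shape, mean_image.shape)
```

In practice the mean image would be computed on the training set only and then reused unchanged for validation and test images.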

40

votes

5 answers

How to derive the likelihood function for binomial distribution for parameter estimation?

According to Miller and Freund's Probability and Statistics for Engineers, 8ed (pp.217-218), the likelihood function to be maximised for the binomial distribution (Bernoulli trials) is given as $L(p) = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i}$. How to arrive at…

asked Nov 10 '15 at 12:00

Ébe Isaac

1,082
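
As a numerical companion to the formula quoted in the question, the sketch below evaluates $\log L(p)=\sum_i \left[x_i\log p+(1-x_i)\log(1-p)\right]$ on a grid and compares the maximizer with the sample mean. The data vector and the grid are made up for illustration.

```python
import numpy as np

def bernoulli_log_likelihood(p, x):
    """Log of L(p) = prod_i p^{x_i} (1 - p)^{1 - x_i} for 0/1 outcomes x."""
    x = np.asarray(x)
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Made-up outcomes of 8 Bernoulli trials (5 successes).
x = np.array([1, 0, 1, 1, 0, 1, 0, 1])

# Evaluate the log-likelihood on a grid of candidate p values.
grid = np.linspace(0.01, 0.99, 99)
log_lik = np.array([bernoulli_log_likelihood(p, x) for p in grid])

p_hat_grid = grid[np.argmax(log_lik)]  # numerical maximizer on the grid
p_hat_closed = x.mean()                # closed-form maximizer: sample proportion
print(p_hat_grid, p_hat_closed)
```

The grid maximizer can only match the sample proportion up to the grid spacing of 0.01.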

40

votes

3 answers

Theory behind partial least squares regression

Can anyone recommend a good exposition of the theory behind partial least squares regression (available online) for someone who understands SVD and PCA? I have looked at many sources online and have not found anything that had the right combination…

asked Nov 02 '15 at 01:38

ClarPaul

1,270
1
12
19

40

votes

5 answers

Do working statisticians care about the difference between frequentist and Bayesian inference?

As an outsider, it appears that there are two competing views on how one should perform statistical inference. Are the two different methods both considered valid by working statisticians? Is choosing one considered more of a philosophical…

asked Aug 12 '10 at 20:09

Jonathan Fischoff

231
3
7

40

votes

3 answers

Do we need gradient descent to find the coefficients of a linear regression model?

I was trying to learn machine learning using the Coursera material. In this lecture, Andrew Ng uses the gradient descent algorithm to find the coefficients of the linear regression model that will minimize the error function (cost function). For linear…

asked Jul 06 '15 at 17:18

Victor

6,565
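
To make the comparison in the question title concrete, here is a minimal sketch that fits the same linear regression twice: once with a direct least-squares solve and once with plain gradient descent on the mean squared error. The synthetic data, learning rate, and iteration count are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data: intercept plus one feature (illustrative only).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([2.0, -3.0])
y = X @ true_beta + 0.1 * rng.normal(size=n)

# Direct solution of the least-squares problem.
beta_closed, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the mean squared error (1/n) * ||X b - y||^2.
beta_gd = np.zeros(2)
learning_rate = 0.1
for _ in range(2000):
    gradient = (2.0 / n) * X.T @ (X @ beta_gd - y)
    beta_gd -= learning_rate * gradient

print(beta_closed, beta_gd)
```

On a small, well-conditioned problem like this one, both approaches reach the same coefficients to within numerical tolerance; the iterative route mainly matters when the design matrix is too large for a direct solve.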
