Most Popular

1500 questions
36 votes · 5 answers

Wikipedia entry on likelihood seems ambiguous

I have a simple question regarding "conditional probability" and "Likelihood". (I have already surveyed this question here but to no avail.) It starts from the Wikipedia page on likelihood. They say this: The likelihood of a set of parameter…
Creatron • 1,655
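To keep the distinction at issue in view while skimming: the same expression can be read two ways, and the likelihood reading fixes the data rather than the parameter. A minimal worked statement in my own notation (not the asker's or Wikipedia's):

```latex
% For fixed observed data x, the likelihood is read as a function of the
% parameter theta, not as a probability distribution over theta:
\mathcal{L}(\theta \mid x) \;=\; P(X = x \mid \theta)
% Example: X ~ Binomial(10, theta), observed x = 7:
\qquad
\mathcal{L}(\theta \mid x = 7) \;=\; \binom{10}{7}\,\theta^{7}(1-\theta)^{3},
\quad \theta \in [0, 1].
```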
36 votes · 7 answers

Why are symmetric positive definite (SPD) matrices so important?

I know the definition of a symmetric positive definite (SPD) matrix, but want to understand more. Why are they so important, intuitively? Here is what I know. What else? For given data, the covariance matrix is SPD. The covariance matrix is an important…
Haitao Du • 36,852 • 25 • 145 • 242
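As a concrete anchor for the covariance claim in the excerpt, a small numpy check (illustrative data only, not the asker's) that a sample covariance matrix is symmetric with non-negative eigenvalues:

```python
# A sample covariance matrix is symmetric and positive semi-definite
# (and usually positive definite when n > p and there is no collinearity).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))          # 500 observations, 4 variables
S = np.cov(X, rowvar=False)            # 4 x 4 sample covariance matrix

print(np.allclose(S, S.T))             # symmetric
print(np.linalg.eigvalsh(S) > 0)       # all eigenvalues positive here
```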
36 votes · 3 answers

What are the differences between Logistic Function and Sigmoid Function?

$$f(x)=\frac{L}{1+e^{-k(x-x_0)}}$$ Fig 1. Logistic function $$S(t)=\frac{1}{1+e^{-t}}$$ Fig 2. Sigmoid function What are the differences between Logistic Function and…
Jul • 715
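A quick numeric check of how the two formulas relate (my own sketch): the sigmoid of Fig. 2 is the logistic function of Fig. 1 with L = 1, k = 1, x0 = 0.

```python
import numpy as np

def logistic(x, L=1.0, k=1.0, x0=0.0):
    # General logistic function from Fig. 1.
    return L / (1.0 + np.exp(-k * (x - x0)))

def sigmoid(t):
    # "Standard" sigmoid from Fig. 2.
    return 1.0 / (1.0 + np.exp(-t))

t = np.linspace(-5, 5, 11)
print(np.allclose(logistic(t), sigmoid(t)))   # True: same curve at L=1, k=1, x0=0
```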
36 votes · 3 answers

What is the difference between dropout and drop connect?

What is the difference between dropout and drop connect? AFAIK, dropout randomly drops hidden nodes during training but keeps them in testing, and drop connect drops connections. But isn't dropping connections equivalent to dropping the hidden…
Machina333 • 1,123
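A rough numpy sketch of the distinction the question is asking about (not from either paper's code): dropout masks whole output units, while DropConnect masks individual weights, so a unit can survive with only a random subset of its incoming connections.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)          # layer input
W = rng.normal(size=(4, 5))     # weight matrix
p = 0.5                         # keep probability

# Dropout: zero whole output units (every weight feeding a dropped unit is silenced together).
unit_mask = rng.random(4) < p
h_dropout = unit_mask * (W @ x)

# DropConnect: zero individual connections instead of whole units.
weight_mask = rng.random((4, 5)) < p
h_dropconnect = (weight_mask * W) @ x

print(h_dropout)
print(h_dropconnect)
```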
36 votes · 3 answers

Building an autoencoder in Tensorflow to surpass PCA

Hinton and Salakhutdinov, in Reducing the Dimensionality of Data with Neural Networks (Science, 2006), proposed a non-linear PCA through the use of a deep autoencoder. I have tried to build and train a PCA autoencoder with TensorFlow several times but I…
Donbeo • 3,129
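For orientation only, a minimal Keras sketch of the linear baseline (my own toy code, not the poster's): with linear activations and squared error the bottleneck learns the same subspace as the top principal components; the Hinton–Salakhutdinov architecture then adds depth and non-linearities to go beyond PCA.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")   # toy data
X -= X.mean(axis=0)                                  # center, as for PCA

k = 2  # bottleneck width, playing the role of the number of PCs
inputs = tf.keras.Input(shape=(20,))
code = tf.keras.layers.Dense(k, use_bias=False)(inputs)      # linear encoder
outputs = tf.keras.layers.Dense(20, use_bias=False)(code)    # linear decoder
autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=200, batch_size=64, verbose=0)

# Swapping in non-linear activations (e.g. activation="relu") and extra layers
# is what lets the autoencoder surpass PCA, as in the 2006 paper.
```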
36 votes · 2 answers

How to make a reward function in reinforcement learning?

While studying Reinforcement Learning, I have come across many forms of the reward function: $R(s,a)$, $R(s,a,s')$, and even a reward function that only depends on the current state. Having said that, I realized it is not very easy to 'make' or…
cgo • 9,107
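As a concrete instance of the R(s, a, s') form mentioned in the excerpt, a toy grid-world reward; the states, goal, and step cost are all made up for illustration.

```python
# R(s, a, s'): reward depends on the state you land in, with a small
# per-step cost to encourage short paths.
GOAL = (3, 3)
PIT = (1, 3)

def reward(state, action, next_state):
    if next_state == GOAL:
        return 1.0       # reaching the goal
    if next_state == PIT:
        return -1.0      # falling into the pit
    return -0.04         # ordinary step

print(reward((2, 3), "right", (3, 3)))   # 1.0
print(reward((0, 0), "up", (0, 1)))      # -0.04
```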
36 votes · 3 answers

Why are bias nodes used in neural networks?

Why are bias nodes used in neural networks? How many should you use? In which layers should you use them: all hidden layers and the output layer?
grmmhp • 461
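A tiny illustration of what the bias buys (my own example, not from the question): without it, a unit's pre-activation is forced to zero whenever the input is zero, so its output cannot be shifted away from the origin.

```python
import numpy as np

def unit(x, w, b=0.0):
    # One sigmoid unit: sigmoid(w.x + b).
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

x = np.zeros(3)
w = np.array([0.5, -1.2, 2.0])
print(unit(x, w))            # always 0.5 at x = 0 when there is no bias
print(unit(x, w, b=-2.0))    # the bias shifts the output off 0.5
```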
36 votes · 2 answers

Understanding distance correlation computations

As far as I understood, distance correlation is a robust and universal way to check if there is a relation between two numeric variables. For example, if we have a set of pairs of numbers: (x1, y1) (x2, y2) ... (xn, yn) we can use distance…
Roman • 584
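For readers who want to see the computation the question refers to, a plain numpy version of the (biased) sample distance correlation for two univariate samples, following the usual double-centering recipe; function and variable names are mine.

```python
import numpy as np

def distance_correlation(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Pairwise distance matrices of each sample.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center: subtract row and column means, add back the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    dvar_x = (A * A).mean()
    dvar_y = (B * B).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y))

rng = np.random.default_rng(0)
x = rng.normal(size=200)
print(distance_correlation(x, x ** 2))                 # clearly non-zero despite Pearson r near 0
print(distance_correlation(x, rng.normal(size=200)))   # near zero for independent samples
```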
36 votes · 2 answers

Why is Lasso penalty equivalent to the double exponential (Laplace) prior?

I have read in a number of references that the Lasso estimate for the regression parameter vector $B$ is equivalent to the posterior mode of $B$ in which the prior distribution for each $B_i$ is a double exponential distribution (also known as…
Wintermute • 1,317
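The claimed equivalence is a one-line MAP calculation; here is a sketch with my own symbols ($\sigma^2$ for the noise variance, $\tau$ for the prior scale), using the convention of minimizing $\lVert y - XB\rVert^2 + \lambda\sum_i |B_i|$ for the Lasso.

```latex
% Gaussian likelihood plus independent double-exponential (Laplace) priors
%   p(B_i) = (1 / 2tau) * exp(-|B_i| / tau)
% give the negative log-posterior
-\log p(B \mid y, X)
  \;=\; \frac{1}{2\sigma^{2}}\,\lVert y - XB \rVert_{2}^{2}
  \;+\; \frac{1}{\tau}\sum_{i} \lvert B_i \rvert \;+\; \text{const},
% so the posterior mode solves the Lasso problem with lambda = 2*sigma^2 / tau.
```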
36 votes · 1 answer

predict() Function for lmer Mixed Effects Models

The problem: I have read in other posts that predict is not available for mixed effects lmer {lme4} models in [R]. I tried exploring this subject with a toy dataset... Background: The dataset is adapted from this source, and available…
36 votes · 3 answers

Pre-training in deep convolutional neural network?

Has anyone seen any literature on pre-training in deep convolutional neural networks? I have only seen unsupervised pre-training in autoencoders or restricted Boltzmann machines.
RockTheStar • 12,907 • 34 • 71 • 96
36 votes · 6 answers

Interpretation of Shapiro-Wilk test

I'm pretty new to statistics and I need your help. I have a small sample, as follows: H4U 0.269 0.357 0.2 0.221 0.275 0.277 0.253 0.127 0.246 I ran the Shapiro-Wilk test using…
Jakub • 737
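A quick way to reproduce the test on the sample listed in the excerpt (using SciPy rather than the asker's tool), plus the reading of the result that usually trips people up:

```python
from scipy import stats

# The H4U sample quoted in the question.
h4u = [0.269, 0.357, 0.2, 0.221, 0.275, 0.277, 0.253, 0.127, 0.246]
stat, p = stats.shapiro(h4u)
print(f"W = {stat:.3f}, p = {p:.3f}")

# The null hypothesis is that the data come from a normal distribution,
# so a large p-value means no evidence *against* normality was found,
# not a demonstration that the data are normal.
```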
36 votes · 2 answers

What is "reduced-rank regression" all about?

I have been reading The Elements of Statistical Learning and I could not understand what Section 3.7 "Multiple outcome shrinkage and selection" is all about. It talks about RRR (reduced-rank regression), and I can only understand that the premise is…
cgo • 9,107
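A compact sketch of the unweighted special case may help fix ideas (my own toy code; the formulation in Section 3.7 of ESL additionally whitens the responses by the error covariance): reduced-rank regression keeps only the top-r directions of the OLS fitted values, so the coefficient matrix has rank at most r.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # centered predictors (n x p)
Y = X @ rng.normal(size=(5, 4)) + 0.1 * rng.normal(size=(200, 4))  # responses (n x q)

r = 2                                               # target rank
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)       # full-rank OLS coefficient matrix
Y_hat = X @ B_ols
_, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
V_r = Vt[:r].T                                      # top-r right singular vectors of the fits
B_rrr = B_ols @ V_r @ V_r.T                         # rank-r coefficient matrix
print(np.linalg.matrix_rank(B_rrr))                 # 2
```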
36 votes · 3 answers

How to perform orthogonal regression (total least squares) via PCA?

I always use lm() in R to perform linear regression of $y$ on $x$. That function returns a coefficient $\beta$ such that $$y = \beta x.$$ Today I learned about total least squares and that the princomp() function (principal component analysis, PCA) can…
Dail • 2,637 • 12 • 44 • 54
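The recipe the question is after, written out in numpy rather than R's princomp() (a sketch with simulated data): center x and y, take the first principal direction of the two-column cloud, and read the total-least-squares slope off its loadings.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.5, size=100)

data = np.column_stack([x, y])
data_centered = data - data.mean(axis=0)
# Principal directions = eigenvectors of the 2x2 covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(data_centered, rowvar=False))
v = eigvecs[:, np.argmax(eigvals)]        # first principal direction
slope = v[1] / v[0]                       # orthogonal (TLS) slope
intercept = y.mean() - slope * x.mean()
print(slope, intercept)
```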
36 votes · 3 answers

Outlier Detection on skewed Distributions

Under a classical definition of an outlier as a data point outside 1.5 × IQR from the upper or lower quartile, there is an assumption of a non-skewed distribution. For skewed distributions (exponential, Poisson, geometric, etc.), is the best way to…
Eric • 361
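A small illustration of why the plain rule misbehaves, not a recommendation of any particular fix: on an exponential sample the 1.5 × IQR rule flags a large share of ordinary tail points, while the same rule applied on a log scale (one common workaround) flags far fewer.

```python
import numpy as np

def iqr_outliers(z):
    # Classical boxplot rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, q3 = np.percentile(z, [25, 75])
    iqr = q3 - q1
    return (z < q1 - 1.5 * iqr) | (z > q3 + 1.5 * iqr)

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=10_000)

print(iqr_outliers(x).sum())          # many ordinary tail points flagged on the raw scale
print(iqr_outliers(np.log(x)).sum())  # far fewer after a log transform
```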