Highest Voted Questions - Statistical Analysis Stack Exchange

56

votes

2 answers

Determining sample size necessary for bootstrap method / Proposed Method

I know this is a rather hot topic for which no one can really give a simple answer. Nevertheless, I am wondering whether the following approach could be useful. The bootstrap method is only useful if your sample follows more or less (read exactly) the…

asked Jul 29 '12 at 14:02

siegfried

579

56

votes

9 answers

Bayesian vs frequentist Interpretations of Probability

Can someone give a good rundown of the differences between the Bayesian and the frequentist approach to probability? From what I understand: The frequentist view is that the data is a repeatable random sample (random variable) with a specific…

asked Jul 05 '12 at 14:41

BYS2

1,505

56

votes

5 answers

What is the difference between the forward-backward and Viterbi algorithms?

I want to know what the differences between the forward-backward algorithm and the Viterbi algorithm for inference in hidden Markov models (HMM) are.

asked Jul 06 '12 at 03:46

user34790

6,757

56

votes

7 answers

Best PCA algorithm for huge number of features (>10K)?

I previously asked this on StackOverflow, but it seems like it might be more appropriate here, given that it didn't get any answers on SO. It's kind of at the intersection between statistics and programming. I need to write some code to do PCA…

asked Sep 18 '10 at 02:08

dsimcha

8,739

56

votes

8 answers

Danger of setting all initial weights to zero in Backpropagation

Why is it dangerous to initialize weights with zeros? Is there any simple example that demonstrates it?

asked Apr 25 '12 at 18:21

user8078

663

56

votes

11 answers

Deriving Bellman's Equation in Reinforcement Learning

I see the following equation in "Reinforcement Learning: An Introduction", but don't quite follow the step I have highlighted in blue below. How exactly is this step derived?

asked Oct 31 '16 at 14:01

Amelio Vazquez-Reina

19,346

56

votes

6 answers

Understanding LSTM units vs. cells

I have been studying LSTMs for a while. I understand at a high level how everything works. However, when implementing them in TensorFlow, I noticed that BasicLSTMCell requires a number of units (i.e. num_units) parameter. From this very…

asked Oct 23 '16 at 23:37

user124589

56

votes

3 answers

Regularization methods for logistic regression

Regularization using methods such as Ridge, Lasso, ElasticNet is quite common for linear regression. I wanted to know the following: Are these methods applicable for logistic regression? If so, are there any differences in the way they need to be…

asked Aug 08 '16 at 10:29

Tapan Khopkar

846

56

votes

9 answers

Modern successor to Exploratory Data Analysis by Tukey?

I've been reading Tukey's book "Exploratory Data Analysis". Written in 1977, the book emphasizes paper-and-pencil methods. Is there a more 'modern' successor which takes into account that we can now instantaneously plot large data sets?

asked Feb 08 '12 at 08:18

biofreezer

315

56

votes

2 answers

Choosing the right linkage method for hierarchical clustering

I am performing hierarchical clustering on data I've gathered and processed from the reddit data dump on Google BigQuery. My process is the following: get the latest 1000 posts in /r/politics; gather all the comments; process the data and compute an…

asked Feb 13 '16 at 22:09

Kevbot

661

56

votes

3 answers

What is pre training a neural network?

Well the question says it all. What is meant by "pre training a neural network"? Can someone explain in pure simple English? I can't seem to find any resources related to it. It would be great if someone can point me to them.

asked Jan 29 '16 at 13:12

Machina333

1,123

56

votes

2 answers

Intuitive explanations of differences between Gradient Boosting Trees (GBM) & Adaboost

I'm trying to understand the differences between GBM & Adaboost. This is what I've understood so far: They are both boosting algorithms, which learn from the previous model's errors and finally make a weighted sum of the models. GBM and Adaboost…

asked Aug 01 '15 at 07:50

Hee Kyung Yoon

697

56

votes

4 answers

If the t-test and the ANOVA for two groups are equivalent, why aren't their assumptions equivalent?

I'm sure I've got this completely wrapped round my head, but I just can't figure it out. The t-test compares two normal distributions using the Z distribution. That's why there's an assumption of normality in the DATA. ANOVA is equivalent to linear…

asked Aug 13 '10 at 09:41

Chris Beeley

5,761

56

votes

4 answers

Replicating Stata's "robust" option in R

I have been trying to replicate the results of the Stata option robust in R. I have used the rlm command from the MASS package and also the command lmrob from the package "robustbase". In both cases the results are quite different from the "robust"…

asked Sep 28 '14 at 12:42

user56579

561

55

votes

3 answers

Why is polynomial regression considered a special case of multiple linear regression?

If polynomial regression models nonlinear relationships, how can it be considered a special case of multiple linear regression? Wikipedia notes that "Although polynomial regression fits a nonlinear model to the data, as a statistical estimation…

asked Apr 01 '14 at 00:42

gavinmh

1,095
