Highest Voted Questions - Statistical Analysis Stack Exchange

50

votes

4 answers

Taking the expectation of Taylor series (especially the remainder)

My question concerns justifying a widely used method, namely taking the expected value of a Taylor series. Assume we have a random variable $X$ with positive mean $\mu$ and variance $\sigma^2$. Additionally, we have a function, say,…

asked Sep 19 '13 at 14:03

agronskiy

685

50

votes

5 answers

Normality of dependent variable = normality of residuals?

This issue seems to rear its ugly head all the time, and I'm trying to decapitate it for my own understanding of statistics (and sanity!). The assumptions of general linear models (t-test, ANOVA, regression etc.) include the "assumption of…

asked May 30 '13 at 05:36

DeanP

871

50

votes

3 answers

Intuitive difference between hidden Markov models and conditional random fields

I understand that HMMs (Hidden Markov Models) are generative models, and CRFs (Conditional Random Fields) are discriminative models. I also understand how CRFs are designed and used. What I do not understand is how they differ from HMMs. I…

asked May 05 '13 at 22:58

user1343318

1,341

50

votes

3 answers

PCA and the train/test split

I have a dataset for which I have multiple sets of binary labels. For each set of labels, I train a classifier, evaluating it by cross-validation. I want to reduce dimensionality using principal component analysis (PCA). My question is: Is it…

asked Apr 10 '13 at 14:06

Bitwise

6,619

50

votes

2 answers

Logistic regression model does not converge

I've got some data about airline flights (in a data frame called flights) and I would like to see if the flight time has any effect on the probability of a significantly delayed arrival (meaning 10 or more minutes). I figured I'd use logistic…

asked Dec 10 '10 at 16:28

Daniel Standage

1,269

50

votes

7 answers

Features for time series classification

I consider the problem of (multiclass) classification based on time series of variable length $T$, that is, to find a function $$f(X_T) = y \in [1..K]\\ \text{for } X_T = (x_1, \dots, x_T)\\ \text{with } x_t \in \mathbb{R}^d ~,$$ via a global…

asked Feb 25 '13 at 12:34

Emile

3,460

50

votes

2 answers

Dealing with singular fit in mixed models

Let's say we have a model
mod <- Y ~ X*Condition + (X*Condition|subject)
# Y = logit variable
# X = continuous variable
# Condition = values A and B, dummy coded; the design is repeated
# so all participants go through both…

asked Nov 27 '18 at 00:15

User33268

1,722

50

votes

4 answers

Where does $\sqrt{n}$ come from in central limit theorem (CLT)?

A very simple version of the central limit theorem is given below: $$ \sqrt{n}\bigg(\bigg(\frac{1}{n}\sum_{i=1}^n X_i\bigg) - \mu\bigg)\ \xrightarrow{d}\ \mathcal{N}(0,\;\sigma^2) $$ which is the Lindeberg–Lévy CLT. I do not understand why there is a $\sqrt{n}$…

asked Sep 11 '12 at 17:25

Flying pig

6,239
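
For readers skimming this entry, the role of the $\sqrt{n}$ factor in the statement above can be checked numerically. The following is only a rough sketch under stated assumptions: the Uniform(0, 1) distribution, the helper name `scaled_mean_deviations`, the sample sizes, and the seed are all illustrative choices that do not come from the question. It suggests that the variance of the unscaled deviation (sample mean minus $\mu$) shrinks like $1/n$, while the variance of the $\sqrt{n}$-scaled deviation stays close to $\sigma^2$.

```python
import random
import statistics


def scaled_mean_deviations(n, reps, mu=0.5, seed=0):
    """Return `reps` draws of sqrt(n) * (sample mean - mu).

    Each draw uses a fresh sample of size n from Uniform(0, 1),
    an illustrative choice with mu = 1/2 and sigma^2 = 1/12.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        draws.append((n ** 0.5) * (sample_mean - mu))
    return draws


SIGMA2 = 1 / 12  # variance of Uniform(0, 1)

for n in (10, 100, 1000):
    z = scaled_mean_deviations(n, reps=2000)
    scaled_var = statistics.pvariance(z)  # variance of sqrt(n) * (mean - mu)
    raw_var = scaled_var / n              # variance of (mean - mu) itself
    print(n, round(raw_var, 6), round(scaled_var, 4), round(SIGMA2, 4))
```

In each printed row the second column should fall by roughly a factor of ten as $n$ grows tenfold, while the third column should hover near the fourth; multiplying by $\sqrt{n}$ is what keeps the limiting distribution from collapsing to a point.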

50

votes

7 answers

Is Amazon's "average rating" misleading?

If I understand correctly, book ratings on a 1-5 scale are Likert scores. That is, a 3 for me may not necessarily be a 3 for someone else. It's an ordinal scale IMO. One shouldn't really average ordinal scales but can definitely take the mode,…

asked Jul 03 '12 at 21:51

PhD

14,627

50

votes

2 answers

Can somebody explain NUTS to me in English?

My understanding of the algorithm is the following: No U-Turn Sampler (NUTS) is a Hamiltonian Monte Carlo Method. This means that it is not a Markov Chain method and thus, this algorithm avoids the random walk part, which is often deemed as…

asked Nov 04 '17 at 03:29

user3007270

631

50

votes

2 answers

What is model identifiability?

I know that with a model that is not identifiable the data can be said to be generated by multiple different assignments to the model parameters. I know that sometimes it's possible to constrain parameters so that all are identifiable, as in the…

identifiability

asked Jan 05 '12 at 02:59

Jack Tanner

4,842

50

votes

2 answers

Poisson regression to estimate relative risk for binary outcomes

Brief Summary: Why is it more common for logistic regression (with odds ratios) to be used in cohort studies with binary outcomes, as opposed to Poisson regression (with relative risks)? Background: Undergraduate and graduate statistics and…

asked Nov 18 '11 at 18:10

jthetzel

2,437

50

votes

1 answer

Difference between GradientDescentOptimizer and AdamOptimizer (TensorFlow)?

I've written a simple MLP in TensorFlow which is modelling a XOR-Gate. So for: input_data = [[0., 0.], [0., 1.], [1., 0.], [1., 1.]] it should produce the following: output_data = [[0.], [1.], [1.], [0.]] The network has an input layer, a hidden…

asked Dec 01 '15 at 13:48

daniel451

2,915

50

votes

8 answers

What are the cons of Bayesian analysis?

What are some practical objections to the use of Bayesian statistical methods in any context? No, I don't mean the usual carping about choice of prior. I'll be delighted if this gets no answers.

bayesian

asked Oct 17 '11 at 20:33

user6666

50

votes

6 answers

What can we say about population mean from a sample size of 1?

I am wondering what we can say, if anything, about the population mean, $\mu$ when all I have is one measurement, $y_1$ (sample size of 1). Obviously, we'd love to have more measurements, but we can't get them. It seems to me that since the sample…

asked Jun 18 '15 at 15:21

thedu

525
