Highest Voted Questions - Statistical Analysis Stack Exchange

40

votes

9 answers

Variance of a bounded random variable

Suppose that a random variable is bounded below and above, taking values in [0,1]. How can one compute the variance of such a variable?

asked Dec 10 '12 at 13:39

Piotr

501

40

votes

3 answers

What's the relation between hierarchical models, neural networks, graphical models, and Bayesian networks?

They all seem to represent random variables by the nodes and (in)dependence via the (possibly directed) edges. I'm especially interested in a Bayesian's point of view.

asked Nov 13 '10 at 05:43

cespinoza

802

40

votes

4 answers

Simple way to algorithmically identify a spike in recorded errors

We need an early warning system. I am dealing with a server that is known to have performance issues under load. Errors are recorded in a database along with a timestamp. There are some manual intervention steps that can be taken to decrease the…

asked Oct 24 '12 at 21:44

dbenton

503

40

votes

7 answers

Can cross validation be used for causal inference?

In all the contexts where I have seen cross-validation, it is used solely with the goal of increasing predictive accuracy. Can the logic of cross-validation be extended to estimating the unbiased relationships between variables? While this paper by…

asked Oct 22 '10 at 15:12

Andy W

16,026

40

votes

1 answer

What are easy to interpret, goodness of fit measures for linear mixed effects models?

I am currently using the R package lme4. I am using linear mixed effects models with random effects:

    library(lme4)
    mod1 <- lmer(r1 ~ (1 | site), data = sample_set)  # Only random effects
    mod2 <- lmer(r1 ~ p1 + (1 | site), data = sample_set)  # One…

asked Sep 25 '12 at 07:35

mjburns

1,107

40

votes

7 answers

Why doesn't regularization solve Deep Neural Nets hunger for data?

An issue I've seen frequently brought up in the context of Neural Networks in general, and Deep Neural Networks in particular, is that they're "data hungry" - that is they don't perform well unless we have a large data set with which to train the…

asked May 11 '18 at 18:18

Skander H.

11,888

40

votes

4 answers

X and Y are not correlated, but X is a significant predictor of Y in multiple regression. What does it mean?

X and Y are not correlated (-.01); however, when I place X in a multiple regression predicting Y, alongside three (A, B, C) other (related) variables, X and two other variables (A, B) are significant predictors of Y. Note that the two other (A, B)…

asked Aug 07 '12 at 21:09

Behacad

5,064

40

votes

1 answer

When to choose SARSA vs. Q Learning

SARSA and Q Learning are both reinforcement learning algorithms that work in a similar way. The most striking difference is that SARSA is on policy while Q Learning is off policy. The update rules are as follows: Q…

reinforcement-learning

asked Feb 04 '18 at 18:06

hh32

1,421
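
The update rules are cut off in the excerpt above. As a rough illustration only, the textbook tabular forms of the two updates can be sketched as follows. The function names, the dictionary-based Q-table, and the default `alpha` and `gamma` values are assumptions made for this sketch and do not come from the question.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the greedy (max-value) action in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical two-action example: both tables start with the same values.
actions = ["left", "right"]
Q_sarsa = defaultdict(float, {(1, "left"): 1.0, (1, "right"): 2.0})
Q_qlearn = defaultdict(float, {(1, "left"): 1.0, (1, "right"): 2.0})

# The agent took "left" in state 0, got reward 1.0, and will take "left" in state 1.
sarsa_update(Q_sarsa, 0, "left", 1.0, 1, "left", alpha=0.5, gamma=0.5)
q_learning_update(Q_qlearn, 0, "left", 1.0, 1, actions, alpha=0.5, gamma=0.5)
```

The only difference between the two functions is the bootstrap term: SARSA uses the value of the next action the behaviour policy actually selects, while Q-learning uses the maximum over all next actions regardless of what is selected.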

40

votes

3 answers

What is the "capacity" of a machine learning model?

I'm studying this Tutorial on Variational Autoencoders by Carl Doersch. On the second page it states: One of the most popular such frameworks is the Variational Autoencoder [1, 3], the subject of this tutorial. The assumptions of this model are…

asked Nov 07 '17 at 11:26

Andrés Marafioti

622

40

votes

6 answers

Sampling for Imbalanced Data in Regression

There have been good questions on handling imbalanced data in the classification context, but I am wondering what people do to sample for regression. Say the problem domain is very sensitive to the sign but only somewhat sensitive to the magnitude…

asked Jun 09 '12 at 22:52

someben

798

40

votes

5 answers

What is the difference between Conv1D and Conv2D?

I was going through the Keras convolution docs and found two types of convolution, Conv1D and Conv2D. I did some web searching, and this is what I understand about Conv1D and Conv2D: Conv1D is used for sequences and Conv2D is used for images. I…

asked Jul 31 '17 at 13:11

Eka

2,251
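
To make the sequence-versus-image distinction concrete, here is a minimal pure-Python sketch of the sliding-window operation in one and two dimensions. It is not the Keras implementation: it assumes a single channel, stride 1, and no padding, and the function names `conv1d` and `conv2d` are hypothetical helpers written for this illustration.

```python
def conv1d(signal, kernel):
    """Slide a 1-D kernel along one axis (e.g. time) of a sequence."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def conv2d(image, kernel):
    """Slide a 2-D kernel along two axes (height and width) of a grid."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


seq_out = conv1d([1, 2, 3, 4], [1, 1])
img_out = conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 1], [1, 1]])
```

In both cases the arithmetic is the same weighted sum over a window (strictly, a cross-correlation, as is conventional in deep learning libraries). What changes is the number of axes along which the window moves.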

40

votes

3 answers

Why are Decision Trees not computationally expensive?

In An Introduction to Statistical Learning with Applications in R, the authors write that fitting a decision tree is very fast, but this doesn't make sense to me. The algorithm has to go through every feature and partition it in every way possible…

cart

asked Jul 24 '17 at 02:05

DataOrc

451

40

votes

5 answers

How is the cost function from Logistic Regression differentiated

I am doing the Machine Learning Stanford course on Coursera. In the chapter on Logistic Regression, the cost function is this: Then, it is differentiated here: I tried getting the derivative of the cost function, but I got something completely…

asked May 10 '17 at 18:28

bsky

1,199

40

votes

4 answers

ROC vs Precision-recall curves on imbalanced dataset

I just finished reading this discussion. They argue that PR AUC is better than ROC AUC on imbalanced datasets. For example, we have 10 samples in the test dataset. 9 samples are positive and 1 is negative. We have a terrible model which predicts…

asked Feb 18 '17 at 04:17

machineLearner

401
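
The excerpt is truncated before it says what the model predicts. As a hedged illustration, the sketch below assumes the model labels every sample as positive (an assumption, not something stated in the question) and computes the quantities that the two curves are built from. The helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def rates(y_true, y_pred):
    """Return (precision, recall, false_positive_rate); 0.0 when undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, fpr


# 9 positives and 1 negative, as in the question.
y_true = [1] * 9 + [0]
# Assumed behaviour of the "terrible" model: predict positive for everything.
y_pred = [1] * 10

precision, recall, fpr = rates(y_true, y_pred)
```

Under this assumption precision and recall both look strong because positives dominate, while the false positive rate, which is computed only over the single negative, shows that the model never identifies a negative correctly.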

40

votes

4 answers

Raw or orthogonal polynomial regression?

I want to regress a variable $y$ onto $x,x^2,\ldots,x^5$. Should I do this using raw or orthogonal polynomials? I looked at the question on the site that deals with these, but I don't really understand what's the difference between using them. Why…

asked Jan 26 '17 at 16:11

l7ll7

1,275
