Highest Voted Questions - Statistical Analysis Stack Exchange

48

votes

2 answers

How to model non-negative zero-inflated continuous data?

I'm currently trying to apply a linear model (family = gaussian) to an indicator of biodiversity that cannot take values lower than zero, is zero-inflated and is continuous. Values range from 0 to a little over 0.25. As a consequence, there is quite…

asked Dec 21 '15 at 21:57

David

481

48

votes

3 answers

How are Random Forests not sensitive to outliers?

I've read in a few sources, including this one, that Random Forests are not sensitive to outliers (in the way that Logistic Regression and other ML methods are, for example). However, two pieces of intuition tell me otherwise: Whenever a decision…

asked Dec 17 '15 at 06:23

makansij

2,279
9
31
42

48

votes

7 answers

Where to start with statistics for an experienced developer

During the first half of 2015 I took the Coursera Machine Learning course (by Andrew Ng, GREAT course) and learned the basics of machine learning (linear regression, logistic regression, SVM, neural networks...). I have also been a developer for…

asked Oct 13 '15 at 01:57

Juan Antonio Gomez Moriano

1,329
1
13
16

48

votes

6 answers

How do I avoid overlapping labels in an R plot?

I'm trying to label a pretty simple scatterplot in R. This is what I use: plot(SI, TI); text(SI, TI, Name, pos=4, cex=0.7). The result is mediocre. I tried to compensate for this using the textxy function, but it's…

asked Sep 26 '11 at 13:27

slhck

837

48

votes

7 answers

Combining probabilities/information from different sources

Let's say I have three independent sources and each of them makes predictions for the weather tomorrow. The first one says that the probability of rain tomorrow is 0, then the second one says that the probability is 1, and finally the last one says…

asked Jun 06 '15 at 22:25

Biela Diela

601
1
6
5

48

votes

4 answers

How to calculate a confidence level for a Poisson distribution?

I would like to know how confident I can be in my $\lambda$. Does anyone know of a way to set upper and lower confidence levels for a Poisson distribution? Observations ($n$) = 88; sample mean ($\lambda$) = 47.18182. What would the 95% confidence look…

asked Sep 09 '11 at 12:25

Travis

771
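One common way to approach the Poisson question above is a normal-approximation interval for the mean, $\hat\lambda \pm z\sqrt{\hat\lambda/n}$. The sketch below is only an illustration under that assumption and is not taken from the question or its answers; the function name `poisson_mean_ci` is hypothetical, and exact (chi-square based) intervals are an alternative that this sketch does not cover.

```python
from math import sqrt
from statistics import NormalDist


def poisson_mean_ci(sample_mean, n, level=0.95):
    """Approximate confidence interval for a Poisson mean.

    Assumes n independent Poisson observations and relies on the
    normal approximation: Var(sample mean) is roughly lambda / n.
    """
    # Two-sided critical value, e.g. about 1.96 for level=0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(sample_mean / n)
    return sample_mean - half_width, sample_mean + half_width


# Figures quoted in the question: n = 88, sample mean = 47.18182
lower, upper = poisson_mean_ci(47.18182, 88)
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

With a sample mean this large and 88 observations the approximation is usually considered reasonable; for small counts an exact interval would be the safer choice.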

48

votes

7 answers

Why shouldn't the denominator of the covariance estimator be n-2 rather than n-1?

The denominator of the (unbiased) variance estimator is $n-1$ as there are $n$ observations and only one parameter is being estimated. $$ \mathbb{V}\left(X\right)=\frac{\sum_{i=1}^{n}\left(X_{i}-\overline{X}\right)^{2}}{n-1} $$ By the same token I…

asked Mar 19 '15 at 13:13

MYaseen208

2,719
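The variance estimator quoted in the excerpt above, with its $n-1$ denominator, can be written out directly. This is a minimal sketch for illustration only; the helper name `sample_variance` is hypothetical, and the comparison against `statistics.variance` (which also divides by $n-1$) is there just as a sanity check.

```python
import statistics


def sample_variance(xs):
    """Unbiased sample variance: sum of squared deviations over n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


data = [1.0, 2.0, 3.0, 4.0]
# Both values use the n - 1 denominator, so they should agree.
print(sample_variance(data), statistics.variance(data))
```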

48

votes

1 answer

Proof that the coefficients in an OLS model follow a t-distribution with (n-k) degrees of freedom

Background: Suppose we have an Ordinary Least Squares model where we have $k$ coefficients in our regression model, $$\mathbf{y}=\mathbf{X}\mathbf{\beta} + \mathbf{\epsilon}$$ where $\mathbf{\beta}$ is a $(k\times1)$ vector of coefficients,…

asked Oct 01 '14 at 01:12

Garrett

661

47

votes

4 answers

Functions of Independent Random Variables

Is the claim that functions of independent random variables are themselves independent true? I have seen that result often used implicitly in some proofs, for example in the proof of independence between the sample mean and the sample variance of…

asked Apr 23 '14 at 14:39

JohnK

20,366

47

votes

4 answers

What is the difference between finite and infinite variance

What is the difference between finite and infinite variance? My stats knowledge is rather basic; Wikipedia and Google weren't much help here.

asked Apr 19 '14 at 22:16

AfterWorkGuinness

623

47

votes

2 answers

Why is logistic regression a linear model?

I want to know why logistic regression is called a linear model. It uses a sigmoid function, which is not linear. So why is logistic regression a linear model?

asked Mar 03 '14 at 17:52

user34790

6,757
10
46
69

47

votes

4 answers

McFadden's Pseudo-$R^2$ Interpretation

I have a binary logistic regression model with a McFadden's pseudo R-squared of 0.192 with a dependent variable called payment (1 = payment and 0 = no payment). What is the interpretation of this pseudo R-squared? Is it a relative comparison for…

asked Jan 13 '14 at 14:01

Matt Reichenbach

3,624

47

votes

8 answers

Rigorous definition of an outlier?

People often talk about dealing with outliers in statistics. The thing that bothers me about this is that, as far as I can tell, the definition of an outlier is completely subjective. For example, if the true distribution of some random variable…

asked Feb 13 '11 at 15:07

dsimcha

8,739

47

votes

1 answer

Neural Networks: weight change momentum and weight decay

Momentum $\alpha$ is used to diminish the fluctuations in weight changes over consecutive iterations: $$\Delta w_i(t+1) = - \eta\frac{\partial E}{\partial w_i} + \alpha \Delta w_i(t),$$ where $E({\bf w})$ is the error function, ${\bf w}$ -…

asked Sep 16 '13 at 01:56

Oleg Shirokikh

895
1
9
18
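The momentum update quoted in the excerpt above can be sketched in a few lines. This is an illustrative sketch, not the asker's code: the function name `momentum_step`, the toy error function $E(w) = w^2$, and the values of `eta` and `alpha` are all assumptions, and the sketch assumes the usual convention that the new weight is the old weight plus the computed change.

```python
def momentum_step(w, delta_prev, grad, eta=0.1, alpha=0.9):
    """One gradient-descent step with momentum.

    delta(t+1) = -eta * dE/dw + alpha * delta(t)
    w(t+1)     = w(t) + delta(t+1)   (assumed update convention)
    """
    delta = [-eta * g + alpha * d for g, d in zip(grad, delta_prev)]
    w_new = [wi + di for wi, di in zip(w, delta)]
    return w_new, delta


# Toy problem (assumption): minimise E(w) = w^2, whose gradient is 2w.
w, delta = [1.0], [0.0]
for _ in range(200):
    grad = [2.0 * wi for wi in w]
    w, delta = momentum_step(w, delta, grad)

print(w)  # ends up close to the minimum at 0
```

The previous change `delta` is carried between iterations, which is what smooths out fluctuations in the weight updates.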

47

votes

3 answers

Whether to rescale indicator / binary / dummy predictors for LASSO

For the LASSO (and other model selection procedures) it is crucial to rescale the predictors. The general recommendation I follow is simply to use a 0 mean, 1 standard deviation normalization for continuous variables. But what is there to do with…

asked Sep 09 '13 at 14:46

László

987
