Highest Voted Questions - Statistical Analysis Stack Exchange

71

votes

2 answers

What does the inverse of covariance matrix say about data? (Intuitively)

I'm curious about the nature of $\Sigma^{-1}$. Can anybody tell me something intuitive about "What does $\Sigma^{-1}$ say about data?" Edit: Thanks for the replies. After taking some great courses, I'd like to add some points: It is a measure of information,…

asked Oct 22 '13 at 12:00

Arya

973

71

votes

5 answers

How to derive the ridge regression solution?

I am having some issues with the derivation of the solution for ridge regression. I know the regression solution without the regularization term is given by: $$\beta = (X^\top X)^{-1}X^\top y.$$ But after adding the L2 term $\lambda\|\beta\|_2^2$ to…

asked Sep 04 '13 at 15:49

user34790

6,757
10
46
69
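
For the ridge regression question above, here is a short sketch of the standard derivation. It assumes the penalized objective is $\|y - X\beta\|_2^2 + \lambda\|\beta\|_2^2$ with $\lambda > 0$; the excerpt is truncated, so the exact form the asker had in mind is an assumption.

$$
\begin{aligned}
L(\beta) &= (y - X\beta)^\top (y - X\beta) + \lambda\,\beta^\top \beta, \\
\nabla_\beta L(\beta) &= -2X^\top (y - X\beta) + 2\lambda\beta = 0, \\
(X^\top X + \lambda I)\,\beta &= X^\top y, \\
\hat\beta_{\text{ridge}} &= (X^\top X + \lambda I)^{-1} X^\top y.
\end{aligned}
$$

Setting $\lambda = 0$ recovers the unregularized solution $\beta = (X^\top X)^{-1}X^\top y$ quoted in the excerpt, provided $X^\top X$ is invertible.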

71

votes

2 answers

Removing duplicated rows data frame in R

How can I remove duplicate rows from this example data frame?

A 1
A 1
A 2
B 4
B 1
B 1
C 2
C 2

I would like to remove the duplicates based on both the columns:

A 1
A 2
B 4
B 1
C 2

Order is not important.

r

asked Jan 31 '11 at 19:58

Jana

969
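
For the duplicated-rows question above, the operation itself can be sketched in a few lines. The question is about R, where base R's `unique(df)` or `df[!duplicated(df), ]` would be the usual route; the Python below is only an illustration of the same idea on the example rows, and the function name `drop_duplicate_rows` is made up for this sketch.

```python
# Rows from the example in the question, as (letter, number) pairs.
rows = [
    ("A", 1), ("A", 1), ("A", 2),
    ("B", 4), ("B", 1), ("B", 1),
    ("C", 2), ("C", 2),
]


def drop_duplicate_rows(rows):
    """Return the rows with exact duplicates removed, keeping first occurrences.

    A row counts as a duplicate only if every column matches an earlier row.
    """
    # dict.fromkeys preserves insertion order (Python 3.7+),
    # so the first copy of each distinct row is the one that survives.
    return list(dict.fromkeys(rows))


unique_rows = drop_duplicate_rows(rows)
for letter, number in unique_rows:
    print(letter, number)
```

Because whole tuples are compared, `("B", 4)` and `("B", 1)` both survive, which matches the result the asker wants.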

71

votes

2 answers

What is the difference between a partial likelihood, profile likelihood and marginal likelihood?

I see these terms being used and I keep getting them mixed up. Is there a simple explanation of the differences between them?

asked Jul 26 '10 at 09:12

Rob Hyndman

56,782

71

votes

8 answers

Regression with multiple dependent variables?

Is it possible to have a (multiple) regression equation with two or more dependent variables? Sure, you could run two separate regression equations, one for each DV, but that doesn't seem like it would capture any relationship between the two DVs?

regression

asked Nov 14 '10 at 02:50

Jeff

3,927

71

votes

19 answers

What are some valuable Statistical Analysis open source projects?

What are some valuable Statistical Analysis open source projects available right now? Edit: as pointed out by Sharpie, valuable could mean helping you get things done faster or more cheaply.

asked Jul 19 '10 at 19:13

grokus

233

71

votes

4 answers

What is the definition of a "feature map" (aka "activation map") in a convolutional neural network?

Intro Background Within a convolutional neural network, we usually have a general structure / flow that looks like this: input image (i.e. a 2D vector x) (1st Convolutional layer (Conv1) starts here...) convolve a set of filters (w1) along the…

asked Jul 16 '17 at 14:16

Atlas7

813
1
7
7

71

votes

3 answers

Neural Network: For Binary Classification use 1 or 2 output neurons?

Assume I want to do binary classification (something belongs to class A or class B). There are some possibilities to do this in the output layer of a neural network: Use 1 output node. Output 0 (<0.5) is considered class A and 1 (>=0.5) is…

asked Apr 13 '16 at 08:23

robert

1,111
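
For the one-versus-two output neurons question above, one relationship is easy to check numerically: a softmax over two logits gives the same class-B probability as a sigmoid applied to the difference of those logits. The sketch below assumes the two-neuron variant uses softmax and the one-neuron variant uses a sigmoid; the excerpt is truncated, so this may not be exactly the setup the asker describes, and the helper names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function, as used by a single output neuron."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Softmax over a list of logits, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def two_neuron_prob_b(logit_a, logit_b):
    """Probability of class B from a two-neuron softmax output layer."""
    return softmax([logit_a, logit_b])[1]


# Arbitrary example logits (not taken from the question).
examples = [(0.3, 1.7), (-2.0, 0.5), (4.0, 4.0)]
for logit_a, logit_b in examples:
    p_two = two_neuron_prob_b(logit_a, logit_b)
    p_one = sigmoid(logit_b - logit_a)
    print(f"softmax: {p_two:.6f}  sigmoid of difference: {p_one:.6f}")
```

Under these assumptions the two parameterizations express the same family of probabilities; the two-neuron version simply carries one redundant degree of freedom.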

71

votes

5 answers

What problem do shrinkage methods solve?

The holiday season has given me the opportunity to curl up next to the fire with The Elements of Statistical Learning. Coming from a (frequentist) econometrics perspective, I'm having trouble grasping the uses of shrinkage methods like ridge…

asked Dec 27 '11 at 22:35

Charlie

14,062
5
44
72

71

votes

6 answers

Test if two binomial distributions are statistically different from each other

I have three groups of data, each with a binomial distribution (i.e. each group has elements that are either success or failure). I do not have a predicted probability of success, but instead can only rely on the success rate of each as an…

asked Aug 28 '14 at 17:14

Scott

1,030
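
For the binomial comparison question above, one common approach for a single pair of groups is a pooled two-proportion z-test. The sketch below is an illustration of that test only, not necessarily what the answers recommend: it relies on the normal approximation (so it is shaky for small counts), the success counts are hypothetical, and with three groups the pairwise p-values would still need a multiple-comparison adjustment.

```python
import math


def two_proportion_z_test(successes_1, trials_1, successes_2, trials_2):
    """Pooled two-proportion z-test with a two-sided p-value.

    Returns (z, p_value). Uses the normal approximation to the binomial.
    """
    p1 = successes_1 / trials_1
    p2 = successes_2 / trials_2
    # Pooled success rate under the null hypothesis that both rates are equal.
    pooled = (successes_1 + successes_2) / (trials_1 + trials_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / trials_1 + 1 / trials_2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 45 successes out of 100 versus 30 out of 100.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```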

71

votes

3 answers

Interpreting Residual and Null Deviance in GLM R

How to interpret the Null and Residual Deviance in GLM in R? Like, we say that smaller AIC is better. Is there any similar and quick interpretation for the deviances also?

Null deviance: 1146.1 on 1077 degrees of freedom
Residual deviance: 4589.4…

asked Jul 23 '14 at 10:18

Anjali

1,011
3
11
10

70

votes

7 answers

Why doesn't Random Forest handle missing values in predictors?

What are the theoretical reasons not to handle missing values? Gradient boosting machines and regression trees handle missing values. Why doesn't Random Forest do that?

asked May 16 '14 at 13:08

Fedorenko Kristina

803

70

votes

8 answers

What are good basic statistics to use for ordinal data?

I have some ordinal data gained from survey questions. In my case they are Likert style responses (Strongly Disagree-Disagree-Neutral-Agree-Strongly Agree). In my data they are coded as 1-5. I don't think means would mean much here, so what basic…

asked Jul 19 '10 at 20:23

PaulHurleyuk

1,569

70

votes

6 answers

Standard errors for lasso prediction using R

I'm trying to use a LASSO model for prediction, and I need to estimate standard errors. Surely someone has already written a package to do this. But as far as I can see, none of the packages on CRAN that do predictions using a LASSO will return…

asked Mar 26 '14 at 02:20

Rob Hyndman

56,782

70

votes

6 answers

Is it important to scale data before clustering?

I found this tutorial, which suggests that you should run the scale function on features before clustering (I believe that it converts data to z-scores). I'm wondering whether that is necessary. I'm asking mostly because there's a nice elbow point…

asked Mar 12 '14 at 21:27

Jeremy

1,429
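
For the scaling question above, the z-score conversion the asker mentions can be sketched as follows. This assumes the usual definition (subtract the mean, divide by the sample standard deviation, which is also what R's `scale` does by default); the feature values are hypothetical and only show why unscaled features with large units can dominate a distance-based clustering.

```python
import statistics


def z_scores(values):
    """Center values on their mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mean) / sd for v in values]


# Hypothetical features on very different scales.
incomes = [30_000, 45_000, 60_000, 90_000]
ages = [25, 35, 45, 55]

scaled_incomes = z_scores(incomes)
scaled_ages = z_scores(ages)

# Before scaling, income differences swamp age differences in any
# Euclidean distance; after scaling, both features have unit spread.
print([round(v, 3) for v in scaled_incomes])
print([round(v, 3) for v in scaled_ages])
```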
