Highest Voted Questions - Statistical Analysis Stack Exchange

48

votes

6 answers

What are best practices in identifying interaction effects?

Other than literally testing each possible combination of variables in a model (x1:x2 or x1*x2 ... xn-1 * xn), how do you identify whether an interaction SHOULD or COULD exist between your (hopefully) independent variables? What are best practices in…

asked Nov 25 '10 at 05:32

Brandon Bertelsen

7,232

48

votes

6 answers

Why don't linear regression assumptions matter in machine learning?

When I learned linear regression in my statistics class, we were asked to check a few assumptions that need to hold for linear regression to make sense. I won't delve deep into those assumptions; however, these assumptions don't appear when…

asked Sep 09 '20 at 01:10

kamal tanwar

591

48

votes

1 answer

Intuition behind tensor product interactions in GAMs (MGCV package in R)

Generalized additive models are those where, for example, $$ y = \alpha + f_1(x_1) + f_2(x_2) + e_i $$ The functions are smooth and are to be estimated, usually by penalized splines. MGCV is a package in R that does so, and the author (Simon Wood)…

asked Dec 08 '12 at 21:30

generic_user

13,339

48

votes

3 answers

How do DAGs help to reduce bias in causal inference?

I have read in several places that the use of DAGs can help to reduce bias due to confounding, differential selection, mediation, and conditioning on a collider. I also see the term “backdoor path” a lot. How do we use DAGs to reduce these biases, and…

asked Jan 20 '20 at 08:00

LeelaSella

2,030

48

votes

5 answers

Fake uniform random numbers: More evenly distributed than true uniform data

I'm looking for a way to generate random numbers that appear to be uniformly distributed -- and every test will show them to be uniform -- except that they are more evenly distributed than true uniform data. The problem I have with the "true" uniform…

asked Oct 14 '12 at 15:47

Has QUIT--Anony-Mousse

42,358

48

votes

3 answers

When to use a GAM vs GLM

I realize this may be a potentially broad question, but I was wondering whether there are assumptions that indicate the use of a GAM (Generalized additive model) over a GLM (Generalized linear model)? Someone recently told me that GAMs should only…

asked Dec 05 '18 at 11:05

mluerig

701

48

votes

3 answers

SVM, Overfitting, curse of dimensionality

My dataset is small (120 samples); however, the number of features is large, varying from 1,000 to 200,000. Although I'm doing feature selection to pick a subset of features, it might still overfit. My first question is, how does SVM handle…

asked Aug 28 '12 at 20:12

user13420

875

48

votes

1 answer

Why is KL divergence non-negative?

Why is KL divergence non-negative? From the perspective of information theory, I have the following intuitive understanding: Say there are two ensembles $A$ and $B$ which are composed of the same set of elements labeled by $x$. $p(x)$ and $q(x)$ are…

asked Mar 18 '18 at 10:43

meTchaikovsky

1,832
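
A quick numerical sanity check can complement the question above. The sketch below is not taken from the thread; it assumes discrete distributions on a shared finite support, and the helper name `kl_divergence` is hypothetical. It draws random pairs $p$, $q$ and confirms that the computed divergence never drops below zero, with equality when $p = q$ (Gibbs' inequality).

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute 0 by convention
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

rng = np.random.default_rng(0)

# Draw random pairs of distributions over 5 outcomes and record KL(p || q).
values = []
for _ in range(1000):
    p = rng.dirichlet(np.ones(5))
    q = rng.dirichlet(np.ones(5))
    values.append(kl_divergence(p, q))

min_kl = min(values)  # smallest divergence seen across all random pairs
self_kl = kl_divergence([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])  # identical inputs
```

This is only an empirical illustration; the actual proof typically goes through Jensen's inequality applied to the concave logarithm.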

48

votes

1 answer

PCA objective function: what is the connection between maximizing variance and minimizing error?

The PCA algorithm can be formulated in terms of the correlation matrix (assume the data $X$ has already been normalized and we are only considering projection onto the first PC). The objective function can be written as: $$ \max_w (Xw)^T(Xw)\; \:…

asked Jul 12 '12 at 15:09

Cam.Davidson.Pilon

12,153
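
The connection asked about above can be checked numerically. This is a minimal sketch, not the thread's answer: it assumes centered data and a unit-norm direction $w$, and the function names are made up for illustration. For such $w$, the total sum of squares splits into the projected part $(Xw)^T(Xw)$ plus the squared reconstruction error, so maximizing one is the same as minimizing the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic correlated data (an assumption for illustration), then centered.
mixing = np.array([[2.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.3, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing
X = X - X.mean(axis=0)

def projected_variance(X, w):
    """(Xw)^T (Xw): sum of squared projections onto the unit vector w."""
    return float(np.sum((X @ w) ** 2))

def reconstruction_error(X, w):
    """Squared error after replacing each row by its projection onto w."""
    return float(np.sum((X - np.outer(X @ w, w)) ** 2))

total = float(np.sum(X ** 2))

# An arbitrary unit direction.
w = rng.normal(size=3)
w /= np.linalg.norm(w)

# First principal direction: the top right singular vector of X.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
w_pc = Vt[0]

# Gap is ~0 for any unit w: projected part + error = total sum of squares.
gap = projected_variance(X, w) + reconstruction_error(X, w) - total
```

Because the total is fixed by the data, the direction with the largest projected variance (`w_pc`) is also the one with the smallest reconstruction error.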

48

votes

8 answers

Pitfalls in time series analysis

I am just starting out self-learning in time series analysis. I have noticed that there are a number of potential pitfalls that are not applicable to general statistics. So, building on "What are common statistical sins?", I would like to ask: What…

asked Apr 26 '12 at 00:40

naught101

5,453

48

votes

5 answers

What is the difference between a population and a sample?

What is the difference between a population and a sample? What common variables and statistics are used for each one, and how do those relate to each other?

asked Jul 20 '10 at 11:07

Baltimark

2,268

48

votes

2 answers

What exactly is the alpha in the Dirichlet distribution?

I'm fairly new to Bayesian statistics and I came across a corrected correlation measure, SparCC, that uses the Dirichlet process in the backend of its algorithm. I have been trying to go through the algorithm step-by-step to really understand what…

asked Nov 08 '16 at 18:38

O.rka

1,442

48

votes

4 answers

When should I balance classes in a training data set?

I took an online course where I learned that unbalanced classes in the training data might lead to problems, because classification algorithms go for the majority rule, as it gives good results if the imbalance is too large. In an assignment one had…

asked Aug 03 '16 at 14:59

Zelphir Kaltstahl

673

48

votes

4 answers

Evaluation measures of goodness or validity of clustering (without having truth labels)

I'm clustering a set of data but I don't have a ground-truth document that allows me to evaluate the result of clustering (I have unlabelled data), so I cannot use an external evaluation measure. In this case, are there any efficient evaluation measures -…

asked Jan 27 '12 at 12:43

shn

2,959

48

votes

5 answers

How do I fit a constrained regression in R so that coefficients total = 1?

I see a similar constrained regression here: "Constrained linear regression through a specified point", but my requirement is slightly different. I need the coefficients to add up to 1. Specifically I am regressing the returns of 1 foreign exchange…

asked Jan 23 '12 at 16:42

Thomas Browne

631
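
One common way to impose a sum-to-one constraint is by substitution. The sketch below is an illustration under stated assumptions, not the thread's accepted answer: it is written in Python rather than R, omits the intercept, uses synthetic data, and the function name `fit_sum_to_one` is hypothetical. Writing the last coefficient as one minus the sum of the others turns the problem into an ordinary least-squares fit of $y - x_k$ on the differences $x_j - x_k$.

```python
import numpy as np

def fit_sum_to_one(X, y):
    """Least squares with coefficients constrained to sum to 1 (no intercept).

    Substituting b_k = 1 - sum(b_1..b_{k-1}) gives the unconstrained problem
    y - x_k = sum_j b_j * (x_j - x_k), solved here with np.linalg.lstsq.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    last = X[:, -1]
    Z = X[:, :-1] - last[:, None]          # differences x_j - x_k
    free, *_ = np.linalg.lstsq(Z, y - last, rcond=None)
    return np.append(free, 1.0 - free.sum())

# Synthetic example (assumed data): true coefficients already sum to 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_beta = np.array([0.5, 0.3, 0.2])
y = X @ true_beta + 0.01 * rng.normal(size=500)

beta = fit_sum_to_one(X, y)
```

If the coefficients must also be non-negative, this substitution is not enough and a quadratic programming solver would be needed instead.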
