Most Popular

1500 questions
95
votes
4 answers

What is the difference between a "link function" and a "canonical link function" for GLM

What's the difference between terms 'link function' and 'canonical link function'? Also, are there any (theoretical) advantages of using one over the other? For example, a binary response variable can be modeled using many link functions such as…
steadyfish
  • 1,922
95
votes
7 answers

Explain the difference between multiple regression and multivariate regression, with minimal use of symbols/math

Are multiple and multivariate regression really different? What is a variate anyways?
95
votes
5 answers

Understanding the role of the discount factor in reinforcement learning

I'm teaching myself about reinforcement learning, and trying to understand the concept of discounted reward. So the reward is necessary to tell the system which state-action pairs are good, and which are bad. But what I don't understand is why the…
Karnivaurus
  • 7,019
95
votes
6 answers

Essential data checking tests

In my job role I often work with other people's datasets: non-experts bring me clinical data, and I help them to summarise it and perform statistical tests. The problem I am having is that the datasets I am brought are almost always riddled with…
Chris Beeley
  • 5,761
94
votes
7 answers

How to generate uniformly distributed points on the surface of the 3-d unit sphere?

I am wondering how to generate uniformly distributed points on the surface of the 3-d unit sphere? Also after generating those points, what is the best way to visualize and check whether they are truly uniform on the surface $x^2+y^2+z^2=1$?
Qiang Li
  • 1,295
94
votes
6 answers

Feature selection for "final" model when performing cross-validation in machine learning

I am getting a bit confused about feature selection and machine learning and I was wondering if you could help me out. I have a microarray dataset that is classified into two groups and has 1000s of features. My aim is to get a small number of…
94
votes
3 answers

Why is ridge regression called "ridge", why is it needed, and what happens when $\lambda$ goes to infinity?

The ridge regression coefficient estimates $\hat{\beta}^R$ are the values that minimize $$ \text{RSS} + \lambda \sum_{j=1}^p\beta_j^2. $$ My questions are: If $\lambda = 0$, then we see that the expression above reduces to the usual RSS. What if…
cgo
  • 9,107
94
votes
5 answers

What do the residuals in a logistic regression mean?

In answering this question, John Christie suggested that the fit of logistic regression models should be assessed by evaluating the residuals. I'm familiar with how to interpret residuals in OLS: they are on the same scale as the DV and very clearly…
russellpierce
  • 18,599
94
votes
4 answers

Can bootstrap be seen as a "cure" for the small sample size?

This question has been triggered by something I read in this graduate-level statistics textbook and also (independently) heard during this presentation at a statistical seminar. In both cases, the statement was along the lines of "because the sample…
James
  • 2,870
93
votes
7 answers

What are principal component scores?

What are principal component scores (PC scores, PCA scores)?
vrish88
  • 1,213
93
votes
2 answers

Resampling / simulation methods: Monte Carlo, bootstrapping, jackknifing, cross-validation, randomization tests, and permutation tests

I am trying to understand the differences between resampling methods (Monte Carlo simulation, parametric bootstrapping, non-parametric bootstrapping, jackknifing, cross-validation, randomization tests, and permutation tests) and their…
Ram Sharma
  • 2,436
92
votes
7 answers

Euclidean distance is usually not good for sparse data (and more general case)?

I have seen somewhere that classical distances (like Euclidean distance) become weakly discriminant when we have multidimensional and sparse data. Why? Do you have an example of two sparse data vectors where the Euclidean distance does not perform…
shn
  • 2,959
92
votes
7 answers

How to efficiently manage a statistical analysis project?

We often hear of project management and design patterns in computer science, but less frequently in statistical analysis. However, it seems that a decisive step toward designing an effective and durable statistical project is to keep things…
chl
  • 53,725
92
votes
4 answers

Why not approach classification through regression?

Some material I've seen on machine learning said that it's a bad idea to approach a classification problem through regression. But I think it's always possible to do a continuous regression to fit the data and truncate the continuous prediction to…
Strin
  • 1,021
91
votes
10 answers

Why is it possible to get significant F statistic (p<.001) but non-significant regressor t-tests?

In a multiple linear regression, why is it possible to have a highly significant F statistic (p<.001) but very high p-values on all of the regressors' t-tests? In my model, there are 10 regressors. One has a p-value of 0.1 and the rest are above…
Ηλίας
  • 1,569