Highest Voted Questions - Statistical Analysis Stack Exchange

36

votes

5 answers

How do I use the SVD in collaborative filtering?

I'm a bit confused with how the SVD is used in collaborative filtering. Suppose I have a social graph, and I build an adjacency matrix from the edges, then take an SVD (let's forget about regularization, learning rates, sparsity optimizations, etc),…

asked Jun 25 '12 at 20:36

Vishal

1,200

36

votes

1 answer

Differences between a statistical model and a probability model?

Applied probability is an important branch of probability, including computational probability. Since statistics uses probability theory to construct models to deal with data, as I understand it, I am wondering what the essential difference…

asked Jun 23 '12 at 18:40

Honglang Wang

945

36

votes

1 answer

Mathematical differences between GBM, XGBoost, LightGBM, CatBoost?

There exist several implementations of the GBDT family of models, such as GBM, XGBoost, LightGBM, and CatBoost. What are the mathematical differences between these different implementations? CatBoost seems to outperform the other implementations even by…

boosting

asked Oct 12 '17 at 11:43

Metariat

2,526

36

votes

5 answers

Can you overfit by training machine learning algorithms using CV/Bootstrap?

This question may well be too open-ended to get a definitive answer, but hopefully not. Machine learning algorithms, such as SVM, GBM, Random Forest etc, generally have some free parameters that, beyond some rule of thumb guidance, need to be tuned…

asked May 29 '12 at 03:04

Bogdanovist

6,619

36

votes

7 answers

How to generate numbers based on an arbitrary discrete distribution?

How do I generate numbers based on an arbitrary discrete distribution? For example, I have a set of numbers that I want to generate. Say they are labelled from 1-3 as follows. 1: 4%, 2: 50%, 3: 46% Basically, the percentages are probabilities that…

distributions

asked Apr 20 '12 at 23:13

FurtiveFelon

531
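
One common way to approach the question above is inverse-CDF sampling: accumulate the probabilities, draw a uniform number in [0, 1), and pick the first outcome whose cumulative probability exceeds it. The sketch below is only an illustration under that assumption, not an answer taken from the thread; the helper name `make_sampler` and the seed are hypothetical, and the probabilities are the 4% / 50% / 46% from the question.

```python
import bisect
import itertools
import random


def make_sampler(values, probs, seed=None):
    """Return a function that draws one value per call from a discrete distribution.

    `make_sampler` is a hypothetical helper name used for illustration only.
    """
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    # Cumulative distribution, e.g. [0.04, 0.54, 1.0] for the question's example.
    cdf = list(itertools.accumulate(probs))
    rng = random.Random(seed)

    def sample():
        u = rng.random()  # uniform on [0, 1)
        # First index whose cumulative probability is strictly greater than u.
        idx = bisect.bisect_right(cdf, u)
        # Clamp in case floating-point rounding leaves cdf[-1] slightly below 1.
        return values[min(idx, len(values) - 1)]

    return sample


# Example from the question: 1 -> 4%, 2 -> 50%, 3 -> 46%.
sampler = make_sampler([1, 2, 3], [0.04, 0.50, 0.46], seed=0)
draws = [sampler() for _ in range(10_000)]
counts = {v: draws.count(v) for v in (1, 2, 3)}
```

With enough draws, `counts[v] / len(draws)` should land close to the requested probabilities. Python's standard library also offers `random.choices(values, weights=probs, k=n)`, which covers the same use case without a hand-written CDF.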

36

votes

2 answers

Calculate Transition Matrix (Markov) in R

Is there a way in R (a built-in function) to calculate the transition matrix for a Markov chain from a set of observations? For example, taking a data set like the following and calculating the first-order transition…

asked Apr 19 '12 at 00:02

B_Miner

8,630
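
The question above asks about R, and the data set it refers to is not shown in this listing. Purely to illustrate the underlying idea (count consecutive pairs of states, then normalize each row), here is a minimal Python sketch. The function name `transition_matrix` and the observation sequence are hypothetical, and this is not the R built-in the asker is looking for.

```python
from collections import Counter


def transition_matrix(sequence):
    """Estimate first-order transition probabilities from one observed sequence.

    Returns the sorted list of states and a nested dict P where
    P[s][t] is the estimated probability of moving from s to t.
    """
    states = sorted(set(sequence))
    # Count each consecutive (from_state, to_state) pair.
    counts = Counter(zip(sequence, sequence[1:]))

    matrix = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states)
        # A state with no outgoing transitions (e.g. seen only at the very end)
        # gets a row of zeros.
        matrix[s] = {
            t: (counts[(s, t)] / row_total if row_total else 0.0)
            for t in states
        }
    return states, matrix


# Hypothetical observations, not the data from the original question.
observed = ["A", "B", "A", "A", "C", "B", "A"]
states, P = transition_matrix(observed)
```

Here `P["A"]` gives the estimated probabilities of moving from state A to each other state. The estimate is simply the maximum-likelihood row-normalized count table, so rows with few observations will be noisy.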

36

votes

6 answers

Sample size for logistic regression?

I want to make a logistic model from my survey data. It is a small survey of four residential colonies in which only 154 respondents were interviewed. My dependent variable is "satisfactory transition to work". I found that, of the 154 respondents,…

asked Apr 07 '12 at 07:38

Braj-Stat

611

36

votes

3 answers

What is "baseline" in precision recall curve

I'm trying to understand the precision-recall curve. I understand what precision and recall are, but the thing I don't understand is the "baseline" value. I was reading this link…

asked Dec 12 '16 at 19:53

hyeri

461

36

votes

4 answers

LASSO with interaction terms - is it okay if main effects are shrunk to zero?

LASSO regression shrinks coefficients towards zero, thus effectively providing model selection. I believe that in my data there are meaningful interactions between nominal and continuous covariates. Not necessarily, however, are the 'main effects'…

asked Nov 07 '16 at 19:41

tomka

6,572

36

votes

5 answers

Clustering methods that do not require pre-specifying the number of clusters

Are there any "non-parametric" clustering methods for which we don't need to specify the number of clusters? And other parameters like the number of points per cluster, etc.

clustering

asked Oct 20 '16 at 13:51

Learn_and_Share

866

36

votes

2 answers

How to use ordinal logistic regression with random effects?

In my study I will be measuring workload with several metrics: heart-rate variability (HRV), electrodermal activity (EDA), and a subjective scale (IWS). After normalization the IWS has three values: Workload lower than normal, Workload is…

asked Oct 05 '16 at 12:03

Robin Kramer-ten Have

639

36

votes

1 answer

What are the properties of a half Cauchy distribution?

I am currently working on a problem, where I need to develop a Markov chain Monte Carlo (MCMC) algorithm for a state space model. To be able to solve the problem, I have been given the following probability of $\tau$: p($\tau$) =…

asked Oct 01 '16 at 02:16

Christoph

365

36

votes

4 answers

Why is it important to include a bias correction term for the Adam optimizer for Deep Learning?

I was reading about the Adam optimizer for Deep Learning and came across the following sentence in the new book Deep Learning by Bengio, Goodfellow and Courville: Adam includes bias corrections to the estimates of both the first-order moments…

asked Aug 31 '16 at 20:47

Charlie Parker

6,866

36

votes

6 answers

Why do some people use -999 or -9999 to replace missing values?

I have a dataset with lots of missing values. For some columns, the missing values were replaced with -999, but for other columns, the missing values were marked as 'NA'. Why would we use -999 to replace a missing value?

missing-data

asked Jul 22 '16 at 19:47

qqqwww

503

36

votes

1 answer

Cross-validation misuse (reporting performance for the best hyperparameter value)

Recently I came across a paper that proposes using a k-NN classifier on a specific dataset. The authors used all the data samples available to perform k-fold cross-validation for different k values and report cross-validation results of the…

asked Jul 18 '16 at 10:48

Daniel López

5,646
