Most Popular

1500 questions
76
votes
2 answers

Practical questions on tuning Random Forests

My questions are about Random Forests. The concept of this beautiful classifier is clear to me, but I still have many questions about practical usage. Unfortunately, I failed to find any practical guide to RF (I've been searching for something like…
lithuak
  • 1,013
76
votes
4 answers

Is standardization needed before fitting logistic regression?

My question is whether we need to standardize the data set so that all variables have the same scale, in [0,1], before fitting logistic regression. The formula is: $$\frac{x_i-\min(x_i)}{\max(x_i)-\min(x_i)}$$ My data set has 2 variables,…
user1946504
  • 1,337
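
The rescaling formula quoted in the excerpt above can be illustrated with a minimal sketch. The function name `min_max_scale` and the sample values are hypothetical and not taken from the question; the code only shows what the formula does to a single variable, not whether the rescaling is needed for logistic regression.

```python
def min_max_scale(values):
    """Rescale numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by zero when the variable is constant.
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical values for one variable (an assumption for illustration).
ages = [20.0, 30.0, 40.0, 60.0]
scaled = min_max_scale(ages)
print(scaled)
```

The smallest value maps to 0 and the largest to 1, so the result depends entirely on the observed extremes of the variable.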
76
votes
5 answers

How small a quantity should be added to x to avoid taking the log of zero?

I have analysed my data as they are. Now I want to look at my analyses after taking the log of all variables. Many variables contain many zeros. Therefore I add a small quantity to avoid taking the log of zero. So far I've added 10^-10, without any…
miura
  • 3,684
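
A small sketch of why the choice of constant in the excerpt above matters: a zero becomes log(c) after the shift, so the transformed zeros land wherever the constant puts them. The helper `log_with_offset` and the sample data are hypothetical, chosen only for illustration.

```python
import math


def log_with_offset(values, offset):
    """Return log(x + offset) for each x; offset must make every term positive."""
    return [math.log(x + offset) for x in values]


# Hypothetical data containing a zero (an assumption, not from the question).
data = [0.0, 0.5, 2.0]

# Compare several candidate constants, including the 10^-10 from the excerpt.
for offset in (1e-10, 1e-3, 1.0):
    transformed = log_with_offset(data, offset)
    print(offset, [round(v, 3) for v in transformed])
```

With a very small constant the zero is pushed far from the other transformed values, while the non-zero entries barely change.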
76
votes
8 answers

Clustering with a distance matrix

I have a (symmetric) matrix M that represents the distance between each pair of nodes. For example,

      A    B    C    D    E    F    G    H    I    J    K    L
A     0   20   20   20   40   60   60   60  100  120  120  120
B    20    0   20   20   60   80   80   80  120  140  140…
yassin
  • 863
76
votes
11 answers

Is there any *mathematical* basis for the Bayesian vs frequentist debate?

Wikipedia says that "the mathematics [of probability] is largely independent of any interpretation of probability." Question: Then if we want to be mathematically correct, shouldn't we disallow any interpretation of probability? I.e., are…
Chill2Macht
  • 6,249
75
votes
1 answer

How to split the dataset for cross validation, learning curve, and final evaluation?

What is an appropriate strategy for splitting the dataset? I ask for feedback on the following approach (not on the individual parameters like test_size or n_iter, but on whether I used X, y, X_train, y_train, X_test, and y_test appropriately and whether the…
tobip
  • 1,570
75
votes
7 answers

Rule of thumb for number of bootstrap samples

I wonder if someone knows any general rules of thumb regarding the number of bootstrap samples one should use, based on characteristics of the data (number of observations, etc.) and/or the variables included?
hoyem
  • 1,161
75
votes
6 answers

Do the predictions of a Random Forest model have a prediction interval?

If I run a randomForest model, I can then make predictions based on the model. Is there a way to get a prediction interval for each of the predictions, so that I know how "sure" the model is of its answer? If this is possible, is it simply based on…
75
votes
4 answers

Linear model with log-transformed response vs. generalized linear model with log link

In this paper titled "CHOOSING AMONG GENERALIZED LINEAR MODELS APPLIED TO MEDICAL DATA" the authors write: In a generalized linear model, the mean is transformed, by the link function, instead of transforming the response itself. The two methods …
miura
  • 3,684
75
votes
6 answers

Model for predicting number of YouTube views of Gangnam Style

PSY's music video "Gangnam Style" is popular: after a little more than 2 months it has about 540 million viewers. I learned this from my preteen children at dinner last week, and soon the discussion turned to whether it was possible to do…
FredrikD
  • 853
75
votes
4 answers

Standard error for the mean of a sample of binomial random variables

Suppose I'm running an experiment that can have 2 outcomes, and I'm assuming that the underlying "true" distribution of the 2 outcomes is a binomial distribution with parameters $n$ and $p$: ${\rm Binomial}(n, p)$. I can compute the standard error,…
Frank
  • 1,686
75
votes
7 answers

Do all interaction terms need their individual terms in a regression model?

I am actually reviewing a manuscript where the authors compare 5-6 logit regression models with AIC. However, some of the models have interaction terms without including the individual covariate terms. Does it ever make sense to do this? For example…
djhocking
  • 1,931
75
votes
6 answers

Criticism of Pearl's theory of causality

In the year 2000, Judea Pearl published Causality. What controversies surround this work? What are its major criticisms?
Neil G
  • 15,219
75
votes
3 answers

What's the difference between feed-forward and recurrent neural networks?

What is the difference between a feed-forward and recurrent neural network? Why would you use one over the other? Do other network topologies exist?
Shane
  • 12,461
75
votes
7 answers

Which activation function for output layer?

While the choice of activation functions for the hidden layer is quite clear (mostly sigmoid or tanh), I wonder how to decide on the activation function for the output layer. Common choices are linear functions, sigmoid functions and softmax…
Funkwecker
  • 3,082