Most Popular

1500 questions
37
votes
2 answers

Can PCA be applied for time series data?

I understand that Principal Component Analysis (PCA) is mainly applied to cross-sectional data. Can PCA be used effectively for time series data by specifying year as the time series variable and running PCA normally? I have found that dynamic…
Nisha Simon
  • 521
37
votes
5 answers

Is p-value essentially useless and dangerous to use?

The article "The Odds, Continually Updated" from the NY Times happened to catch my attention. In short, it states that [Bayesian statistics] is proving especially useful in approaching complex problems, including searches like the one the Coast…
SixSigma
  • 2,292
37
votes
2 answers

When is t-SNE misleading?

Quoting from one of the authors: t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. So it sounds…
37
votes
4 answers

What is the weak side of decision trees?

Decision trees seem to be a very understandable machine learning method. Once created, a tree can easily be inspected by a human, which is a great advantage in some applications. What are the practical weaknesses of decision trees?
Łukasz Lew
  • 1,412
37
votes
1 answer

Equivalence between least squares and MLE in Gaussian model

I am new to Machine Learning, and am trying to learn it on my own. Recently I was reading through some lecture notes and had a basic question. Slide 13 says that "Least Square Estimate is same as Maximum Likelihood Estimate under a Gaussian model".…
Andy
  • 1,683
37
votes
2 answers

Probability inequalities

I am looking for probability inequalities for sums of unbounded random variables. I would really appreciate it if anyone could offer some thoughts. My problem is to find an exponential upper bound on the probability that the sum of…
Farzad
  • 575
37
votes
3 answers

Satterthwaite vs. Kenward-Roger approximations for the degrees of freedom in mixed models

The lmerTest package provides an anova() function for linear mixed models that uses either Satterthwaite's (default) or Kenward-Roger's approximation of the degrees of freedom (df). What is the difference between these two approaches? When to choose…
doko
  • 471
37
votes
3 answers

Is it possible to change a hypothesis to match observed data (aka fishing expedition) and avoid an increase in Type I errors?

It is well known that researchers should spend time observing and exploring existing data and research before forming a hypothesis and then collecting data to test that hypothesis (referring to null-hypothesis significance testing). Many basic…
post-hoc
  • 697
36
votes
4 answers

Checking if two Poisson samples have the same mean

This is an elementary question, but I wasn't able to find the answer. I have two measurements: n1 events in time t1 and n2 events in time t2, both produced (say) by Poisson processes with possibly-different lambda values. This is actually from a…
Charles
  • 1,238
36
votes
2 answers

Relative importance of a set of predictors in a random forests classification in R

I'd like to determine the relative importance of sets of variables in a randomForest classification model in R. The importance function provides the MeanDecreaseGini metric for each individual predictor; is it as simple as summing this across…
36
votes
3 answers

Datasets constructed for a purpose similar to that of Anscombe's quartet

I've just come across Anscombe's quartet (four datasets that have almost indistinguishable descriptive statistics but look very different when plotted) and I am curious if there are other more or less well-known datasets that have been created to…
Hibernating
  • 3,943
36
votes
5 answers

Overfitting a logistic regression model

Is it possible to overfit a logistic regression model? I saw a video saying that if my area under the ROC curve is higher than 95%, then it's very likely to be overfitted, but is it actually possible to overfit a logistic regression model?
36
votes
1 answer

Comparing hierarchical clustering dendrograms obtained by different distances & methods

[The initial title "Measurement of similarity for hierarchical clustering trees" was later changed by @ttnphns to better reflect the topic] I am performing a number of hierarchical cluster analyses on a dataframe of patient records (e.g. similar to…
Wouter
  • 2,192
36
votes
3 answers

PCA on correlation or covariance: does PCA on correlation ever make sense?

In principal component analysis (PCA), one can choose either the covariance matrix or the correlation matrix to find the components (from their respective eigenvectors). These give different results (PC loadings and scores), because the eigenvectors…
Lucozade
  • 659
36
votes
2 answers

What's the difference between "deep learning" and multilevel/hierarchical modeling?

Is "deep learning" just another term for multilevel/hierarchical modeling? I'm much more familiar with the latter than the former, but from what I can tell, the primary difference is not in their definition, but how they are used and evaluated…
user4733
  • 2,714