
Reading about t-SNE and looking at the pretty plots, it seems very good at separating, in low dimensions, the things we "expect" to be separate. Why wouldn't we use it for dimensionality reduction before applying a classification algorithm (something data-hungry like a DNN, for example)?

EDIT: to rephrase and generalize slightly further: since t-SNE preserves separability so well when it is nicely tuned, why not use t-SNE in 2 or 3 dimensions followed by a nonlinear classifier, instead of a standard reduction to $k$ dimensions like PCA or ICA?
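
For concreteness, here is a rough sketch of the pipeline I have in mind (a minimal scikit-learn example; the digits data and the SVM are just placeholders for whatever dataset and nonlinear classifier you prefer):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Embed all the data with t-SNE, then fit a nonlinear classifier
# on the 2-D embedding.
X, y = load_digits(return_X_y=True)
Z = TSNE(n_components=2, perplexity=30).fit_transform(X)

clf = SVC(kernel="rbf").fit(Z, y)
print(accuracy_score(y, clf.predict(Z)))  # training accuracy only
```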

  • Possible duplicate: https://stats.stackexchange.com/questions/263539/k-means-clustering-on-the-output-of-t-sne – generic_user Oct 26 '17 at 13:10
  • I think this is a little different, because k-means, for example, is a generative approach, and as the answer there points out, much information is lost in terms of density. But if separation is preserved, then why not run a complicated discriminative approach? – bibliolytic Oct 26 '17 at 13:24
  • IDK, is separation truly preserved? – generic_user Oct 26 '17 at 13:44
  • Heuristically speaking I'd say yes, judging by van der Maaten's papers – bibliolytic Nov 01 '17 at 16:01
  • "something data-hungry like a DNN for example" neural nets don't mind high dimensional space, but if you want to use a projection (to make it interpretable), you might as well let the nnet pick which one to use: https://arxiv.org/abs/1902.10527 – John Madden Apr 05 '23 at 13:19

1 Answer


I agree that this sounds like a great idea. Indeed, some of my t-SNE plots have shown great separation of the data in just two dimensions. The trouble is that t-SNE provides no way to map new points into the low-dimensional space: it directly optimizes the embedding coordinates of the training points rather than learning a function from the input space. PCA, by contrast, gives you a linear transformation you can apply to new points. Consequently, you cannot make predictions on new data, and you cannot even use standard validation techniques such as test sets, cross-validation, or the bootstrap.
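
To see the asymmetry concretely, here is a minimal scikit-learn sketch (the synthetic data is purely illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 50))
X_new = rng.normal(size=(10, 50))

# PCA learns an explicit linear map, so unseen points can be projected.
pca = PCA(n_components=2).fit(X_train)
Z_new = pca.transform(X_new)

# t-SNE only optimizes coordinates for the points it was fit on;
# scikit-learn's TSNE has no transform() method for new data.
tsne = TSNE(n_components=2, perplexity=30)
Z_train = tsne.fit_transform(X_train)
# tsne.transform(X_new)  # would raise AttributeError
```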

Dave