How do procedures such as Principal Component Analysis, Logistic Regression, and Cross-Validation perform on zero-heavy data? Are they sub-optimal or simply inadequate?

stats_noob
  • That will very much depend on your specific problem. PCA, logistic regression and cross validation are very different techniques used to answer very different questions, after all. "Zero heavy data" could refer to "many" zeros in either your independent or your dependent variable. Perhaps you could be a little more specific in your question? – Stephan Kolassa Nov 03 '15 at 21:39
  • I have been very confused about this topic - I think my understanding was clearer at the start of all this. According to the literature I have read, overdispersion (zero-heavy data) is more of a problem in the response variable than in the covariates? – stats_noob Nov 05 '15 at 00:22
  • If you have "too" many zeros, then you may want to look at zero-inflated models (e.g., zero-inflated Poisson or negbin), or hurdle models. Right now, your question is a bit too vague for us to be able to help you. – Stephan Kolassa Nov 05 '15 at 09:18
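
To make the zero-inflated suggestion in the last comment a bit more concrete, here is a minimal sketch, not something posted in the thread. It assumes the zeros are in a count response, simulates data with made-up parameters (40% structural zeros, Poisson mean 3), compares the observed share of zeros with what a plain Poisson at the same mean would predict, and then fits an intercept-only zero-inflated Poisson by EM. The function names `poisson_draw`, `simulate_zip` and `fit_zip_em` are hypothetical helpers written for this illustration; a real analysis with covariates would more likely use a dedicated package.

```python
import math
import random


def poisson_draw(rng, lam):
    """One Poisson(lam) draw via Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_zip(n, pi, lam, seed=0):
    """Simulate n zero-inflated Poisson counts: structural zero with probability pi."""
    rng = random.Random(seed)
    return [0 if rng.random() < pi else poisson_draw(rng, lam) for _ in range(n)]


def fit_zip_em(counts, n_iter=200):
    """Fit an intercept-only zero-inflated Poisson by EM; returns (pi_hat, lam_hat)."""
    n = len(counts)
    total = sum(counts)
    n_zero = sum(1 for y in counts if y == 0)
    pi, lam = 0.5, max(total / n, 1e-8)  # crude starting values
    for _ in range(n_iter):
        # E-step: probability that an observed zero is a structural zero
        w = pi / (pi + (1.0 - pi) * math.exp(-lam))
        structural = w * n_zero
        # M-step: update the mixing weight and the Poisson mean
        pi = structural / n
        lam = total / (n - structural)
    return pi, lam


# Illustrative parameters only (assumed, not taken from the question)
counts = simulate_zip(n=5000, pi=0.4, lam=3.0, seed=42)

sample_mean = sum(counts) / len(counts)
observed_zero_frac = sum(1 for y in counts if y == 0) / len(counts)
poisson_zero_frac = math.exp(-sample_mean)  # P(Y = 0) under a plain Poisson

pi_hat, lam_hat = fit_zip_em(counts)

print(f"observed share of zeros:       {observed_zero_frac:.3f}")
print(f"share a plain Poisson expects: {poisson_zero_frac:.3f}")
print(f"ZIP estimates: pi = {pi_hat:.3f}, lambda = {lam_hat:.3f}")
```

If the observed share of zeros is far above what the plain Poisson expects, that is one rough sign that a zero-inflated or hurdle model may be worth considering. This only addresses zeros in the response; zeros in the covariates are a separate issue that the sketch does not touch.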

0 Answers