I'm currently training a spam-detection model on a training set of 4757 observations and 3000 variables, each a word-frequency count. It is taking forever and I have a deadline coming up, so I wanted to run the code by you guys to make sure I'm not waiting for nothing.

Any advice would be much appreciated.
With n=4757, p=3000, I would suggest that you would be better off using a penalized regression via glmnet - there may be some more startup cost in figuring out how to use the machinery, but it will be better (and probably faster) in the long run. – Ben Bolker May 23 '22 at 17:05

glmnet is evidently supported by caret. There's also a Firth penalization of logistic regression models implemented in the R logistf package. I don't know if that's directly supported by caret, but I understand that you can write your own methods to call that. – EdM May 23 '22 at 19:02
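For anyone wanting to try the glmnet suggestion from the comments, here is a minimal sketch of a lasso-penalized logistic regression with a cross-validated penalty. The question's data and code are not shown, so everything here is an assumption: the simulated count matrix, the reduced dimensions (500 x 200 rather than 4757 x 3000, so it runs quickly), and the names `counts`, `x`, `y`, and `cv_fit` are all hypothetical stand-ins.

```r
library(glmnet)
library(Matrix)

set.seed(1)

# Hypothetical stand-in for the word-frequency data (not the asker's data)
n <- 500
p <- 200
counts <- matrix(rpois(n * p, lambda = 0.1), nrow = n, ncol = p)

# Simulated spam labels: only the first 5 "words" carry signal
beta <- c(rep(1, 5), rep(0, p - 5))
y <- rbinom(n, size = 1, prob = plogis(drop(counts %*% beta) - 0.5))

# Word counts are mostly zero, so store them sparsely; glmnet accepts this
x <- Matrix(counts, sparse = TRUE)

# Lasso (alpha = 1) logistic regression, penalty chosen by 5-fold CV
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1, nfolds = 5)

# Predicted spam probabilities at the CV-selected penalty
prob <- predict(cv_fit, newx = x, s = "lambda.min", type = "response")

# Number of words kept in the model (intercept excluded)
n_selected <- sum(as.matrix(coef(cv_fit, s = "lambda.min"))[-1, 1] != 0)
```

With real data, `x` would be the 4757 x 3000 count matrix and `y` the spam labels. The same model can also be fitted through caret with `method = "glmnet"`, as the second comment notes, although calling `cv.glmnet` directly avoids caret's extra resampling overhead.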