Should I balance the data set for a survival random forest? By subsampling I would lose information in the data set. However, I would do that for a random forest used for classification. Should it also be done in survival analysis? I am not sure whether there is a conceptual difference.
- What do you mean by survival random forest? – Itamar Mushkin Jul 27 '20 at 12:48
- A random forest with a survival object as the response variable. It is trained with the package randomForestSRC in R. – pikachu Jul 27 '20 at 12:55
- Don't balance, in either case. See "Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?" – Stephan Kolassa Jul 27 '20 at 13:20
1 Answer
Don't balance, in either case. See "Are unbalanced datasets problematic, and (how) does oversampling (purport to) help?"
(Converted from a comment. For my rationale, see here. On short answers, see here. Better and longer answers are always welcome.)
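As a rough illustration of why subsampling is risky in the survival setting specifically, here is a minimal sketch in plain Python. It was not part of the original comment, and it is not randomForestSRC code: the toy data and the helper `kaplan_meier` are made up for this example. The point it tries to show is that censored observations still count toward the risk set, so dropping some of them to "balance" events against censored cases shrinks the at-risk counts and pulls the estimated survival curve downward.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored data.

    events: 1 = event observed, 0 = censored.
    Returns a dict mapping each event time to the estimated S(t).
    """
    surv = 1.0
    curve = {}
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Everyone with time >= t is still at risk, censored or not.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve[t] = surv
    return curve


# Hypothetical toy data: 4 events and 8 censored observations.
times = [2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 12]
events = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0]

# "Balance" by keeping all events but only every other censored observation.
censored_seen = 0
bal_times, bal_events = [], []
for t, e in zip(times, events):
    if e == 0:
        keep = censored_seen % 2 == 0
        censored_seen += 1
        if not keep:
            continue
    bal_times.append(t)
    bal_events.append(e)

full_curve = kaplan_meier(times, events)
balanced_curve = kaplan_meier(bal_times, bal_events)

for t in full_curve:
    print(f"t={t}: full={full_curve[t]:.3f}  balanced={balanced_curve[t]:.3f}")
```

On this toy data the "balanced" curve sits below the full-data curve at every event time, even though not a single event was removed. A survival forest is a more complex estimator than Kaplan-Meier, so take this only as intuition for the conceptual difference the question asks about, not as a statement about what randomForestSRC does internally.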
Stephan Kolassa