
Could anyone help me understand whether the proportional hazards assumption is met in my model?

For context, I have a large dataset with >10 million observations (up to 40 observations for each of >27k distinct firms). The data are quarterly and in counting-process style: each observation is one quarter and records whether the firm failed in that quarter, up until failure or censoring (end of data collection, or the firm reaching 40 quarters).
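To make the data layout concrete, here is a minimal sketch in Python (not Stata) of what counting-process rows look like. The function name, the firm labels "A" and "B", and the tuple layout `(firm, start, stop, event)` are hypothetical and only illustrate the structure described above.

```python
def counting_process_rows(firm_id, quarters_observed, failed):
    """Build one (firm, start, stop, event) row per quarter at risk."""
    rows = []
    for q in range(1, quarters_observed + 1):
        # The event flag is 1 only in the last observed quarter, and only if the firm failed.
        event = 1 if (failed and q == quarters_observed) else 0
        rows.append((firm_id, q - 1, q, event))
    return rows


# Hypothetical firm "A" fails in its 3rd quarter.
rows_a = counting_process_rows("A", 3, failed=True)
# Hypothetical firm "B" is censored after reaching 40 quarters.
rows_b = counting_process_rows("B", 40, failed=False)

print(rows_a)
print(len(rows_b), sum(row[3] for row in rows_b))
```

A failed firm contributes exactly one row with `event == 1`; a censored firm contributes none.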

My independent variables are team diversity in gender, age and nationality, as well as team size. I also control for industry and country with dummy variables, which are static (time-invariant).

When checking the proportional hazard assumption I have used several methods, which lead me to different conclusions.

As a note: I am using Stata as my statistical software.

Using scaled Schoenfeld residuals:

enter image description here

However, since I have a large dataset, I also want to check the assumption graphically, as I have read in several places online that a large sample size can drive the significance of this test (e.g. Test Cox proportional hazard assumption (Bad Schoenfeld residuals)).
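To illustrate the sample-size concern, here is a minimal Python sketch (not Stata, and not the Schoenfeld test itself). It assumes a fixed, practically negligible correlation `r = 0.01` between a residual and time, which is a made-up value, and uses a normal approximation to show how the p-value for "correlation = 0" changes as the number of observations grows.

```python
import math


def two_sided_p_for_correlation(r, n):
    """Approximate two-sided p-value for H0: correlation = 0 (normal approximation)."""
    z = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


r = 0.01  # hypothetical, practically negligible correlation
for n in (1_000, 100_000, 10_000_000):
    print(n, two_sided_p_for_correlation(r, n))
```

The same tiny association is far from significant at a thousand observations and overwhelmingly significant at ten million, which is why a significant test alone says little about the size of the departure.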

Graphically, however, the line for my independent variables (diversity and team size) is horizontal (the image includes a horizontal reference line along the x-axis to compare with). The results for all variables are very similar to the image below. This would lead me to conclude that the assumption is met.

enter image description here

However, when I add interactions with time (using the tvc() option in Stata), I find significant results, which would suggest that the assumption is not met.

enter image description here
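One way to judge whether a significant time interaction matters in practice is to look at how much the hazard ratio actually moves over the 40 quarters. The Python sketch below (again not Stata) assumes the interaction is linear in time, as with the default `texp(_t)` of `tvc()`, so the log-hazard effect is `beta + gamma * t`. The values of `beta` and `gamma` are hypothetical and are not estimates from this model.

```python
import math


def hazard_ratio_at(t, beta, gamma):
    """Hazard ratio per unit of the covariate at time t, for a log-hazard effect beta + gamma * t."""
    return math.exp(beta + gamma * t)


beta, gamma = 0.10, -0.002  # hypothetical values, NOT estimates from this model
for quarter in (1, 20, 40):
    print(quarter, round(hazard_ratio_at(quarter, beta, gamma), 3))
```

If the hazard ratio barely changes between the first and last quarter, the violation may be statistically detectable but small in substantive terms.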

Any advice? I'm still a novice when it comes to survival analysis.

Laura Hill