I am studying the effect of expression of a specific gene signature on patient survival, dividing patients into Low, Mid, and High expression of Signature X. The survival curves for these patients all look very different from each other, and I'm trying to get a statistical value to support this statement (patients with Low, Mid and High expression have different survival trends).
I used the coxph function in the survival package in R and obtained this:
> coxph(Surv ~ Sig.X.Group, data = data)
              coef exp(coef) se(coef)  z      p
Sig.X.Group 1.0310    2.8039   0.0795 13 <2e-16
If I had divided patients into only Low and High expression of Sig.X, I understand the interpretation would be: "High" expression of Sig.X is associated with a 2.8-fold increase in the hazard of death compared to "Low" expression (or something along those lines).
But what is the plain-English interpretation of an HR of 2.8 when I run coxph on 3 groups rather than 2?
EDIT: Fixed. The problem was that Sig.X.Group was coded as a number. Converting Sig.X.Group to a factor resolved it and gave me two separate HRs, each relative to the first group:
                  coef exp(coef) se(coef)       z  Pr(>|z|)
Sig.X.Group2 0.5821074  1.789806   0.1696  3.4318 0.0005993
Sig.X.Group3 1.9186438  6.811714   0.1550 12.3753 0.0000000
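For comparison, here is a sketch of what the first (numeric) fit was estimating. It assumes the groups were coded 1, 2, 3 (the coding is my assumption; it is not shown in the output). With a numeric covariate the model has a single slope, so every one-step increase in group multiplies the hazard by the same factor:

```latex
\begin{aligned}
h(t \mid x) &= h_0(t)\, e^{\beta x}, \qquad x \in \{1, 2, 3\} \\
\mathrm{HR}_{2\ \mathrm{vs}\ 1} = \mathrm{HR}_{3\ \mathrm{vs}\ 2} &= e^{\beta} = e^{1.0310} \approx 2.80 \\
\mathrm{HR}_{3\ \mathrm{vs}\ 1} &= e^{2\beta} = e^{2.0620} \approx 7.86
\end{aligned}
```

So under that coding, the HR of 2.8 reads as "each step up in expression group is associated with about 2.8 times the hazard of death", with both steps forced to be equal. The factor fit drops that constraint and estimates the two comparisons with group 1 separately (1.79 and 6.81).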
Sig.X.Group seems to be coded as a numeric variable. E.g., groups called "1", "2", and "3" will tend to be dealt with as numeric values instead of factor levels unless specified otherwise. Make sure that Sig.X.Group is coded as a factor before you call coxph. Also, summary(coxph()) provides more useful information. – EdM Jan 30 '18 at 21:48
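The advice in the comment above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the original data set: the names dat, grp, grp_f, time, and status, the simulated hazards, and the Low/Mid/High labels for codes 1/2/3 are all assumptions made for the example.

```r
library(survival)

# Simulated stand-in data (hypothetical; not the data from the question)
set.seed(1)
n <- 300
grp <- sample(1:3, n, replace = TRUE)               # numeric group codes 1, 2, 3
time <- rexp(n, rate = 0.1 * c(1, 1.8, 6.8)[grp])   # higher group => higher hazard
status <- rep(1, n)                                 # every event observed, for simplicity
dat <- data.frame(time, status, grp)

# Numeric coding: one coefficient, the log HR per one-step increase in grp
fit_num <- coxph(Surv(time, status) ~ grp, data = dat)

# Factor coding: one coefficient per non-reference level (Mid vs Low, High vs Low)
dat$grp_f <- factor(dat$grp, levels = 1:3, labels = c("Low", "Mid", "High"))
fit_fac <- coxph(Surv(time, status) ~ grp_f, data = dat)

# Hazard ratios with 95% confidence intervals
summary(fit_fac)$conf.int
```

Printing summary(fit_fac) in full also reports the overall likelihood ratio, Wald, and score (logrank) tests, which are probably the closest thing to a single statistic for "the three groups differ".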