
As far as I understand, relative risks and hazard ratios are very similar concepts (although this paper seems to disagree). The advantage of hazard ratios is that they include temporal information. I always thought that the Cox model, at least theoretically, requires that events and censoring can take place at any given moment, so that time is necessarily a continuous variable. Obviously, time is usually discretised, but I considered this acceptable as long as there are many (preferably equidistant) time points at which an event or censoring can occur. The other day someone told me that Cox regression poses no problem even if there is just one point in time. So my first question is: Is that true?

And if so, shouldn't the hazard ratio and the relative risk be equal, as there would be no additional temporal information? Here is an example with 150/200 events for placebo and 130/200 for verum, where the event or censoring time is the same for all patients.

> library(survival)
> d=data.frame(trt=(c(rep(1,200),rep(0,200))),event=c(rep(1,130),rep(0,70),rep(1,150),rep(0,50)),time=2)
> (cox.model=(coxph(Surv(time,event)~trt,data=d,ties="exact")))
Call:
coxph(formula = Surv(time, event) ~ trt, data = d, ties = "exact")

       coef exp(coef) se(coef)      z      p
trt -0.4784    0.6198   0.2203 -2.172 0.0299

Likelihood ratio test=4.77  on 1 df, p=0.02901
n= 400, number of events= 280
> exp(confint(cox.model))
        2.5 %    97.5 %
trt 0.4024838 0.9544346

(Is the "exact" method for ties the correct choice here? "Efron" gives a quite different hazard ratio of 0.7820, and "Breslow" is far from significant.) The relative risk, on the other hand, is this:

> meta::metabin(130,200,150,200,sm="RR")
Number of observations: o = 400
Number of events: e = 280

     RR           95%-CI     z p-value
 0.8667 [0.7615; 0.9864] -2.17  0.0302

Interestingly, the odds ratio is much closer but also not equal:

     OR           95%-CI     z p-value
 0.6190 [0.4018; 0.9538] -2.17  0.0297

My second question is: Should HR and RR (or OR) be the same under these conditions?

diffset

2 Answers

1

The other day someone told me that there is no problem even if there is just one point in time in Cox regression. So my first question is: Is that true?

That depends on how you implement the Cox regression, as you illustrate. As discussed here, tied event times pose problems with Cox regression. The "exact partial likelihood" works OK with small data sets, but it requires "exhaustive enumeration of the possible risk sets at each tied death time, and can require a prohibitive amount of computation time." The Efron approximation to that exact partial likelihood is generally better than the Breslow approximation, but even it performs poorly at times when there is a very high ratio of ties to the size of the risk set.
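
To see how much the tie-handling method matters for this data set, the three options can be fitted side by side. This is only a sketch: the data frame `d` is rebuilt from the code in the question, and `methods` and `hr` are names introduced here for illustration.

```r
library(survival)

# Data from the question: 130/200 events on verum (trt = 1),
# 150/200 on placebo (trt = 0), all at the same time point.
d <- data.frame(
  trt   = c(rep(1, 200), rep(0, 200)),
  event = c(rep(1, 130), rep(0, 70), rep(1, 150), rep(0, 50)),
  time  = 2
)

# Fit the same Cox model once per tie-handling method
methods <- c("exact", "efron", "breslow")
hr <- sapply(methods, function(m) {
  fit <- coxph(Surv(time, event) ~ trt, data = d, ties = m)
  unname(exp(coef(fit)))
})

round(hr, 4)
```

The question reports 0.6198 for "exact" and 0.7820 for "efron". For "breslow", with every event tied at a single time the score equation reduces to e^β/(1 + e^β) = 130/280, so that fit should return 130/150 ≈ 0.8667, which is the relative risk for this data set.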

The odds ratio and the risk ratio are different, although they can be similar if events are rare. See this web page or this Cross Validated page, for example. With a single time point, the hazard ratio from the "exact" method should match the odds ratio. Your values are so close that I suspect the difference is due to numerical issues from the multiple calculations associated with applying the "exact" method to a risk set of 400 individuals.
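
The arithmetic behind the two ratios can be checked directly from the 2x2 counts in the question. The sketch below also computes the conditional maximum-likelihood odds ratio that `fisher.test` reports. Comparing it with the "exact" Cox estimate is a suggestion made here, not something established above: the exact partial likelihood with a single time point is a conditional likelihood, which may be another reason for the small gap to the sample odds ratio. All variable names are introduced for illustration.

```r
# 2x2 counts from the question: events / total per arm
ev_trt <- 130; n_trt <- 200   # verum
ev_ctl <- 150; n_ctl <- 200   # placebo

# Relative risk: ratio of event proportions
rr <- (ev_trt / n_trt) / (ev_ctl / n_ctl)

# Sample (unconditional) odds ratio: ratio of event odds
or_sample <- (ev_trt / (n_trt - ev_trt)) / (ev_ctl / (n_ctl - ev_ctl))

# Conditional MLE of the odds ratio, as reported by fisher.test()
tab <- matrix(
  c(ev_trt, n_trt - ev_trt, ev_ctl, n_ctl - ev_ctl),
  nrow = 2,
  dimnames = list(outcome = c("event", "no event"),
                  arm = c("verum", "placebo"))
)
or_cond <- unname(fisher.test(tab)$estimate)

round(c(RR = rr, OR_sample = or_sample, OR_conditional = or_cond), 4)
```

The first two values reproduce the 0.8667 and 0.6190 shown in the question; the third can be compared with the 0.6198 from `coxph(..., ties = "exact")`.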

EdM
  • Thanks, I knew about the difference between OR and RR, of course, but HR is a more challenging concept. Thanks for the link to the discrete-time question in Cox regression. I wonder why I didn't find it myself when I searched for questions related to mine before I posted it. – diffset Jun 05 '22 at 10:21
  • @diffset it can be hard to find what you want in a search when you don't already know the terminology in a field. For RR and OR, we try to write answers that will inform not only the questioner but also new readers of the question and answer, so I tend to add some critical things to answers even if I think that the questioner already knows them. A particular connection of Cox proportional-hazards continuous-time models to binomial models (with a complementary log-log link) over discrete time intervals is summarized here. – EdM Jun 05 '22 at 17:16
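
The binomial model with a complementary log-log link mentioned in the comment above can be tried on the question's data. This is a minimal sketch that assumes the single time point is treated as one discrete interval; `fit` and `hr_cloglog` are names introduced here, not taken from the thread.

```r
# Data from the question, without the (constant) time column
d <- data.frame(
  trt   = c(rep(1, 200), rep(0, 200)),
  event = c(rep(1, 130), rep(0, 70), rep(1, 150), rep(0, 50))
)

# Binomial GLM with complementary log-log link:
# log(-log(1 - p)) = a + b * trt, so exp(b) is a hazard ratio
# under a grouped-time proportional-hazards model.
fit <- glm(event ~ trt, family = binomial(link = "cloglog"), data = d)
hr_cloglog <- unname(exp(coef(fit)["trt"]))

# With one binary covariate the model is saturated, so the estimate
# has the closed form log(1 - p_trt) / log(1 - p_ctl).
hr_closed <- log(1 - 130 / 200) / log(1 - 150 / 200)

round(c(glm = hr_cloglog, closed_form = hr_closed), 4)
```

Because the model is saturated, the fitted value is just a transformation of the two event proportions; it is a third summary alongside RR and OR rather than a replacement for either.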
0

These terms and calculations are essentially convenient ways to bias results of studies in various directions to support the hoped-for outcomes.

Example - Relative risk... Pharma Co. X produces its latest/greatest FLU vaccine, and needs some data to bring it to market.

So - it recruits 2000 healthy volunteers. 1000 are given vax. 1000 are not. Company exposes all to FLU virus.

1 of vaxed group tests positive for flu. 2 of non-vaxed test positive.

THE ABSOLUTE difference is OBVIOUSLY 1/1000 (.001). In other words - zero statistical difference.

This sells ZERO vaccine doses. Data must be spun.

SO -- Enter RELATIVE RISK!! ONE vaxed person has flu. TWO non-vaxed... OH -- that's a 50% difference. VOILA.. - 50% RELATIVE RISK reduction with the vax.

Totally corrupt - but totally acceptable for their marketing.

  • RR is a very useful concept, e.g., when only a portion of the subjects are actually exposed to the risk, as in your vaccination example. (Subjects are usually not injected with the virus in those situations but are possibly exposed to it like everyone else by coming into contact with it in real life.) And contrary to what you seem to imply, RR can't be used to fake significance. Indeed, in your example neither the risk difference (RD, CI, p): -0.001 [-0.004; 0.002] 0.5634 nor the relative risk (RR, CI, p): 0.500 [0.046; 5.505] 0.5712 is anywhere near significance, and the p-values are very similar. – diffset Feb 12 '23 at 11:14
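
The figures in the comment above can be approximated by hand in base R. This sketch assumes Wald-type intervals (a normal approximation on the raw scale for RD and on the log scale for RR); the comment does not say which method was used, so the last digit of an interval bound may differ. All variable names are illustrative.

```r
# Counts from the vaccine example: 1/1000 vaccinated, 2/1000 unvaccinated
ev_vax <- 1; n_vax <- 1000
ev_ctl <- 2; n_ctl <- 1000
p_vax <- ev_vax / n_vax
p_ctl <- ev_ctl / n_ctl
z975 <- qnorm(0.975)

# Risk difference with Wald standard error
rd    <- p_vax - p_ctl
se_rd <- sqrt(p_vax * (1 - p_vax) / n_vax + p_ctl * (1 - p_ctl) / n_ctl)
ci_rd <- rd + c(-1, 1) * z975 * se_rd
p_rd  <- 2 * pnorm(-abs(rd / se_rd))

# Relative risk with Wald standard error on the log scale
rr        <- p_vax / p_ctl
se_log_rr <- sqrt(1 / ev_vax - 1 / n_vax + 1 / ev_ctl - 1 / n_ctl)
ci_rr     <- exp(log(rr) + c(-1, 1) * z975 * se_log_rr)
p_rr      <- 2 * pnorm(-abs(log(rr) / se_log_rr))

round(c(RD = rd, lower = ci_rd[1], upper = ci_rd[2], p = p_rd), 4)
round(c(RR = rr, lower = ci_rr[1], upper = ci_rr[2], p = p_rr), 4)
```

Both p-values come out well above any conventional threshold, which is the point of the comment: switching from the absolute to the relative scale changes the headline number, not the strength of evidence.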