
An odds is the ratio of the probability of an event to the probability of its complement:

$$\text{odds}(X) = \frac{P(X)}{1-P(X)}$$

An odds ratio (OR) is the ratio of the odds of an event in one group (say, $A$) versus the odds of an event in another group (say, $B$):

$$\text{OR}(X)_{A\text{ vs }B} = \frac{\frac{P(X|A)}{1-P(X|A)}}{\frac{P(X|B)}{1-P(X|B)}}$$

A probability ratio¹ (PR, aka prevalence ratio) is the ratio of the probability of an event in one group ($A$) versus the probability of an event in another group ($B$):

$$\text{PR}(X)_{A\text{ vs }B} = \frac{P(X|A)}{P(X|B)}$$

An incidence proportion can be thought of as pretty similar to a probability (although technically it is the probability of an event occurring over a specified period of time), and we contrast incidence proportions (and incidence densities, for that matter) using relative risks (aka risk ratios, RR), along with other measures like risk differences:

$$\text{RR}_{A\text{ vs }B} = \frac{\text{incidence proportion}(X|A)}{\text{incidence proportion}(X|B)}$$
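
To make the definitions concrete, here is a minimal sketch with made-up counts (40 events out of 100 in group $A$, 20 out of 100 in group $B$; the numbers are purely illustrative):

```python
# Hypothetical counts: 40/100 events in group A, 20/100 events in group B
p_A, p_B = 40 / 100, 20 / 100

odds_A = p_A / (1 - p_A)   # 0.667
odds_B = p_B / (1 - p_B)   # 0.25

OR = odds_A / odds_B       # ~2.67, odds ratio A vs B
PR = p_A / p_B             # 2.0, probability (prevalence) ratio A vs B
                           # (also the RR if these were incidence proportions)
print(OR, PR)
```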

Why are relative probability contrasts so often represented using relative odds instead of probability ratios, when risk contrasts are represented using relative risks instead of odds ratios (calculated using incidence proportions instead of probabilities)?

My question is foremost about why ORs are preferred to PRs, rather than about why incidence proportions are not used to calculate a quantity like an OR. Edit: I am aware that risks are sometimes contrasted using a risk odds ratio.

¹ As near as I can tell… I do not actually encounter this term in my discipline other than very rarely.

Alexis
  • A few thoughts... logistic regression produces conditional ORs, and it's a common model for binary data. Other models exist and could be used instead, but rarely are in practice. Also, for case-control studies (matching based on outcome), only ORs can be validly estimated. Using ORs for prospective studies makes results comparable. ORs are also symmetric: changing 1s to 0s and 0s to 1s gives the inverse of the OR; not so for the RR. – Noah Feb 25 '20 at 02:00
  • @Noah I am not sure I follow your point about case-control studies: if one can calculate the OR of exposure for cases vs controls, one can certainly calculate the PR of exposure for cases versus controls: they are simply different representations of the same data. Do you mean that under certain circumstances, ORs in case-control studies can be used to approximate RRs? (Thank you for your thoughts!) – Alexis Feb 25 '20 at 03:40
  • @Noah "Using ORs for prospective studies makes results comparable." So does using RRs and, for that matter, RDs, but the latter, being risk comparisons are generally preferred. No, I am asking why one uses an OR and not a PR (or not also a PR). For example, with cross-sectional data. – Alexis Feb 25 '20 at 03:40
  • @Noah "ORs are also symmetric: changing 1s to 0 and 0s to 1 give the inverse of the OR; not so for RR.", yes but I am asking about PR, still the math holds there. OTOH: ORs are more or less uninterpretable (in the sense of poor intuitively graspable meaning) as compared with PRs, and PDs, something which the literature lands on... so why aren't PRs used nearly as often? Is it simply that logistic regression is such a useful too, and there is not comparable regression for estimating raw or relative probabilities? – Alexis Feb 25 '20 at 03:40
  • Here is a nice summary of some of these points. RR and PR are invalid in case-control studies; only the OR is valid. If you want to compare the results across different types of studies, putting them all on a valid measure (i.e., the OR) makes that straightforward. I agree that for other designs RR, PR, or RD make far more sense. – Noah Feb 26 '20 at 05:18
  • Also, you may be assuming researchers are acting in a rational and fully informed way. In reality, many researchers do not know statistics well enough to compute the RR from a logistic regression because it doesn't correspond to any of the estimated parameters. People do what they know even if it's suboptimal. – Noah Feb 26 '20 at 05:19
  • @Noah Why is PR invalid in a case-control study? One can certainly calculate $\text{PR}=\frac{P(E|\text{case})}{P(E|\text{control})}$, but why is that less preferable than an OR? Simply because of symmetry? Why does symmetry trump straightforward interpretability (i.e. "The probability of prior exposure among cases was PR× the probability among controls")? (Obviously RR is invalid: there is no measure of change in exposure, or of change in outcome, in this design.) – Alexis Feb 26 '20 at 16:11
  • Orthogonal parametrizations are convenient in general - the sample odds ratio remains a good estimate of the population odds ratio whatever you might later learn about the prevalence of the condition. Related is that marginal totals are approximately ancillary for the population odds ratio, providing grounds for exact conditional inference from small samples. (Unrelated is that the ratio of two parts isn't a more complex relation than the ratio of a part to the whole; so the interpretability of odds is perhaps more a matter of personal familiarity than ... – Scortchi - Reinstate Monica Mar 05 '20 at 14:58
  • ... of anything else - if you've spent a bit of time at the tracks you probably have a good enough grasp of them.) All the same, the "parameter of interest" has to be defined according to the problem at hand - point & interval estimates for the difference between two proportions might be what's needed, to multiply by cost per case, say. – Scortchi - Reinstate Monica Mar 05 '20 at 14:59
  • @Scortchi-ReinstateMonica "Orthogonal parametrizations are convenient in general - the sample odds ratio remains a good estimate of the population odds ratio whatever you might later learn about the prevalence of the condition." Are you implying that the sample PR does not remain a good estimate of the population PR here? – Alexis Mar 05 '20 at 17:30
  • @Scortchi-ReinstateMonica "marginal totals are approximately ancillary for" I am way unfamiliar with the way you are using these specific words. I appreciate your response, but am struggling with this part. (I think the literature into numeracy disagrees pretty strongly with you about the interpretability of probability verse odds, and PRs vs ORs, but I can set that aside. :) The value of PD vs PR is good (and I am pretty comfortable thinking about the relative merits of RD vs RR). – Alexis Mar 05 '20 at 17:32
  • Yes - if you parametrize the problem with an odds ratio & one to three nuisance parameters (according to the sampling scheme), the covariance of the maximum-likelihood estimate of the odds ratio & that of each nuisance parameter is nought; the ML estimate of the odds ratio is asymptotically independent of those of the others & insensitive to their true value. I think that's one of the main reasons for its popularity (another, following on from @Noah's observation, being that you can compare estimates of the same parameter from different studies using different sampling schemes), ... – Scortchi - Reinstate Monica Mar 06 '20 at 08:59
  • ... though it may well be trumped by countervailing considerations, as I noted. (I ought to write an answer; if no-one else does I shall, when I'm back off holiday.) – Scortchi - Reinstate Monica Mar 06 '20 at 09:01
  • Sorry, I'm long since back off holiday, but have no time owing to the current goings on. Just wanted to correct the impression I may have given that the parametrization with the odds ratio as the parameter of interest is the only orthogonal one - it's the only one in which the parameter of interest is orthogonal to prevalence parameters. – Scortchi - Reinstate Monica Apr 18 '20 at 14:04

2 Answers


I think the reason that the OR is far more common than the PR comes down to the standard ways in which different types of quantity are typically transformed.

When working with normal quantities, like temperature, height, and weight, the standard assumption is that they are approximately Normal. When you take contrasts between these sorts of quantities, a good thing to do is take the difference. Equally, if you fit a regression model to them, you don't expect a systematic change in the variance.

When you are working with quantities that are "rate-like", that is, bounded below at zero and typically arising from calculating things like "number per day", taking raw differences is awkward. Since the variance of any sample is proportional to the rate, the residuals of any fit to count or rate data won't generally have constant variance. However, if we work with the log of the mean, the variance is approximately stabilized, and effects add rather than multiply. Thus we typically handle rates on the log scale. Then when you form contrasts you are taking differences of logs, which is the same as taking a ratio.

When you are working with probability-like quantities, or fractions of a cake, you are now bounded both above and below. You also have an arbitrary choice of what to code as 1 and what as 0 (or more categories in multi-class models). Differences between probabilities are invariant to switching 1s and 0s, but they share with rates the problem that the variance changes with the mean. Logging them wouldn't give you invariance under swapping 1s and 0s, so instead we tend to logit them (take log-odds). Working with log-odds you are back on the full real line, the variance is roughly the same all along the line, and differences of log-odds behave a bit like contrasts of normal quantities.
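
A quick numerical sketch of that coding-symmetry point (the probabilities are arbitrary, chosen only for illustration): on the log-odds scale, recoding the event from 1 to 0 only flips the sign, so the OR becomes its reciprocal, while the PR has no such symmetry.

```python
import numpy as np

# Arbitrary illustrative probabilities of the event in groups A and B
pA, pB = 0.7, 0.4

def logit(p):
    return np.log(p / (1 - p))

log_or = logit(pA) - logit(pB)                   # log odds ratio, A vs B
log_or_recoded = logit(1 - pA) - logit(1 - pB)   # after recoding 1 <-> 0
print(np.isclose(log_or, -log_or_recoded))       # True: the OR simply inverts

pr = pA / pB                                     # 1.75
pr_recoded = (1 - pA) / (1 - pB)                 # 0.5, which is not 1/1.75
print(np.isclose(pr_recoded, 1 / pr))            # False: the PR is not invariant
```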

Gaussian

  • Variance does not depend on $\mu$
  • Canonical link for GLM is $x$
  • Transformation not helpful

Poisson

  • Variance is proportional to the rate $\lambda$
  • Canonical link for GLM is $\ln(x)$
  • Logging should result in residuals of constant variance

Binomial

  • Variance is proportional to $p(1-p)$
  • Canonical link for GLM is logit $\ln\left(\frac{p}{1-p}\right)$
  • Taking logit (log-odds) of data should result in residuals of constant variance

So I think that the reason you see lots of RR, but very little PR is that PR is constructed from probability/Binomial type quantities, while RR is constructed from rate type quantities. In particular note that incidence can exceed 100% if people can catch the disease multiple times per year, but probability can never exceed 100%.
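
As a sketch of how these families and links play out in a GLM, here is an example on hypothetical two-group data, assuming statsmodels is available (note the spelling of the link class, `Log` vs `log`, differs across statsmodels versions, and log-link Binomial fits can fail to converge on messier data):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical 2x2 data: group A has 40/100 events, group B has 20/100 events
x = np.repeat([1, 0], 100)                       # 1 = group A, 0 = group B
y = np.concatenate([np.ones(40), np.zeros(60),   # outcomes in group A
                    np.ones(20), np.zeros(80)])  # outcomes in group B
X = sm.add_constant(x)

# Logit link (canonical for the Binomial family): exp(slope) is the odds ratio
logit_fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print(np.exp(logit_fit.params[1]))    # ~ (0.4/0.6)/(0.2/0.8) = 2.67

# Log link on the same Binomial data: exp(slope) is the probability ratio
log_link = sm.families.links.Log()    # spelling may differ by statsmodels version
logbin_fit = sm.GLM(y, X, family=sm.families.Binomial(link=log_link)).fit()
print(np.exp(logbin_fit.params[1]))   # ~ 0.4/0.2 = 2.0

# Poisson family with its canonical log link: exp(slope) is a rate ratio
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(np.exp(poisson_fit.params[1]))  # ~ 2.0, treating the 0/1 outcomes as counts
```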

Is odds the only way?

No, the general messages above are just useful rules of thumb, and these "canonical" forms are just mathematically convenient, which is why you tend to see them most. The probit function is used instead for probit regression, so in principle differences of probits would be just as valid as ORs. Similarly, despite best efforts to word it carefully, the text above still sort of suggests that logging or logit-transforming your raw data and then fitting a model to it is a good idea; it's not a terrible idea, but there are better things that you can do (GLMs etc.).

Corvus
  • @Alexis I don't understand - my point is that PR and OR both relate to probabilities, not risks, so in a cross-sectional study you could calculate the OR? In a cross-sectional study you will sample some people, so you are going to end up working with odds ratios naturally? – Corvus Mar 06 '20 at 17:56
  • @Alexis, perhaps what is confusing me here is why you brought up RR if you don't want any comment on RR? As I (mis?)understood your question, your point was: why don't people make something that looks like an RR, but using probabilities? But my answer is simply that probabilities and rates are not the same thing; just because they look similar doesn't mean they are, so the PR isn't really getting you closer to the RR in any formal way. Probabilities would normally be viewed through a Binomial distribution, but rates come from a Poisson distribution. – Corvus Mar 06 '20 at 17:58
  • @Alexis because log odds is the variance stabilization for probabilities? I suppose I'm struggling to see why you single out probability ratio? – Corvus Mar 06 '20 at 18:55
  • @Alexis I'm only mentioning RR the same way you do in the question - you seem to use it to justify PR being a thing, but I'm saying the RR isn't the same because it involves risks, not probabilities, so that motivation is aesthetic, not statistical. We use ratios of risks because they are Poisson, but ratios of odds for things that are Binomial. – Corvus Mar 06 '20 at 19:25
  • @Alexis OK, I think I understand a bit more what you want; I'll redraft later – Corvus Mar 06 '20 at 19:48
  • @Alexis, is the redraft no better? – Corvus Mar 18 '20 at 15:50

Underlying models for probabilities

Odds relate well to logistic models

$$p = \frac{1}{1 + e^{-(a+bx)}}$$

Probability relates well to exponential models

$$p = e^{a+bx}$$
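
Both functional forms can be checked directly; here is a small sketch with arbitrary values of $a$ and $b$, chosen only for illustration:

```python
import numpy as np

# Arbitrary parameters, purely for illustration
a, b = -2.0, 0.5
x = np.arange(4.0)   # x = 0, 1, 2, 3

p_exp = np.exp(a + b * x)                  # exponential model for p
p_logis = 1 / (1 + np.exp(-(a + b * x)))   # logistic model for p

# log p is linear in x under the exponential model ...
print(np.diff(np.log(p_exp)))              # constant steps of b = 0.5

# ... while log-odds is linear in x under the logistic model
print(np.diff(np.log(p_logis / (1 - p_logis))))   # constant steps of b = 0.5
```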

Comparison

Let's see how the curves of these models compare to each other in the images below.

  • For small values of $p$ the difference between odds and probability is not so large. It is at larger values of $p$ that the $(1-p)$ term in the denominator of the odds expression becomes important.

  • The log probabilities are linear for the exponential model.

  • The log odds are linear for the logistic model.

    • The linearity of the log-odds means that the odds ratio for a given change in $x$ is constant. If the probability follows a logistic model, then $\frac{odds(x)}{odds(x+\Delta)}$ is independent of $x$ and depends only on the size of the change $\Delta$.

      Thus with logistic models, changing $x$ by some step $\Delta$ produces the same change in log-odds, and hence the same odds ratio, independent of $x$ (a numerical sketch follows the plots below).

[plots comparing the logistic and exponential models on the probability, log-probability, and log-odds scales]
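
A quick numerical check of that constant-odds-ratio property; the values of $a$, $b$, and $\Delta$ below are arbitrary, and nothing depends on these particular numbers:

```python
import numpy as np

# Arbitrary logistic-model parameters and step size, purely for illustration
a, b, delta = -1.0, 0.8, 0.5

def p_logistic(x):
    return 1 / (1 + np.exp(-(a + b * x)))

def odds(p):
    return p / (1 - p)

for x in [-2.0, 0.0, 3.0]:
    ratio = odds(p_logistic(x + delta)) / odds(p_logistic(x))
    print(round(ratio, 6))   # always exp(b * delta) ~ 1.491825, whatever x is
```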

Why odds

Logistic models (or similarly shaped curves) are more typical in practice. This makes comparisons with differences in log-odds (or, equivalently, ratios of odds) an intuitive way to express changes.

But for small probabilities, odds ratios and probability ratios are very similar.

$$ \frac{odds(x)}{odds(y)} = \frac{p_x/(1-p_x)}{p_y/(1-p_y)} = \frac{p_x}{p_y} \frac{1-p_y}{1-p_x} \approx \frac{p_x}{p_y} $$
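
A numerical illustration of that approximation, with arbitrarily chosen probability pairs: for small probabilities the OR and PR nearly coincide, while for larger ones the $(1-p)$ terms drive them apart.

```python
# Arbitrary probability pairs: one small, one large, both with PR = 2
for p_x, p_y in [(0.02, 0.01), (0.6, 0.3)]:
    pr = p_x / p_y
    odds_ratio = (p_x / (1 - p_x)) / (p_y / (1 - p_y))
    print(p_x, p_y, round(pr, 3), round(odds_ratio, 3))
# (0.02, 0.01): PR = 2.0, OR ~ 2.02  -> nearly equal
# (0.60, 0.30): PR = 2.0, OR = 3.5   -> the (1 - p) terms matter
```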