I am trying to evaluate how well a disease test performs in a case-control study. In this example, the prevalence is 0.5%, and the results are below:

|        | Disease + | Disease - |
|--------|-----------|-----------|
| Test + | 40 (TP)   | 10 (FP)   |
| Test - | 600 (FN)  | 5000 (TN) |

As the sample is enriched for those with the disease, the standard PPV calculation $PPV = \frac{TP}{TP+FP}$ does not estimate the PPV in the population. I can calculate the PPV adjusted for prevalence $p$ using $PPV = \frac{p\cdot Sens}{p\cdot Sens + (1-p)\cdot(1-Spec)}$, as in this page, which here gives a PPV of 13.6%.
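To make the adjustment concrete, here is a minimal Python sketch of that formula applied to the table above. The function name `adjusted_ppv` is my own and purely illustrative.

```python
def adjusted_ppv(tp, fn, fp, tn, prevalence):
    """Prevalence-adjusted PPV from a case-control 2x2 table."""
    sens = tp / (tp + fn)   # sensitivity, estimated from the cases
    spec = tn / (tn + fp)   # specificity, estimated from the controls
    true_pos_rate = prevalence * sens
    false_pos_rate = (1 - prevalence) * (1 - spec)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


naive_ppv = 40 / (40 + 10)                       # ignores the enriched sampling
ppv = adjusted_ppv(40, 600, 10, 5000, 0.005)
print(f"naive PPV = {naive_ppv:.3f}, adjusted PPV = {ppv:.3f}")
# naive PPV = 0.800, adjusted PPV = 0.136
```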

But given the low prevalence, I am concerned that a small change in the number of false positives could make a large difference to the PPV, so I want to calculate the 95% confidence interval on this PPV.

This question gives the standard error as $SE = \sqrt{ \frac{PPV(1-PPV)}{TP+FP}}$. This would give a standard error of 4.8%, and if I then use $CI_{PPV} = PPV \pm 1.96\cdot SE$, I get a CI of 4.1% to 23.1%.
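For reference, the arithmetic behind these numbers can be reproduced with a short Python sketch. The function name `wald_ci` is hypothetical, and the PPV of 0.136 is the adjusted value quoted above.

```python
import math


def wald_ci(ppv, n_test_positive, z=1.96):
    """Symmetric (Wald) interval for a proportion; returns (lower, upper, se)."""
    se = math.sqrt(ppv * (1 - ppv) / n_test_positive)
    return ppv - z * se, ppv + z * se, se


lower, upper, se = wald_ci(0.136, 40 + 10)   # TP + FP = 50 test positives
print(f"SE = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
# SE = 0.048, 95% CI = 0.041 to 0.231
```

Note that the denominator treats the 50 test positives as a simple random sample, which is exactly the assumption in doubt for a case-control design.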

But:

  1. Does this equation still apply when the PPV has been adjusted for prevalence?

  2. I have read that, because the uncertainty of a proportion is not symmetric, using the above equation to calculate the CI is not very useful (e.g. in this question).

So, is there a better way to calculate the confidence interval on the PPV in this case?

1 Answer

After looking into this further, I found that there are several possible ways of calculating confidence intervals for the PPV. I decided to use the standard logit confidence interval explained in, and recommended by, this paper: Confidence intervals for predictive values with an emphasis to case–control studies.
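As I understand it, the standard logit interval works on the log-odds scale: $\text{logit}(PPV) = \log\frac{p\cdot Sens}{(1-p)\cdot(1-Spec)}$, with a delta-method variance of $\frac{1-Sens}{Sens}\cdot\frac{1}{n_1} + \frac{Spec}{1-Spec}\cdot\frac{1}{n_0}$, where $n_1$ and $n_0$ are the numbers of cases and controls and the prevalence $p$ is treated as known. The Python sketch below assumes that form. The function name `logit_ppv_ci` is my own, and the formula should be checked against the paper before being relied on.

```python
import math


def logit_ppv_ci(tp, fn, fp, tn, prevalence, z=1.96):
    """Logit-scale CI for the prevalence-adjusted PPV (delta-method sketch).

    Assumes the prevalence is known without error and that no cell
    needed below is zero (tp, fp and fn must all be positive).
    Returns (lower, point_estimate, upper).
    """
    n1 = tp + fn            # number of cases (diseased)
    n0 = fp + tn            # number of controls (non-diseased)
    sens = tp / n1
    spec = tn / n0

    # Point estimate on the log-odds scale
    logit_ppv = math.log(prevalence * sens / ((1 - prevalence) * (1 - spec)))

    # Delta-method variance of logit(PPV): one term per binomial sample
    var = (1 - sens) / sens / n1 + spec / (1 - spec) / n0
    half_width = z * math.sqrt(var)

    def expit(x):
        return 1 / (1 + math.exp(-x))

    # Back-transform to the probability scale; the interval is asymmetric
    return expit(logit_ppv - half_width), expit(logit_ppv), expit(logit_ppv + half_width)


lower, ppv, upper = logit_ppv_ci(40, 600, 10, 5000, 0.005)
print(f"PPV = {ppv:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

For the table in the question this gives an interval of roughly 7% to 24% around the 13.6% point estimate. Unlike the symmetric interval, it is wider above the estimate than below it and always stays within 0 to 1.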