I work in biostatistics and often have the following conversation with medics: we are discussing some very interesting but very rare disease or disorder (or side effect of a drug), until somebody says "the prevalence of this disease/side effect is only 5 per 1 million people".

I always ask how they can know the prevalence of such a rare disease: surely the confidence interval for a maximum likelihood estimator must be huge? And a Bayesian approach is very likely to be biased, since clinicians will probably overestimate the relevance of their extremely rare disease. Unfortunately, nobody has been able to give me an answer so far, and this is really grinding my gears. Of course, for very dangerous and contagious diseases there are national databases, and hospitals are obliged to report every case, so I am asking about genetic disorders in particular.

On the other hand, you also quite often see a disease or disorder said to have an enormously high prevalence, which seems quite unlikely to me. For example, spina bifida is said to have a prevalence of 5% in newborns (Sandler 2010, doi:10.1016/j.pcl.2010.07.009). Since neither I nor anyone else I asked, apart from medical doctors, had ever heard of this disease, I struggled (and still struggle) to believe that.

So, my questions are:

  1. How do epidemiologists estimate the prevalence of a disease/disorder with a true prevalence of say 1 in 1,000,000 if there is no database containing all cases? (or are there countries with databases for every relevant disease/disorder?)
  2. What are common sizes of the corresponding confidence intervals? Are the estimators reliable at all?
  3. How can the prevalence of spina bifida be 5% when so many people have never heard of it, and how is this number determined?

Thanks a lot!

LuckyPal
  • Spina bifida is not something that shows up on people's faces. It occurs in the lower back, so it is usually only seen when a patient takes his (her) clothes off. Rare diseases are usually referred to a central area which has a known referral population. – Carl Dec 22 '17 at 22:10

1 Answer

  1. This is the same problem faced by all statistical analyses when you do not have access to the entire population: take a random sample from the population, and base your estimates on the random sample. A major issue, of course, is how truly random and representative your sample is.

  2. A prevalence is a binomial proportion, and there are several ways to estimate a confidence interval for one. In absolute terms, the intervals are tighter for very high or very low prevalences, and their width shrinks roughly in inverse proportion to the square root of the sample size. Reliability, beyond the limits imposed by binomial sampling, depends on how truly random and representative your sample is.

  3. The 5% incidence cited for spina bifida refers to a condition in which even those who have it are often unaware of it: spina bifida occulta, in which parts of the vertebrae did not completely develop but the defect is so mild that there are no obvious signs or damage. The severe forms, in which a sac with spinal fluid or even parts of the spinal cord extend outside of the vertebrae, are much less common. For the past 20 years, the US has required addition of folate to enriched grain products, which decreases the incidence of this and related types of developmental defects. See the Wikipedia page linked above, or the US CDC page for further information.
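
To make point 2 concrete, here is a minimal sketch of one common binomial proportion interval, the Wilson score interval, using only the Python standard library. The function name `wilson_interval` and the sample figures (5 cases among 1,000,000 people) are illustrative assumptions, not data from any study; other interval methods would give somewhat different bounds.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(cases, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (cases out of n)."""
    if n <= 0 or not 0 <= cases <= n:
        raise ValueError("need n > 0 and 0 <= cases <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p_hat = cases / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical sample: 5 cases among 1,000,000 people.
low, high = wilson_interval(5, 1_000_000)
print(f"per million: {low * 1e6:.1f} to {high * 1e6:.1f}")

# Same observed rate with four times the sample size:
# the interval comes out roughly half as wide.
low4, high4 = wilson_interval(20, 4_000_000)
print(f"width ratio: {(high4 - low4) / (high - low):.2f}")
```

With these made-up numbers the interval runs from roughly 2 to 12 per million: tiny in absolute terms, but wide relative to the point estimate of 5 per million. All of this still assumes a truly random, representative sample.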

EdM
  • Thank you very much for your answer! I will read into the binomial proportion CI.

    1. + 2. As you say, a central issue is how representative the sample is. However, nobody samples several million people to find the prevalence of some extremely rare disease (right?). How are these numbers produced then? By using case reports only?

    3. Thanks, that explains a lot! Sampling is obviously possible here. I don't see the point of a prevalence figure that includes non-pathological cases, but that's probably on me.
    – LuckyPal Dec 22 '17 at 22:47
  • @LuckyPal in this case the spina bifida occulta cases have an anatomical pathology but just don't have any serious consequences. For many diseases in the US, the CDC mandates reporting of all cases so that there is essentially complete coverage of the population. For other diseases, specialist medical societies or public interest groups can serve a similar function, particularly for very rare and devastating conditions. – EdM Dec 22 '17 at 22:56
  • I don't doubt that the incidence of infectious diseases is very well known, but I was thinking of other diseases. The meta-analysis by Pringsheim et al. (2012), DOI: 10.1002/mds.25075, on the incidence and prevalence of Huntington's disease was very insightful to me. My conclusion is that doctors should be more careful when trying to describe prevalence or incidence with specific numbers... – LuckyPal Dec 22 '17 at 23:20