“Prevalence is to the diagnostic process as gravity is to the
solar system – it has the power of a physical law.” - Clifton K.
Meador, A Little Book of Doctors’ Rules
Diagnosis is the apotheosis of medical skill1. Without
an accurate diagnosis, the patient’s foremost questions cannot be
answered: what is wrong, what is likely to happen, and what can be done
about it? Central to the diagnostic process is the compilation of a list
of possible causes of the clinical presentation – a differential
diagnosis. Traditionally, diseases comprising the differential diagnosis
were rank ordered from highest to lowest in terms of
likelihood2,3. This method, which is inconsistently
used, is loosely probabilistic but can obscure large differences in
probability between components of the differential; an ideal method
would assign probabilities to each possibility that sum to
100%4. Practical difficulties limit the ability to
accurately estimate the probabilities of components of the
differential5-7, yet probabilistic reasoning is
nonetheless an essential part of expert forecasting and
diagnosis8-16.
Several well-known clinical axioms pay homage to the primacy of
probability in diagnosis. The most popular, “common things are common”
(CTC), has been part of medical folklore for more than a
century1,17. It is often expressed as a metaphor:
“when you hear hoofbeats, look for horses not
zebras”18-20. The CTC axiom is rooted in base rates
and Bayes’ theorem and is the foundation of other epigrams such as
“uncommon presentations of common diseases are more common than common
presentations of uncommon diseases.” The CTC axiom and its variants
arose to combat what are now known as cognitive biases such as base rate
neglect21,22 and the representativeness
heuristic23,24 but are themselves heuristics –
rudimentary rules of thumb that provide general rather than specific
guidance for the consideration of probability in
diagnosis1.
Ironically, the axiom and extant literature are silent on how to
determine what is common, what is rare, and related questions. Is
sarcoidosis common? Is commonness related to epidemiological metrics
such as incidence and prevalence, or is it an intuitive or experiential
determination, or both? Should diseases be dichotomized as either common
or rare, or rated on a frequency continuum? Can the notion of commonness
be operationalized as an aid to estimating probabilities of components
of a differential diagnosis in a practicable way?
At first blush, discriminating between common and rare diseases is a
simple task: pneumonia is common, and lymphangioleiomyomatosis (LAM) is
rare; a physician will encounter many cases of the former and few (if
any) of the latter in any given time interval. This self-evident truth
is a distillation of the physician’s experience into the coarse
dichotomy of common and rare; it obfuscates the magnitude of the
difference in the frequency of the two diseases. How much more common is
pneumonia than LAM? The dichotomy does not permit an answer to this
question. Ideally, we would like to know the relative likelihoods of
diseases so that, ceteris paribus, their weight in the differential
diagnosis could be made proportional to their observed frequencies.
Fortunately, dichotomization is unnecessary. The actual frequencies of
diseases can be used as the metric for comparison of their commonness.
First, it is necessary to determine which measure of epidemiological
disease frequency - incidence or prevalence - should be used. Incidence
is the number of new cases diagnosed per person per year, whereas
prevalence is the number of existing cases already diagnosed per
person at a given time point. (Incidence is customarily expressed as
cases/100,000 person-years, and prevalence as cases per 100,000 persons;
prevalence is, roughly, the product of incidence and the average
duration of the disease.) Incidence and prevalence are similar when a
disease has a high recovery or short-term mortality rate (e.g.,
pneumonia), since death or recovery removes cases from the numerator.
Prevalence is higher than incidence when the short-term mortality and
recovery rates are low (e.g., emphysema) because chronic cases
accumulate in the population, growing the numerator. Because incidence
relates to new or previously undiagnosed cases and
prevalence to existing or already diagnosed cases, it is
incidence that is germane to the diagnostician. Therefore, the epigraph
from A Little Book of Doctors’ Rules requires modification –
incidence, not prevalence, has the power of a physical
law25.
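
The rough relation stated above, prevalence as the product of incidence and average disease duration, can be illustrated with a minimal Python sketch. The function name and the input values are hypothetical and are not drawn from Table 1 or any cited source; the relation is assumed to hold only at steady state, with incidence and duration constant over time.

```python
def steady_state_prevalence(incidence_per_100k_py, mean_duration_years):
    """Approximate prevalence (cases per 100,000 persons) as incidence x duration.

    Assumption: steady state, i.e. constant incidence and constant mean duration.
    """
    return incidence_per_100k_py * mean_duration_years


# Hypothetical incidence, in cases per 100,000 person-years.
incidence = 60

# Hypothetical durations: a disease that lasts about a year versus a chronic one.
one_year = steady_state_prevalence(incidence, mean_duration_years=1)
chronic = steady_state_prevalence(incidence, mean_duration_years=20)

print(f"1-year duration:  ~{one_year} prevalent cases per 100,000")
print(f"20-year duration: ~{chronic} prevalent cases per 100,000")
print(f"prevalence-to-incidence ratio for the chronic case: {chronic / incidence:.0f}")
```

With the same incidence, a mean duration of one year makes the two measures numerically equal, whereas a 20-year duration lets cases accumulate until prevalence is 20 times the annual incidence.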
This distinction, until now neglected in the vast literature on
diagnosis and clinical reasoning, is paramount because the prevalence of
many diseases is higher than their incidence, sometimes markedly so. The
probability that a patient seen tomorrow will present with previously
undiagnosed symptomatic hypothyroidism is related to the incidence of
hypothyroidism, and most practitioners will go a month or longer without
diagnosing a new case of (incident) hypothyroidism. By contrast, the
probability that a patient with established (prevalent) hypothyroidism
will be seen on an average day is high. The physician does not
“diagnose” these cases of prevalent disease – they have already been
diagnosed. For already diagnosed chronic diseases prone to complications
or flare-ups, such as systemic lupus erythematosus (SLE), the prevalence
of SLE will affect the probability of seeing the complications, as the
latter are conditional upon the former26. However,
when diagnosing complications of SLE, it is already known that SLE is
present; therefore, the prevalence of SLE is immaterial. It is the
incidence of the complication that relates to the probability of the
complication, given SLE.
If, as seems likely, notions of disease commonness are based on how
frequently patients with the disease are encountered without regard to
whether they represent new or existing diagnoses, the resulting amalgam
of incident and prevalent cases will bias intuitions about what is
common, making many diseases appear to be more common than they are.
Referral bias and clinicopathological conferences may similarly skew
intuitions about disease incidence, since patients with rare diseases
are concentrated in these samples compared to unselected
patients15,27,28. Indeed, clinicopathological
conferences and grand rounds customarily select the rarest diseases for
presentation, turning the natural order of disease frequency
topsy-turvy28.
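
How an amalgam of incident and prevalent encounters could inflate the apparent commonness of a disease can be sketched in a few lines of Python. The function and every number below are hypothetical, chosen only to illustrate the arithmetic; they do not describe any real practice or disease.

```python
def share_of_encounters_that_are_new(incident_cases_per_year,
                                     prevalent_patients,
                                     visits_per_prevalent_patient_per_year):
    """Fraction of a clinician's encounters with a disease that are new diagnoses."""
    follow_up_visits = prevalent_patients * visits_per_prevalent_patient_per_year
    total_encounters = incident_cases_per_year + follow_up_visits
    if total_encounters == 0:
        return 0.0
    return incident_cases_per_year / total_encounters


# Hypothetical practice: 5 new diagnoses per year, plus 100 established
# patients with the disease who are each seen twice a year.
share = share_of_encounters_that_are_new(5, 100, 2)
print(f"{share:.1%} of encounters with this disease are new diagnoses")
```

Under these assumed numbers, only a small minority of encounters with the disease involve making the diagnosis, so an intuition built on all encounters would overstate how often the disease needs to be diagnosed.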
Fortunately, commonness need not be based upon intuitions: because of
the proliferation of epidemiological cohort data in recent decades, the
incidence of most diseases can now be readily found in published
cohort series. Similarly, online resources such as
www.uptodate.com commonly report disease
incidences under a subsection on epidemiology. Estimates from these
sources are not always in agreement, but the problems posed by
variability are not as serious as they may seem; precise incidences are
not necessary. Worthwhile comparisons can be made based on order of
magnitude differences in incidence estimates for different diseases.
For example, Table 1 shows that the incidence of pneumonia is
approximately 650 cases per 100,000 person-years. By comparison, that
of segmental pulmonary embolism is on the order of 60 per 100,000
person-years. (UpToDate was used as the default source for incidence
data for the sake of simplicity and ease of use and to limit the size of
the bibliography.) Suppose we were to make a differential diagnosis for
dyspnea in a patient presenting to the emergency department before any
individuating information about the illness was known that would allow
us to differentiate between pneumonia and pulmonary embolism. (A
scenario such as this occurs countless times each day as emergency room
physicians approach a patient’s room with nothing more than age, gender,
and chief complaint recorded by the triage nurse on the intake sheet.)
Based on incidence alone, we could say that pneumonia is an order of
magnitude more likely than pulmonary embolism; it is more than five
orders of magnitude more likely than LAM. Indeed,
diseases with incidences of less than 1/100,000 person-years are so rare
that most clinicians outside of specialized referral centers will
diagnose new (incident) cases on average no more than once or twice
during their entire career14.
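
The comparison above can be reproduced with a short Python sketch. It uses the two incidences quoted in the text; the two-disease differential, the function name, and the career parameters (2,000 persons under care for 30 years) are simplifying assumptions made here for illustration and are not taken from the source. The weights are based on incidence alone, before any individuating information is considered.

```python
import math

# Incidences quoted in the text, in cases per 100,000 person-years.
incidence = {"pneumonia": 650, "segmental pulmonary embolism": 60}

# Incidence-proportional weights, normalized to sum to 1.
# Assumption: the differential contains only the diseases listed here.
total = sum(incidence.values())
prior = {name: rate / total for name, rate in incidence.items()}

# Gap between the two diseases, expressed in orders of magnitude.
gap = math.log10(incidence["pneumonia"] / incidence["segmental pulmonary embolism"])


def expected_new_diagnoses(incidence_per_100k_py, persons_under_care, years):
    """Expected number of incident cases among the persons a clinician follows."""
    return incidence_per_100k_py / 100_000 * persons_under_care * years


# Hypothetical career for a disease with an incidence of 1/100,000 person-years.
rare = expected_new_diagnoses(1, persons_under_care=2_000, years=30)

for name, p in prior.items():
    print(f"{name}: weight {p:.2f}")
print(f"gap: {gap:.2f} orders of magnitude")
print(f"expected career diagnoses of the rare disease: {rare:.1f}")
```

Under these assumptions the expected number of career diagnoses for a disease at 1/100,000 person-years is below one, which is in keeping with the "once or twice" estimate in the text.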