cheat sheet on terminology of medical diagnostic testing


              \  the true situation
               \
                \    +       -
                 +-------+-------+---
                 |       |       |
               + |   a   |   b   |  a+b
what the         |       |       |
diagnostic       +-------+-------+---
test returns     |       |       |
               - |   c   |   d   |  c+d
                 |       |       |
                 +-------+-------+---
                 |       |       |
                 |  a+c  |  b+d  |   t

true positives
a = positive testers who have disease
true negatives
d = negative testers who are without disease
false positives
b = positive testers who are without disease
false negatives
c = negative testers who have disease
prevalence
(a+c)/t = fraction of population that has disease
sensitivity
a/(a+c) = what fraction of those with disease test positive
specificity
d/(b+d) = what fraction of those without disease test negative
predictive value positive
a/(a+b) = what fraction of positive testers have disease
predictive value negative
d/(c+d) = what fraction of negative testers are without disease
accuracy
(a+d)/t = what fraction of tests are correct; depends on prevalence
diagnostic odds ratio
(a/b)/(c/d) = ad/bc = usefulness of test (1 means the test does not discriminate; larger is better); independent of prevalence
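
The definitions above can be sketched as one small function. This is a minimal illustration, not a reference implementation: the name diagnostic_metrics and the example counts are made up here, and it assumes no denominator is zero.

```python
def diagnostic_metrics(a, b, c, d):
    """Cheat-sheet quantities from the four cells of the 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    Assumes every denominator below is nonzero.
    """
    t = a + b + c + d
    return {
        "prevalence": (a + c) / t,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "predictive value positive": a / (a + b),
        "predictive value negative": d / (c + d),
        "accuracy": (a + d) / t,
        "diagnostic odds ratio": (a * d) / (b * c),
    }


# Hypothetical counts: 1000 people tested, 100 of whom have disease.
m = diagnostic_metrics(a=90, b=50, c=10, d=850)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts sensitivity is 0.9 and specificity is about 0.94, yet predictive value positive is only about 0.64, because the 900 people without disease still produce 50 false positives.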
Notes:

Information retrieval people know sensitivity as "recall" and predictive value positive as "precision."

Screening first with a cheap test with high sensitivity, then confirming the positives with an expensive test with high specificity, is often the best (most cost-effective) strategy.
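
A rough sketch of why the two-stage strategy works: the positives from the first test become a population with much higher prevalence for the second test. All numbers below are hypothetical, and the sketch assumes the two tests err independently of each other given true disease status, which real tests may not.

```python
def predictive_value_positive(prevalence, sensitivity, specificity):
    """a/(a+b) expressed in rates rather than counts."""
    true_pos = sensitivity * prevalence                  # a / t
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # b / t
    return true_pos / (true_pos + false_pos)


# Hypothetical figures, chosen only for illustration.
prevalence = 0.01
cheap = {"sensitivity": 0.99, "specificity": 0.90}
expensive = {"sensitivity": 0.95, "specificity": 0.999}

# Fraction of the population that tests positive on the cheap test
# and therefore goes on to the expensive test.
retested = (cheap["sensitivity"] * prevalence
            + (1.0 - cheap["specificity"]) * (1.0 - prevalence))

# Prevalence among cheap-test positives is the cheap test's
# predictive value positive.
after_cheap = predictive_value_positive(prevalence, **cheap)

# Assumes the expensive test's errors are independent of the cheap test's.
after_both = predictive_value_positive(after_cheap, **expensive)

print(f"retested: {retested:.4f}")
print(f"PVP after cheap test: {after_cheap:.4f}")
print(f"PVP after both tests: {after_both:.4f}")
```

Under these assumptions only about 11% of the population needs the expensive test, and predictive value positive rises from about 0.09 after the first test to about 0.99 after the second.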

And, finally: where prevalence (above) is the fraction of the population that has disease, we also define incidence as the rate per unit time at which new cases appear and duration as the average length of the disease period in a given case. In a steady state the three measures are mutually redundant, since any two determine the third: Duration = Prevalence / Incidence
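
The relation among the three measures can be checked with a few lines of arithmetic. The figures are hypothetical, and the sketch assumes a steady state (new cases appear as fast as old ones resolve) with incidence and duration in the same time unit.

```python
def prevalence_from(incidence, duration):
    """Prevalence = Incidence * Duration (steady-state assumption)."""
    return incidence * duration


def duration_from(prevalence, incidence):
    """Duration = Prevalence / Incidence, as in the note above."""
    return prevalence / incidence


# Hypothetical: 2 new cases per 1000 people per year, each lasting 5 years.
incidence = 0.002   # new cases per person per year
duration = 5.0      # years

prevalence = prevalence_from(incidence, duration)
recovered_duration = duration_from(prevalence, incidence)

print(f"prevalence: {prevalence:.4f}")
print(f"duration recovered from the other two: {recovered_duration:.2f} years")
```

With these figures about 1% of the population has the disease at any one time.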



dan@geer.org