
Bayes' Theorem

How a positive test result can still be mostly wrong — and how to update beliefs correctly when evidence arrives.

Before this: Base Rate Neglect

The question that breaks most people's intuition

A disease affects 1% of the population. A test for it correctly identifies 95% of people who have the disease (its sensitivity), and incorrectly flags 10% of people who don't (its false positive rate).

You test positive. What is the probability you actually have the disease?

Most people guess somewhere around 90%. The correct answer is about 8.8%.

Bayes visualizer

[Interactive visualizer: 100 people, each classified as one of four outcomes (tests positive and has the condition; has the condition but is not detected; false positive; healthy and tests negative). Prior: 1.0% (adjustable from 0.1% to 50.0%). Posterior after a positive test: 8.8%.]

Out of 100,000 people tested:
  • 950 test positive and have the condition
  • 9,900 test positive but do not
Of 10,850 positive results, 8.8% actually have the condition.

The key insight is the same in every scenario: a positive result sounds alarming, but how much it actually means depends on how rare the condition is to begin with.

Why the answer is so low

Imagine 1,000 people tested:

  • 10 have the disease (1% prevalence)
  • Of those 10: about 9.5 test positive (true positives)
  • 990 don't have the disease
  • Of those 990: about 99 also test positive (false positives — 10% of 990)

Total positive results: about 108.5 (9.5 true positives plus 99 false positives). Of those, only 9.5 actually have the disease.

That's 9.5 out of 108.5, or about 8.8%.
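The same head count can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lesson: the function name `count_outcomes` and its parameter names are made up for this example, and it uses 100,000 people instead of 1,000 so that every group comes out as a whole number.

```python
def count_outcomes(population, prevalence, sensitivity, false_positive_rate):
    """Split a tested population into the four possible outcomes (whole people).

    Hypothetical helper for illustration; rounding keeps the counts integral.
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_positives = round(sick * sensitivity)
    false_positives = round(healthy * false_positive_rate)
    return {
        "true_positives": true_positives,          # sick, test positive
        "false_negatives": sick - true_positives,  # sick, test negative
        "false_positives": false_positives,        # healthy, test positive
        "true_negatives": healthy - false_positives,
    }


# Numbers from the example: 1% prevalence, 95% sensitivity, 10% false positive rate.
counts = count_outcomes(100_000, 0.01, 0.95, 0.10)
all_positives = counts["true_positives"] + counts["false_positives"]
share_with_disease = counts["true_positives"] / all_positives

print(counts)
print(f"{share_with_disease:.1%} of positive results belong to people with the disease")
```

Counting people rather than multiplying probabilities makes it easy to see where the positives come from: the healthy group is so much larger that its 10% error rate outnumbers the true positives roughly ten to one.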

The formula

Bayes' theorem gives us the exact calculation. We want P(disease | positive test) — the probability of having the disease given that you tested positive:

Bayes' theorem

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$

In our disease example:

  • $P(A)$ = prior probability of disease = 0.01
  • $P(B \mid A)$ = probability of positive test given disease = 0.95 (sensitivity)
  • $P(B)$ = total probability of a positive test = $0.95 \times 0.01 + 0.10 \times 0.99 = 0.1085$

$$P(\text{disease} \mid \text{positive}) = \frac{0.95 \times 0.01}{0.1085} \approx 0.088$$
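For readers who prefer code to notation, the calculation can be written as a small function. This is a sketch under the assumptions of the disease example; the name `posterior` and its parameter names are illustrative choices, not something the lesson defines.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem.

    Illustrative helper: `prior` is P(A), `sensitivity` is P(B | A), and
    `false_positive_rate` is P(B | not A).
    """
    # P(B): a positive test comes either from a sick person (true positive)
    # or from a healthy person (false positive).
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.10)
print(f"P(disease | positive) = {p:.3f}")
```

The denominator is the only part that takes any thought: it adds up both ways a positive result can occur, which is exactly the 0.1085 computed above.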

Prior, likelihood, and posterior

The three moving parts of Bayes' theorem:

  • Prior — your belief before seeing any evidence. In this case, the 1% base rate.
  • Likelihood — how probable the evidence is if the hypothesis is true. The 95% sensitivity.
  • Posterior — your updated belief after the evidence. The ~8.8% we calculated.

Bayesian reasoning is a machine for updating beliefs. You start with a prior, observe evidence, and compute a posterior. That posterior becomes the prior for the next piece of evidence.
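The "posterior becomes the next prior" loop can be sketched as follows. This example goes one step beyond the text by imagining a second positive test, and it assumes (loudly) that the second test has the same 95% and 10% rates and that its errors are independent of the first test's errors given the person's true status. Real repeat tests often violate that assumption. The helper name `update` is made up for this illustration.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """One Bayesian update: return the belief after seeing the evidence."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))


belief = 0.01          # prior: the 1% base rate
history = [belief]

# ASSUMPTION: two positive results from tests whose errors are independent
# given disease status, each with 95% sensitivity and a 10% false positive rate.
for _ in range(2):
    belief = update(belief, p_evidence_if_true=0.95, p_evidence_if_false=0.10)
    history.append(belief)  # this posterior is the prior for the next test

for step, value in enumerate(history):
    print(f"after {step} positive test(s): {value:.1%}")
```

Under those assumptions the belief moves from 1% to about 8.8% after one positive result and to roughly 48% after two, which is one way to see why a confirmatory test is so much more informative than the first screen.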

Real-world consequences

  • Medical screening: population-wide screening for rare diseases produces many false positives. This is why mass screening for low-prevalence conditions is often a bad idea without confirmatory testing.
  • Spam filters: spam is common (high prior), so even a modest classifier performs well: most of the messages it flags really are spam.
  • Fraud detection: fraud is rare (low prior), so flags trigger investigations rather than automatic blocks, because most of the flags raised by even a good detector can be false alarms.
  • Legal reasoning: "the DNA test is 99.9% accurate" is not the same as "there's a 99.9% chance the defendant is guilty." The prior matters.
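To see how much the prior drives these outcomes, the sketch below keeps the test fixed and varies only the base rate. The 95% and 10% figures are reused from the disease example purely for illustration; they are not real measurements for spam filters, fraud detectors, or DNA tests, and the function name `posterior` is a hypothetical helper.

```python
def posterior(prior, sensitivity=0.95, false_positive_rate=0.10):
    """P(condition | positive flag) for a given base rate (illustrative defaults)."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# From a rare condition (0.1%) up to a common one (50%); same test throughout.
priors = [0.001, 0.01, 0.10, 0.50]
results = {prior: posterior(prior) for prior in priors}

for prior, post in results.items():
    print(f"prior {prior:6.1%} -> posterior after a positive flag {post:6.1%}")
```

With these illustrative rates, a flag on a 0.1% base rate is right less than 1% of the time, while the same flag on a 50% base rate is right about 90% of the time. The detector did not change; only the prior did.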
