Confidence Intervals

What '95% confident' actually means — and why the most common interpretation is precisely backwards.

Before this: The Central Limit Theorem, Sampling

The most misread number in statistics

A 2016 poll shows a candidate at 51%, with a margin of error of ±3 points. The news anchor says: "There's a 95% chance the candidate is between 48% and 54%."

This is wrong. And the correct interpretation is subtle enough that even researchers get it backwards.

A 95% confidence interval does not mean there is a 95% chance the true value lies inside this particular interval. Once you've collected your data and computed the interval, the true value either is or isn't inside it. There's no probability left — just a fact you don't know.

What 95% actually describes is the procedure, not this interval.

What the confidence really refers to

Here's the correct interpretation: if you repeated your sampling procedure 100 times — each time collecting a new sample and computing an interval using the same method — approximately 95 of those intervals would contain the true population value.

The 95% is a property of the method, applied over many hypothetical repetitions. It describes how often the procedure succeeds, not the probability that any single interval is correct.

Think of it like a fishing net. A 95% confidence interval is a net that, when cast correctly, catches the fish 95% of the time. Once you've cast it, it has either caught the fish or it hasn't. The 95% described the casting, not this particular catch. You just don't know which outcome you got.
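The repeated-sampling claim can be checked with a small simulation. The sketch below is illustrative only: the function name `coverage` and the values $\sigma = 10$, $n = 30$, and 10,000 trials are assumptions, the population is assumed normal, and $\sigma$ is treated as known so that a plain 1.96-style interval applies.

```python
import math
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=30, trials=10_000, level=0.95, seed=0):
    """Fraction of intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build an interval around its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

print(f"Share of intervals containing mu: {coverage():.3f}")
```

The printed share should land close to 0.95. Each individual interval in the loop either contains 50 or it doesn't; the 95% only shows up in the tally.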

[Interactive: confidence interval simulator. Drawing samples adds intervals one at a time, each marked as containing or missing the true mean μ = 50.]

Where it comes from

The Central Limit Theorem tells us that, for a sufficiently large sample, the sample mean $\bar{x}$ is approximately normally distributed around the true population mean $\mu$, with standard error $\text{SE} = \sigma / \sqrt{n}$.

For a 95% confidence interval, we use the fact that 95% of a normal distribution falls within 1.96 standard deviations of the center:

95% confidence interval for the mean

$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}}$

In practice, we don't know $\sigma$ (the population standard deviation), so we estimate it from the sample and use a $t$-distribution instead of the normal. For large samples, the difference is negligible.

The width of the interval depends on three things: the confidence level you choose (wider interval for higher confidence), the sample size (larger samples produce narrower intervals), and the spread of the data, $\sigma$ (noisier measurements produce wider intervals).
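To make the sample-size effect concrete, here is a short derivation from the interval formula above. The symbol $W$ for the full width is introduced here for illustration and does not appear elsewhere in the article; $\sigma$ is treated as known.

$W(n) = 2 \cdot 1.96 \cdot \frac{\sigma}{\sqrt{n}}$

$W(4n) = 2 \cdot 1.96 \cdot \frac{\sigma}{\sqrt{4n}} = \frac{1}{2} W(n)$

So halving the width takes four times the data, and cutting it to a tenth takes a hundred times the data. With $\sigma = 20$, for instance, the width is about 7.84 at $n = 100$ and about 3.92 at $n = 400$.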

A worked example

You measure the blood pressure of 100 patients after administering a drug. The average reduction is 8 mmHg, with a sample standard deviation of 20 mmHg.

The standard error is $20 / \sqrt{100} = 2$ mmHg.

The 95% confidence interval is: $8 \pm 1.96 \times 2 = 8 \pm 3.92$, or roughly [4.1, 11.9] mmHg.
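The same arithmetic can be reproduced in a few lines. This is a minimal sketch using the normal critical value, as the calculation above does; the helper name `mean_ci` is hypothetical. A $t$-based interval with 99 degrees of freedom would use a critical value near 1.98 and come out slightly wider.

```python
import math
from statistics import NormalDist

def mean_ci(x_bar, s, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    se = s / math.sqrt(n)                      # standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return x_bar - z * se, x_bar + z * se

low, high = mean_ci(x_bar=8, s=20, n=100)
print(f"[{low:.2f}, {high:.2f}] mmHg")  # [4.08, 11.92] mmHg
```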

What this tells you: the method that produced this interval will bracket the true mean effect about 95% of the time. This particular interval is one output of that method, and you don't know whether it's one of the lucky 95 in 100 or one of the unlucky 5.

What this does not tell you: that the probability the true effect is between 4.1 and 11.9 is 95%. The true effect has a fixed (unknown) value. Your interval either contains it or it doesn't.

Why it matters practically

The election example. When a poll shows a candidate at 48%–54% with 95% confidence, it doesn't mean "probably in this range." It means the polling method produces correct intervals 95% of the time. This specific interval might be one of the 5% that misses the true value. Interpreting it as a probability gives you false certainty.

The replication problem. Researchers sometimes think: if my 95% CI excludes zero, there's only a 5% chance the effect is zero. This isn't right either. The 5% refers to long-run error rates across many experiments, not the probability that this particular null hypothesis is true. P-values and confidence intervals quantify the behavior of procedures, not the truth of specific claims.

Wider vs. narrower. A wider CI isn't necessarily bad; it's honest about uncertainty. A very narrow CI achieved with a small sample should make you suspicious, not confident. At a fixed confidence level, the width shrinks legitimately only by collecting more data or by reducing the variability of the measurements.

Changing the confidence level

A 99% CI is wider than a 95% CI. A 90% CI is narrower. You're trading off two things: how often you're right (the confidence level) and how precise your estimate is (the interval width).

General confidence interval

$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$

Where $z^*$ is the critical value from the normal distribution: 1.645 for 90%, 1.96 for 95%, 2.576 for 99%.
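These critical values need not be memorized; they follow from the inverse CDF of the standard normal distribution. A small sketch using Python's standard library (the helper name `z_star` is an assumption made for illustration):

```python
from statistics import NormalDist

def z_star(level):
    """Two-sided critical value: leaves (1 - level) / 2 in each tail."""
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_star(level):.3f}")
# 90%: 1.645
# 95%: 1.960
# 99%: 2.576
```

The `0.5 + level / 2` term reflects that the interval is two-sided: a 95% interval leaves 2.5% in each tail, so the lookup is at the 97.5th percentile.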

The confidence level is a design choice. There is nothing magical about 95% — it's a convention. Some fields use 99% as their standard; some use 90%. What matters is understanding what the number actually promises, and what it doesn't.
