Histograms and Distribution Shapes
How to read the shape of a dataset — and why identical summary statistics can hide wildly different data.
In 1973, statistician Francis Anscombe constructed four small datasets of eleven (x, y) points each. All four had nearly identical means, standard deviations, and correlations. Judged by those summary statistics alone, they were the same dataset.
They looked nothing alike.
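One was a noisy but clearly linear relationship. One was a smooth curve. One was a tight line with a single extreme outlier. The fourth was a vertical stack of points at a single x value, plus one distant point that produced the entire correlation by itself. Summary statistics had concealed everything interesting about the data.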
Anscombe's Quartet became a foundational argument for visualizing your data before summarizing it. For a single variable, the histogram is the first and most important tool for doing that.
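The claim is easy to check numerically. The sketch below uses the quartet's published values and only Python's standard library; the helper name `pearson_r` and the dictionary layout are illustrative choices, not anything from Anscombe's paper.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Datasets I-III share the same x values; dataset IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# (mean of y, sample standard deviation of y, correlation of x and y)
summary = {
    name: (round(mean(ys), 2), round(stdev(ys), 2), round(pearson_r(xs, ys), 2))
    for name, (xs, ys) in quartet.items()
}

for name, stats in summary.items():
    print(name, stats)
```

All four rows should come out essentially the same: a mean of y near 7.5, a standard deviation near 2.03, and a correlation near 0.82. Nothing in those numbers hints at the four different pictures.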
What a histogram actually shows
A histogram plots the distribution of a single variable — how often different values appear across the full range of the data.
You divide the range into equal-width bins, count how many values fall into each bin, and draw a bar for each count. The result is a picture of the data's shape.
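The binning step can be sketched in a few lines. This is a minimal illustration rather than a reference implementation; the function name `histogram_counts` and the sample data are made up for the example, and it assumes the data contains at least two distinct values.

```python
def histogram_counts(values, num_bins):
    """Count values in `num_bins` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("need at least two distinct values to form bins")
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return counts, edges

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts, edges = histogram_counts(data, 4)

# Print one text bar per bin, labelled with the bin's left edge.
for count, left in zip(counts, edges):
    print(f"{left:5.1f} | {'#' * count}")
```

With these ten values and four bins, each bin is 2 units wide and the counts are 3, 5, 1, and 1. Libraries such as NumPy provide the same operation (`numpy.histogram` returns the counts and the bin edges), so hand-rolling it is mainly useful for seeing what the picture is made of.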
The shape carries information that no single number captures:
- Where is the bulk of the data?
- Is it symmetric or pulled in one direction?
- Is there one cluster or several?
- Are there unusual values far from the rest?
The four shapes you'll encounter constantly
Symmetric (bell-shaped): Values cluster around a central point and fall off evenly on both sides. Mean ≈ median. The normal distribution is the textbook example, though not every symmetric histogram is normal. Test scores, measurement errors, and many biological measurements often look like this.
Right-skewed: A long tail stretches toward high values. The bulk of data is low, but a few extreme values pull the mean rightward. Income, house prices, and wait times are almost always right-skewed. The mean overstates what's typical.
Left-skewed: A long tail toward low values. Less common, but it appears in things like age at retirement or student completion rates in courses with a hard deadline. The mean understates what's typical.
Bimodal (two peaks): Two distinct clusters. This is often a sign that your data is actually two populations mixed together. Heights in a sample that includes both men and women can look bimodal, though the two peaks blur into one when the group averages are close relative to the spread within each group. If you see two peaks, your first question should be: what two groups am I looking at?
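The link between skew and the gap between mean and median can be turned into a rough check. The sketch below is a heuristic under stated assumptions: the function name `skew_direction`, the 5% tolerance, and all three sample datasets are invented for illustration. It cannot detect bimodality, and it is no substitute for looking at the histogram.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.05):
    """Rough label from the mean-median gap, measured relative to the data's range."""
    spread = max(values) - min(values)
    if spread == 0:
        return "roughly symmetric"
    gap = (mean(values) - median(values)) / spread
    if gap > tolerance:
        return "right-skewed"   # high tail pulls the mean above the median
    if gap < -tolerance:
        return "left-skewed"    # low tail pulls the mean below the median
    return "roughly symmetric"

# Hypothetical data, chosen only to illustrate each shape.
incomes_k = [20, 22, 25, 27, 30, 31, 35, 40, 60, 250]
retirement_ages = [45, 58, 61, 62, 63, 64, 65, 65, 66, 67]
balanced = [1, 2, 3, 4, 5, 6, 7]

labels = {
    "incomes": skew_direction(incomes_k),
    "retirement ages": skew_direction(retirement_ages),
    "balanced": skew_direction(balanced),
}
for name, label in labels.items():
    print(f"{name}: {label}")
```

In the income example the single value of 250 lifts the mean to 54 while the median stays at 30.5, which is exactly the sense in which the mean overstates what is typical.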
Choosing bin width
A histogram's appearance changes dramatically with bin width, and there's no universally "correct" choice.
Too few bins (too wide): everything gets lumped together. The histogram looks like a rectangle. You can't see shape at all.
Too many bins (too narrow): every bin has 0 or 1 values. The histogram looks like random noise. You're seeing sampling variation, not the underlying distribution.
The right bin width lets the genuine shape emerge without creating false fine-grained structure. As a starting point, try around $\sqrt{n}$ bins for $n$ data points (a common rule of thumb), then adjust by eye. The goal is to see the signal, not the noise.
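To see the effect of bin count on the same data, the sketch below bins one simulated sample three ways. The sample (200 draws from a bell-shaped distribution), the helper `bin_counts`, and the three bin counts are all assumptions made for the illustration; the middle choice uses the square-root rule of thumb.

```python
import math
import random

def bin_counts(values, num_bins):
    """Counts for `num_bins` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        counts[min(int((v - lo) / width), num_bins - 1)] += 1
    return counts

random.seed(0)  # fixed seed so the run is repeatable
sample = [random.gauss(50, 10) for _ in range(200)]
n = len(sample)

choices = {
    "too few": 2,
    "square-root rule": math.ceil(math.sqrt(n)),
    "too many": n,
}
results = {label: bin_counts(sample, k) for label, k in choices.items()}

for label, counts in results.items():
    print(f"{label:>16} ({len(counts):3d} bins): "
          f"tallest bar = {max(counts)}, empty bins = {counts.count(0)}")
```

With two bins the tallest bar holds at least half the sample and no shape is visible. With as many bins as data points, most bars are empty or hold a single value. The middle setting is only a starting point, to be adjusted by eye.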
[Figure: a symmetric distribution, where mean and median agree and either is a fair summary.]
What to look for in practice
Before computing a single statistic, look at the histogram and ask:
- Is it symmetric? If yes, the mean is a fair summary. If no, prefer the median.
- Are there outliers? Isolated bars far from the main body can distort every downstream calculation.
- Is it unimodal? One peak is consistent with a single population. Multiple peaks mean you should investigate what's creating the clusters.
- What's the range? Does it make sense given what the data is supposed to measure? Impossible values (negative ages, test scores above 100) show up immediately in a histogram.
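The range check in the last item is also easy to automate alongside the plot. A minimal sketch, assuming a known valid interval for the variable; the function name `out_of_range` and the scores are hypothetical.

```python
def out_of_range(values, low, high):
    """Return the values that fall outside the plausible interval [low, high]."""
    return [v for v in values if v < low or v > high]

# Hypothetical test scores on a 0-100 scale, with two data-entry errors.
scores = [72, 85, 91, 64, 78, 105, 88, -3, 95, 70]
bad = out_of_range(scores, 0, 100)
print(bad)
```

Here the check flags 105 and -3. In a histogram the same two values would appear as isolated bars beyond either end of the valid scale.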
The formal structure
For a dataset with range $[x_{\min}, x_{\max}]$ divided into $k$ bins of equal width $h$:

$$h = \frac{x_{\max} - x_{\min}}{k}$$
Each bar's height represents the count (or frequency) of observations falling into that bin. For a density histogram, the bar heights are scaled so the total area equals 1 — this makes histograms with different sample sizes directly comparable.
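The density scaling amounts to dividing each count by the sample size times the bin width. The sketch below illustrates that under simple assumptions: equal-width bins, made-up counts, and a hypothetical helper named `density_heights`.

```python
def density_heights(counts, bin_width):
    """Scale counts so the bars' total area (height * width) equals 1."""
    total = sum(counts)
    return [c / (total * bin_width) for c in counts]

bin_width = 2.0
small = density_heights([3, 5, 1, 1], bin_width)          # 10 observations
large = density_heights([300, 500, 100, 100], bin_width)  # 1000 observations

area_small = sum(h * bin_width for h in small)
area_large = sum(h * bin_width for h in large)
print(small, area_small)
print(large, area_large)
```

The two samples differ in size by a factor of 100, yet their density heights match and each set of bars covers a total area of 1. Raw counts would put the two histograms on incomparable vertical scales.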
Key takeaways
- Summary statistics (mean, std dev) can be identical across very different datasets — always visualize first
- The four common shapes are: symmetric, right-skewed, left-skewed, bimodal
- Skew tells you whether to trust the mean or prefer the median
- Two peaks usually mean two populations — don't average across them
- Bin width is a choice; adjust until the genuine shape is visible