Playground

Interactive companions to selected Concepts and Field Notes — not a substitute for them. Adjust the sliders, pressure-test the mental model, then follow the link back to the full explanation.

Statistics

Distribution Shape Explorer

Read the concept →

Switch between symmetric, skewed, and bimodal data. Slide the bin count and watch shape emerge — or vanish — while mean and median lines reveal the skew.

[Interactive widget: distribution shape explorer. Bin-count slider from 5 to 50, shown at 20 bins. Legend: mean, median. Option: raw values. Readouts: mean 50.7, median 50.3.]

Symmetric distribution: mean and median agree, and either is a fair summary.

Mean vs. Median Puller

Read the concept →

Drag the dots along a number line. Pull one far to the right and watch the mean chase it while the median barely moves.

[Interactive widget: mean vs. median puller. Number line from 0 to 150. Legend: mean, median. Readouts: mean 63.9, median 61.0.]

Drag the right-side dot far out. Watch which line chases it.
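
The same pull can be reproduced in a few lines. This is a minimal sketch with made-up dot positions (they are not the widget's data): moving only the right-most value shifts the mean from 55 to 71 while the median stays at 55.

```python
from statistics import mean, median

# Hypothetical dot positions on the number line (not taken from the widget).
dots = [40, 50, 55, 60, 70]
print("before:", mean(dots), median(dots))

# "Drag" the right-most dot far out: 70 becomes 150.
dragged = dots[:-1] + [150]
print("after: ", mean(dragged), median(dragged))
```

The median depends only on the middle value of the sorted list, so it ignores how far the largest dot travels; the mean uses every value, so it follows.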

Variance Explorer

Read the concept →

Adjust the spread of a distribution and add outliers. Watch how mean, variance, and standard deviation respond in real time.

[Interactive widget: variance explorer. Readouts: mean 50.6, variance 85.6, std dev 9.25. Sliders shown at 10 (range 2 to 30) and 0 (range 0 to 4).]
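
A rough sketch of the same experiment, using a small hypothetical sample and the population formulas (the page does not say whether the widget divides by n or n - 1, so that choice is an assumption). One outlier is enough to inflate the variance more than tenfold.

```python
from math import isclose
from statistics import mean, pvariance, pstdev

values = [40, 45, 50, 55, 60]       # hypothetical sample, not the widget's data
with_outlier = values + [120]       # same sample plus a single outlier

for label, data in (("base", values), ("with outlier", with_outlier)):
    print(f"{label:13s} mean={mean(data):.1f} "
          f"variance={pvariance(data):.1f} std={pstdev(data):.2f}")

# Standard deviation is the square root of variance, so it returns to data units.
assert isclose(pstdev(values) ** 2, pvariance(values))
```

Because deviations are squared, a point far from the mean contributes disproportionately to the variance, which is why the outlier moves it much more than it moves the mean.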

Confidence Interval Coverage

Read the concept →

Draw repeated samples and see how often the 95% confidence interval actually captures the true mean — and what happens when it misses.

[Interactive widget: confidence interval simulator. Axis from 44 to 56. Legend: contains μ = 50, misses μ = 50, true mean. Draw samples to see intervals accumulate.]
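
The coverage claim can be checked with a short simulation. This sketch keeps the true mean at 50 as in the legend; the population SD (10), the sample size (25), and the use of a known-σ z-interval are assumptions made for illustration, not details of the widget.

```python
import random

def coverage(trials=2000, n=25, mu=50.0, sigma=10.0, z=1.96, seed=0):
    """Fraction of 95% z-intervals (known sigma) that contain the true mean."""
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # The interval sample_mean ± half_width contains mu exactly when
        # the sample mean lies within half_width of mu.
        if abs(sample_mean - mu) <= half_width:
            hits += 1
    return hits / trials

print(f"coverage: {coverage():.3f}")
```

The printed fraction should land near 0.95: the 95% describes the long-run behaviour of the procedure, not the chance that any single interval is right.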

Regression Fit & Residuals

Read the concept →

Control slope, noise, and sample size to see how a line fits data and where residuals come from.

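[Interactive widget: regression playground. Readouts: fitted slope 0.82, intercept 9.0, and an unlabeled value 0.404. Sliders shown at 1.0 (range -3.0 to 3.0), 30 (range 0 to 80), and 50 (range 10 to 100).]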
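
A minimal least-squares sketch of what such a playground computes. It assumes the three controls are true slope (1.0), noise SD (30), and sample size (50); the page does not label them, and the x-range of 0 to 100 is also an assumption.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(50)]        # assumed x-range
ys = [1.0 * x + rng.gauss(0, 30) for x in xs]        # assumed slope and noise SD

slope, intercept = fit_line(xs, ys)
# A residual is the vertical gap between a point and the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(round(slope, 2), round(intercept, 1), round(sum(residuals), 6))
```

With an intercept in the model, the residuals always sum to (numerically) zero; the fitted slope differs from the true slope only because of the noise, and that gap narrows as the sample grows.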

Sample Means Converge to a Bell Curve

Read the concept →

Pick a source distribution — uniform, skewed, or bimodal. Collect sample means and watch them converge to a bell curve, regardless of what the source looks like.

[Interactive widget: central limit theorem simulator. Sample-size slider from n = 1 to n = 50, shown at n = 10. Panels: source distribution, sample means. Readouts: samples collected 0, std error (σ/√n) 0.913. Take samples to begin.]
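
The standard-error readout can be reproduced by simulation. This sketch assumes the source is Uniform(0, 10), which is an inference: that distribution has σ = 10/√12 ≈ 2.887, and σ/√10 ≈ 0.913 matches the value shown at n = 10.

```python
import random
from statistics import pstdev

def sample_means(n=10, draws=5000, seed=1):
    """Means of `draws` independent samples of size n from Uniform(0, 10)."""
    rng = random.Random(seed)
    return [sum(rng.uniform(0, 10) for _ in range(n)) / n for _ in range(draws)]

sigma = 10 / 12 ** 0.5          # SD of Uniform(0, 10), the assumed source
theory = sigma / 10 ** 0.5      # standard error at n = 10
means = sample_means()

print("theoretical SE:", round(theory, 3))
print("empirical SD of sample means:", round(pstdev(means), 3))
```

The spread of the collected means should sit close to σ/√n even though the source itself is flat rather than bell-shaped.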

Null Distribution Explorer

Read the concept →

Drop an observed z-statistic and watch the tail shade. Then switch to p-curve mode and run 1,000 null experiments: the p-values scatter uniformly, with about 5% falling below 0.05 by chance alone.

[Interactive widget: null distribution explorer. Slider for z from -4.00 to 4.00, shown at 2.00. Readouts: |z| = 2.00, p (two-sided) = 0.0455, frequency 1 in 22.]

Uncommon under the null, but it happens in about 1 in 22 experiments by chance.
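
Both numbers follow from the standard normal distribution. The sketch below computes the two-sided p-value for z = 2.00 and then, as a stand-in for p-curve mode, draws 1,000 z-statistics under the null; the seed and function names are illustrative only.

```python
import random
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for a z-statistic under a standard normal null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p(2.0)
print(round(p, 4), "i.e. about 1 in", round(1 / p))

# 1,000 experiments where the null is true: z is just standard normal noise.
rng = random.Random(0)
null_ps = [two_sided_p(rng.gauss(0, 1)) for _ in range(1000)]
share_significant = sum(q < 0.05 for q in null_ps) / len(null_ps)
print("share of null p-values below 0.05:", share_significant)
```

Under the null the p-value is uniform on [0, 1], so roughly one experiment in twenty clears the 0.05 bar with no effect present.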

Effect Size Overlap

Read the concept →

Slide Cohen's d from 0 to 2 and watch two distributions pull apart. The shaded overlap shrinks while the chance a treated individual beats an untreated one climbs from 50% to 92%.

[Interactive widget: Effect Size Overlap. Slider for Cohen's d from 0.00 to 2.00, shown at 0.50. Readouts: Cohen's d 0.50, P(treated > ctrl) 64%, visual overlap 80%.]

Cohen's 'medium' (d ≈ 0.5). ~64% chance treated beats untreated — the curves visibly pull apart.
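
Both readouts have closed forms if the two groups are assumed normal with equal standard deviations (an assumption; the page does not state the model). The probability that a treated draw beats a control draw is Φ(d/√2), and the overlapping area of the two curves is 2Φ(-d/2).

```python
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal CDF

def prob_superiority(d):
    """P(treated > control) for two equal-variance normals separated by d SDs."""
    return PHI(d / 2 ** 0.5)

def overlap(d):
    """Overlapping coefficient: shared area under the two density curves."""
    return 2 * PHI(-abs(d) / 2)

for d in (0.0, 0.5, 1.0, 2.0):
    print(f"d={d:.1f}  P(treated > ctrl)={prob_superiority(d):.0%}  "
          f"overlap={overlap(d):.0%}")
```

At d = 0.5 this gives about 64% and 80%, and at d = 2 the probability reaches about 92%, matching the figures quoted above.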

√n Shrinkage

Read the concept →

Slide the sample size on a log scale from 10 to 10,000. Watch the margin of error shrink as 1/√n — then flip to mode B and watch the same tiny effect become 'highly significant' on n alone.

[Interactive widget: √n Shrinkage. Sample-size slider (log scale) from 10 to 10,000, shown at 10. Curve: error ∝ 1/√n. Readouts: sample size 10, margin of error (95%) ±31.0 pp, standard error 0.1581.]

At n=10, the margin is enormous — your estimate barely constrains anything.
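
The readouts are consistent with the worst-case standard error of a proportion, √(0.25/n), and a 95% margin of 1.96 standard errors. That reading is an inference from the numbers shown (0.1581 and ±31.0 pp at n = 10), not something the page states.

```python
def standard_error(n, p=0.5):
    """Standard error of a sample proportion; p = 0.5 is the worst case."""
    return (p * (1 - p) / n) ** 0.5

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% interval, as a proportion."""
    return z * standard_error(n, p)

for n in (10, 100, 1000, 10000):
    print(f"n={n:>5}  SE={standard_error(n):.4f}  "
          f"margin=±{100 * margin_of_error(n):.1f} pp")
```

Each tenfold increase in n shrinks the margin by only √10 ≈ 3.16, so going from ±31 pp to about ±1 pp takes a thousand times more data.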

Chartist Fallacy

Read the concept →

Three financial charts. Pick the one with a real upward trend — then reveal that all three are zero-drift random walks. Dial drift up to feel where 'real' starts.

[Interactive widget: Chartist Fallacy. Charts A, B, and C with a "your pick" selector. Drift-per-step slider from 0.000 to 0.150, shown at μ = 0.000.]

Three financial charts. Which one shows a real upward trend?

Trend or Noise

Read the concept →

Two panels. One holds a hidden upward trend; one is pure noise. Identify the trend across rounds — at small n, your accuracy will hover around 50%.

[Interactive widget: Trend or Noise. Panels: left, right. Points-per-panel slider from 10 to 200, shown at n = 30. Readouts: correct 0/0, accuracy not yet shown.]

Two panels. One holds a hidden upward trend; one is pure noise. Identify the trend.

Multiple Testing on Pure Noise

Read the concept →

Run K significance tests on pure noise, then watch Bonferroni and Benjamini-Hochberg sweep the false positives away. At K=20 with no correction, expect about one 'hit' — even though nothing is real.

[Interactive widget: Jelly Bean Lab. Slider for the number of tests from 1 to 100, shown at 20.]

Test  p-value
T1    0.627
T2    0.003
T3    0.527
T4    0.981
T5    0.968
T6    0.281
T7    0.613
T8    0.721
T9    0.426
T10   0.995
T11   0.455
T12   0.489
T13   0.139
T14   0.404
T15   0.248
T16   0.154
T17   0.489
T18   0.067
T19   0.395
T20   0.767

Tests run: 20. Significant: 1. Expected at p<0.05: 1.0.

Test K=20 colors at α=0.05 — about 1.0 'hit' expected from pure noise.
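
The two corrections can be applied directly to the twenty p-values listed above. This is a minimal sketch of the textbook procedures, not the widget's code: Bonferroni compares each p-value to α/K, and Benjamini-Hochberg rejects the k smallest p-values, where k is the largest rank with p(k) ≤ (k/K)·α.

```python
p_values = [0.627, 0.003, 0.527, 0.981, 0.968, 0.281, 0.613, 0.721, 0.426, 0.995,
            0.455, 0.489, 0.139, 0.404, 0.248, 0.154, 0.489, 0.067, 0.395, 0.767]
ALPHA = 0.05

def uncorrected(ps, alpha=ALPHA):
    return [p < alpha for p in ps]

def bonferroni(ps, alpha=ALPHA):
    # Controls the chance of any false positive across all tests.
    return [p < alpha / len(ps) for p in ps]

def benjamini_hochberg(ps, alpha=ALPHA):
    # Controls the expected share of false positives among rejections.
    m = len(ps)
    order = sorted(range(m), key=lambda i: ps[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if ps[i] <= rank / m * alpha:
            cutoff = rank            # keep the largest rank that passes
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject

print("expected hits from noise:", len(p_values) * ALPHA)
print("uncorrected:", sum(uncorrected(p_values)))
print("Bonferroni: ", sum(bonferroni(p_values)))
print("BH:         ", sum(benjamini_hochberg(p_values)))
```

The lone hit (T2, p = 0.003) narrowly misses both thresholds, which are 0.05/20 = 0.0025 for Bonferroni and for the first BH rank, so neither correction reports anything.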

Probability

False Positives Under Low Prevalence

Read the concept →

Adjust test accuracy and disease prevalence. See how many of the positive results are actually false alarms — and why rare conditions are hard to screen for.

[Interactive widget: base rate neglect. Sliders shown at 1% (range 0.5% to 5%), 95% (range 80% to 99%), and 90% (range 80% to 99%). Grid of 10,000 people, each square = 10 people. Legend: has disease, test positive; no disease, test positive; has disease, test negative; no disease, test negative.]

8.8% of positive tests are real (95 true positives, 990 false positives, among 1085 positive tests).

Posterior Probability by Scenario

Read the concept →

See why a test that flags 95% of true cases can still produce positives that are wrong most of the time when the condition is rare: the base rate and the false-positive rate decide the answer. Switch between disease testing, spam filtering, and fraud detection.

Bayes visualizer

A disease affects 1% of people. The test correctly identifies 95% who have it, but also flags 10% of healthy people.

[Grid of 100 people. Legend: tests positive, has condition; has condition, not detected; false positive; healthy, tests negative. Prior 1.0%, then a positive test, gives posterior 8.8%.]
Out of 100,000 people tested:
  • 950 test positive and have the condition
  • 9,900 test positive but do not
Of 10,850 positive results, 8.8% actually have the condition.
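
The same arithmetic written as Bayes' rule, using the three numbers from the scenario (1% prior, 95% detection rate, 10% false-positive rate). The function and parameter names are illustrative.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' rule."""
    true_pos = prior * sensitivity                   # has it and is flagged
    false_pos = (1 - prior) * false_positive_rate    # healthy but flagged
    return true_pos / (true_pos + false_pos)

print(f"{posterior(0.01, 0.95, 0.10):.1%}")

# The counts quoted above, for 100,000 people tested.
population = 100_000
sick = round(population * 0.01)
print(round(sick * 0.95), "true positives,",
      round((population - sick) * 0.10), "false positives")
```

Raising the prior is what moves the answer most: with the same test, a 50% prior gives a posterior of about 90%.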
[Slider from 0.1% to 50.0%, shown at 1.0%.]

Monty Hall: Should You Switch?

Read the concept →

Choose a door. The host reveals a goat. Should you switch? Run a thousand trials and let the numbers settle the argument.

Monty Hall simulator

Pick a door to start.
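
A minimal simulation of the standard game, assuming the host always opens a goat door other than the player's pick (the usual rules; the widget's exact rules are not spelled out here).

```python
import random

def play(switch, rng):
    """Play one round; return True if the final pick hides the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=1000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Switching wins whenever the first pick was a goat, which happens two times in three, so the rates settle near 1/3 for staying and 2/3 for switching.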

Causal Reasoning

When a Hidden Variable Drives Both

Read the concept →

Ice cream sales track drownings closely, until you recolor by season and the link vanishes. The DAG below names what you just saw: a third variable driving both.

[Interactive widget: confounder DAG. Readouts: pooled r = 0.86, within-season r = -0.08.]

More ice cream, more drownings — strong correlation. Sounds causal.
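
A small simulation shows how a confounder manufactures this pattern. All numbers below are hypothetical: each season sets a baseline for both variables, and within a season the two are generated independently.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
# Hypothetical seasonal baselines: (ice cream sales, drownings).
seasons = {"winter": (20, 2), "spring": (50, 5), "summer": (90, 9), "autumn": (45, 4)}
data = {name: [(rng.gauss(ice, 8), rng.gauss(drown, 1)) for _ in range(200)]
        for name, (ice, drown) in seasons.items()}

pooled = [pt for pts in data.values() for pt in pts]
pooled_r = pearson(*zip(*pooled))
within_r = {name: pearson(*zip(*pts)) for name, pts in data.items()}

print("pooled r:", round(pooled_r, 2))
for name, r in within_r.items():
    print(f"{name} r: {r:.2f}")
```

Pooled across seasons the correlation is strong, because season moves both baselines together; inside any one season it is close to zero, because nothing links the two once season is held fixed.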

Correlation Failure Modes

Read the concept →

Four real correlations, each broken by a different mechanism. Click through and watch the taxonomy emerge: confounder, reverse causation, chance, selection bias.

[Interactive widget: why a correlation lies. Readouts: Pearson r = 0.85; mechanism (no value shown).]

Ice cream sales track drownings tightly — but does ice cream cause drowning?

ML Intuition

Overfitting Explorer

Read the concept →

Fit polynomials of increasing degree to noisy data. Watch the training error fall while the test error climbs — and see exactly where the model starts memorizing instead of learning.

[Interactive widget: overfitting explorer. Polynomial-degree slider from 1 to 10, shown at 3 ("Good fit"). Legend: fitted curve, true function (sin), training points. Readouts: training error 0.0974, test error 0.1071.]

Data Thinking

Survivorship Bias — Wald's Planes

Read the concept →

See the WWII bomber damage pattern that Abraham Wald used to show that engineers were about to reinforce the wrong parts of the plane.

Survivorship bias — the Wald problem

Damage recorded on returning bombers. Engineers wanted to reinforce the areas with the most holes.

Returning planes (observed)

How Peeking Inflates False Positives

Read the field note →

Simulate 5,000 A/B tests under no real effect. Adjust how often you peek at the dashboard and watch the false positive rate climb from 5% to over 25%.

Peeking simulator

5,000 simulated A/B tests, no real effect — adjust how often you peek

[Slider for the number of looks from 1 to 30, shown at 10. Series: fixed-N FPR, peek-and-stop FPR.]
The dashed line marks the 5% rate the math is supposed to guarantee.
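
A compact version of the same simulation, offered as a sketch under simplifying assumptions: looks are equally spaced, each look adds one standardized batch of null data, and the test is a two-sided z-test at 1.96. The widget's actual setup may differ.

```python
import random

def false_positive_rates(looks=10, tests=5000, z_crit=1.96, seed=0):
    """Return (fixed-N FPR, peek-and-stop FPR) when there is no real effect."""
    rng = random.Random(seed)
    fixed = peeking = 0
    for _ in range(tests):
        total, crossed = 0.0, False
        for k in range(1, looks + 1):
            total += rng.gauss(0, 1)     # one more batch of data, no true effect
            z = total / k ** 0.5         # z-statistic on all data seen so far
            crossed = crossed or abs(z) > z_crit
        fixed += abs(z) > z_crit         # analyse once, at the planned end
        peeking += crossed               # declare a win at the first significant look
    return fixed / tests, peeking / tests

fixed_fpr, peek_fpr = false_positive_rates(looks=10)
print(f"fixed-N FPR: {fixed_fpr:.3f}   peek-and-stop FPR: {peek_fpr:.3f}")
```

With a single look the two rates coincide at about 5%. Every extra look is another chance for noise to cross the threshold, so the peek-and-stop rate rises with the number of looks.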

Average vs. typical user lifetime

Read the field note →

Slide the power-user fraction from 0 to 30% and watch the mean lifetime climb to more than 20× the median. The 'average user' is the gap between these two numbers.

[Interactive widget: average vs. typical user lifetime. Power-user-fraction slider from 0.00 to 0.30, shown at 0.15. Legend: median, mean. Readouts: median lifetime 3.5, mean lifetime 78, mean / median 22.2×.]

15% power users → the mean lifetime is ~22× the median. The "average user" is the gap.