About

Quantitative intuition for people who work with data.

DataSense is a reading and thinking tool for people who work with data — analysts, engineers, researchers, and decision-makers who want the conceptual depth their training skipped.

Most statistics resources start with equations. DataSense starts with intuition: a surprising question, a visual, an interactive experiment. The math follows once you have the right mental model to attach it to.

The content covers statistics, probability, causal reasoning, data thinking, machine learning intuition, and quantitative finance. No mathematics degree is required; curiosity is enough.

The problem

Quantitative reasoning is one of the most practically useful skills a person can develop, and one of the most poorly taught. The online landscape offers two options: shallow pop-science summaries that leave you with nothing durable, or university-level textbooks that assume prior mathematical fluency most people don't have.

The gap between them is enormous, and plenty of working professionals sit inside it: a data analyst who can run a regression but has no feel for what a confidence interval actually means, a manager who approves A/B tests without understanding what p < 0.05 does and doesn't guarantee, a journalist who reports a relative risk without knowing the base rate. These are not edge cases.

DataSense is aimed at that gap.

The approach

Every concept on DataSense follows the same sequence:

A surprising question — something that reveals the concept is more interesting or more subtle than it appears.
A visual — a chart or diagram that makes the idea concrete before any terminology is introduced.
An interactive experiment — a simulation you can manipulate to see the concept respond in real time.
The formal math — which appears only once the reader has the right mental model to attach it to.

If a page leads with formulas, the page is wrong. The math is not the destination — it is the notation for an idea the reader already understands.

Six threads that run through everything

DataSense is not a list of topics. Six load-bearing mental models recur across statistics, probability, causal reasoning, and machine learning. Every concept is an instance of at least one thread.

Signal vs. Noise

Is this pattern real, or is it random?

The deepest thread in the curriculum. Almost everything in statistics is a different angle on this question — variance, the central limit theorem, p-values, confidence intervals, overfitting, effect size.
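As a rough illustration of this thread (a sketch only, not a DataSense tool), the simulation below draws two groups from the same distribution, so any gap between their means is noise by construction. The helper name `noise_only_gap_rate`, the group sizes, and the 0.5 standard deviation threshold are assumptions chosen for the example.

```python
import random
import statistics


def noise_only_gap_rate(n, threshold=0.5, trials=2000, seed=0):
    """Share of trials where two groups drawn from the SAME standard normal
    distribution differ in mean by more than `threshold`.

    Hypothetical helper for illustration: there is no real effect here,
    so every gap it counts is noise.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if abs(statistics.fmean(a) - statistics.fmean(b)) > threshold:
            hits += 1
    return hits / trials


small = noise_only_gap_rate(n=20)
large = noise_only_gap_rate(n=200)
print(f"n=20:  gap > 0.5 in {small:.1%} of noise-only trials")
print(f"n=200: gap > 0.5 in {large:.1%} of noise-only trials")
```

With 20 observations per group, a gap of half a standard deviation should appear in roughly one noise-only trial in ten; with 200 per group it all but disappears.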

Reference Class Reasoning

Compared to what category of similar events?

The question is always: what is the right base rate? What have things like this done before? This thread runs through Bayes' theorem, base rate neglect, survivorship bias, and regression to the mean.
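A small worked example may make the base rate question concrete. The sketch below applies Bayes' theorem to a screening test. The function name and all three input numbers are hypothetical, picked only to show how strongly the base rate drives the answer.

```python
def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers: 1% base rate, 90% sensitivity, 9% false positives.
ppv = positive_predictive_value(0.01, 0.90, 0.09)
print(f"Chance a positive result is a true positive: {ppv:.1%}")  # about 9.2%
```

Even with a test that catches 90% of true cases, most positives are false when the condition is rare, because the many unaffected people generate more false alarms than the few affected people generate true ones.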

The Causation Ladder

Association → Intervention → Counterfactual.

Knowing that X and Y move together is the bottom rung. Knowing that changing X changes Y is the middle. Knowing what would have happened without the change is the top. Most data analysis lives on the bottom rung and pretends otherwise.

Uncertainty Quantification

How do you put numbers on what you don't know?

The aim is not to eliminate uncertainty but to characterize it honestly. Probability, confidence intervals, prediction intervals, and Bayesian updating are all tools for thinking clearly about incomplete information.

The Optimization Trap

When optimizing for a proxy loses the thing you actually wanted.

Goodhart's Law. Overfitting. P-hacking. Multiple comparisons. Survivorship bias in fund returns. The same failure mode appears across domains: optimize a measure hard enough and it stops tracking the thing it was meant to measure.
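The multiple comparisons case can be sketched in a few lines. The example below assumes independent tests and a 0.05 threshold, and both function names are hypothetical. It computes the chance of at least one false alarm when nothing real is going on, then checks the formula by simulation.

```python
import random


def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests
    comes out 'significant' when no real effect exists anywhere."""
    return 1 - (1 - alpha) ** num_tests


def simulated_false_alarm_rate(num_tests, alpha=0.05, trials=5000, seed=0):
    """Monte Carlo check. Under a true null with a continuous test statistic,
    a p-value is uniform on [0, 1], so each one is drawn with rng.random()."""
    rng = random.Random(seed)
    alarms = sum(
        1
        for _ in range(trials)
        if min(rng.random() for _ in range(num_tests)) < alpha
    )
    return alarms / trials


print(f"1 metric:   {chance_of_false_alarm(1):.0%}")   # 5%
print(f"20 metrics: {chance_of_false_alarm(20):.0%}")  # 64%
print(f"20 metrics, simulated: {simulated_false_alarm_rate(20):.0%}")
```

Checking twenty metrics and reporting whichever one clears the threshold is a search for a "significant" result, not a test of one, and under these assumptions that search succeeds about two times in three on pure noise.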

Tails and Extremes

What happens at the edges of distributions?

Normal distribution assumptions fail at the worst possible moments. Fat tails, extreme value statistics, random walks, black swan events — the cases where the standard models give confidently wrong answers.
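To give a sense of scale, the sketch below compares tail probabilities under a normal model and under one fat-tailed alternative. The alternative used here, a Student's t distribution with 3 degrees of freedom rescaled to unit variance, is an assumption made for illustration, as are the function names. It is not a claim about which model fits any particular data.

```python
import math
from statistics import NormalDist


def normal_tail(k):
    """P(X > k) for a standard normal variable."""
    return 1 - NormalDist().cdf(k)


def fat_tail(k):
    """P(X > k) for a Student's t variable with 3 degrees of freedom,
    rescaled to unit variance so it is comparable to the normal.
    Uses the closed-form CDF that exists for 3 degrees of freedom."""
    return 0.5 - (k / (1 + k * k) + math.atan(k)) / math.pi


for k in (2, 5):
    ratio = fat_tail(k) / normal_tail(k)
    print(f"beyond {k} sd: normal {normal_tail(k):.2e}, "
          f"fat-tailed {fat_tail(k):.2e}, ratio {ratio:.1f}")
```

At two standard deviations the two models roughly agree. At five, the fat-tailed model rates the event several thousand times more likely, which is the sense in which a normal assumption can be confidently wrong exactly where it matters.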

What you'll find here

Field Notes — real-world case studies showing how working analysts and engineers get the math wrong, and the intuition that catches it.
Concepts — structured explainers that build from intuition to formalism, one idea at a time.
Playground — standalone interactive tools you can manipulate to see concepts in motion.
Paths — curated sequences for readers who prefer learning systematically.

What's in scope

Statistics

Distributions, inference, uncertainty, estimation

Probability

Chance, conditional reasoning, rare events

Causal Reasoning

Confounding, experiments, counterfactuals

Data Thinking

Bias, fallacies, how to read evidence

ML Intuition

How models learn, overfit, and generalize

Quantitative Finance

Volatility, compounding, risk, simulation

Deliberately out of scope: proofs and formal derivations, programming and implementation, specific software tools (R, Python, SQL), measure theory. The line is simple — if it helps someone reason better, it is in scope. If it helps someone implement better, it belongs elsewhere.

Have a concept suggestion or spotted an error? Open an issue on GitHub.