Variance Calculator
Sample Variance (s²) =
Sample SD (s) = 13.490738 · n = 6 · mean = 18
Sample SD (s) = 13.490738 · n = 6 · mean = 18
Use sample variance when your data is a random subset of a larger population and you want to estimate the unknown population variance. Dividing by n − 1 (Bessel's correction) compensates for the downward bias that arises from using the sample mean in place of the true mean.
s² = Σ(xᵢ − x̄)² / (n − 1)
Use population variance when you have measured every member of the group — no estimation, no Bessel correction. The divisor is N, the full count of items, and the result is the exact average squared deviation of the population.
σ² = Σ(xᵢ − x̄)² / N
Variance measures how spread out a set of numbers is around their mean. Compute it in three steps: (1) find the mean x̄ by adding the values and dividing by n; (2) compute each value's deviation from the mean and square it — squaring removes the sign so positives and negatives can't cancel, and it amplifies large deviations more than small ones; (3) average those squared deviations. The average uses divisor n for population variance (σ², full data set) or n − 1 for sample variance (s², a random subset estimating a larger population). The squaring is what makes variance distinct from mean absolute deviation: it makes the math analytically tractable (derivatives work cleanly) and it weights outliers more heavily, which matches the behavior most scientific and financial applications want.
Find the sample variance of the data set {4, 8, 15, 16, 23, 42}. This is the canonical 'Lost' numbers example and a textbook Wolfram Alpha verification case.
Sample variance is always larger than population variance for the same data because the divisor (n − 1) is smaller. The gap shrinks as n grows — by n = 100 the two differ by about 1 %, by n = 1000 by 0.1 %.
Three distinctions are worth keeping straight. First, units: variance is in the squared units of the input. If your data is in meters, the variance is in m². That's the main reason standard deviation (the square root of variance) is reported in practice — it's in the same units as the data. Second, sample vs population: divide by N when your data is the full population, divide by n − 1 (Bessel's correction) when your data is a random sample meant to estimate a larger population. The Bessel correction exists because using the sample mean x̄ in place of the unknown true mean μ systematically underestimates the spread; dividing by n − 1 instead of n unbiases the estimator. Third, variance vs standard deviation: variance is the squared average deviation; SD is its square root. SD is reported when communicating spread to humans (same units as data); variance shows up in the math underneath (ANOVA, regression, error propagation, portfolio theory) because squared quantities add cleanly while square roots do not.
Variance is a statistical measure of how spread out a set of numbers is around their mean. It's defined as the average of the squared deviations from the mean — large variance means values are far from the mean on average, small variance means they cluster tightly around it. Variance is always non-negative and is zero only when every value equals the mean.
First compute the mean by adding the values and dividing by the count. Then for each value, subtract the mean and square the result. Sum those squared deviations and divide — by n for population variance (σ²) or by n − 1 for sample variance (s²). The square root of the variance is the standard deviation.
Standard deviation is the square root of variance. They measure the same thing — spread around the mean — but in different units. If your data is in meters, the variance is in m² (meters squared) and the standard deviation is in meters. Standard deviation is preferred when communicating spread to humans because it's in the original units; variance shows up in the math because squared quantities add cleanly across independent sources.
Because every deviation is squared before averaging. Squaring turns negative deviations into positive numbers, so the sum can never go below zero. Variance is exactly zero only when every value in the data set equals the mean — i.e., the data has no spread at all.
Use sample variance (divide by n − 1) when your data is a random subset meant to estimate the spread of a larger population. Use population variance (divide by N) when your data is the entire population and you don't need to estimate anything beyond it. The n − 1 divisor — called Bessel's correction — compensates for the bias of using the sample mean in place of the unknown true mean. For large n the two values are nearly identical; for small n the difference matters.
Variance is in the squared units of the input data. If the data is in meters, the variance is in m² (meters squared). If the data is in dollars, the variance is in dollars squared. This is one reason standard deviation (the square root of variance) is more commonly reported — it shares the same units as the original data, which makes it easier to interpret.
Standard deviation is the positive square root of variance: σ = √σ² and s = √s². If you know one, you know the other. Variance is the primitive quantity in the math (it adds cleanly for independent variables, partitions naturally in ANOVA, and has clean derivatives for optimization); standard deviation is the human-friendly quantity for reporting because it's in the original units.
Two reasons. Mathematically, squaring is differentiable everywhere while the absolute value isn't — which matters for least-squares regression, gradient methods, and most of mathematical statistics. Statistically, squaring weights large deviations more than small ones, which matches what scientific and financial applications usually want: an outlier should affect a spread measure more than a near-mean point. The version that averages absolute deviations is called the mean absolute deviation (MAD); it's used occasionally but variance and standard deviation dominate practice.
Reference: Standard textbook definitions per the NIST/SEMATECH e-Handbook of Statistical Methods (§1.3.5.1 Measures of Location and Variability) and Wolfram MathWorld's articles on Variance and Sample Variance. Canonical example {4, 8, 15, 16, 23, 42} verified against Wolfram Alpha (sample variance = 182, population variance = 910/6).
Variance is the average squared deviation from the mean. Two forms exist depending on whether your data is a sample or the full population:
Where:
Squaring the deviations is what makes variance distinct from a simple average: it forces every contribution to be non-negative (so positives and negatives can't cancel) and it amplifies outliers more than near-mean points. The result is always in the squared units of the input — m² if your data is in meters, dollars² if it's in dollars — which is the main reason standard deviation (the square root of variance) is what most reports show to humans.
Statistics — Classroom Test Scores
A small class scores 72, 78, 85, 88, 92, 95 on a midterm. Treat the class as a sample of all possible students and find the sample variance.
Sample variance = 75.2 points², SD ≈ 8.67 points.
The variance is in points² — squared test points. The standard deviation, in plain points, is what you would report on a grade-distribution summary.
Finance — Monthly Stock Returns
A stock's monthly returns over six months (%) are: 2.1, −1.3, 3.4, 0.8, −0.5, 1.9. Treat this six-month window as the complete population for the period and compute the population variance.
Population variance ≈ 2.62 %², SD ≈ 1.62 %.
Annualized monthly variance ≈ 2.62 × 12 ≈ 31.5 %² → annualized SD ≈ 5.6 %. Variance scales with the time horizon; standard deviation scales with the square root of time.
Quality Control — CNC Machined Parts
A CNC machine cuts shafts with a target diameter of 25.00 mm. Eight consecutive parts measure (mm): 24.98, 25.02, 25.00, 24.99, 25.01, 25.00, 24.97, 25.03. Treat these eight as the complete population for this run and compute the population variance.
Population variance = 0.00035 mm², SD ≈ 0.0187 mm.
For a tolerance of ±0.05 mm, Cpk ≈ 0.05 / (3σ) ≈ 0.89 — borderline capable. A higher Cpk requires tightening the process to reduce variance.