Standard Deviation Calculator
Sample Std Dev (s) =
6 values · mean = 18
Show Your Work
All Summary Statistics
- Mean (x̄)
- 18
- Sample Variance (s²)
- 182
- Population Std Dev (σ)
- 12.315302
- Count (n)
- 6
- Sum (Σxᵢ)
- 108
- Sum of squared deviations
- 910
6 values · mean = 18
Use the sample formula when your data is a subset drawn from a larger population — the (n − 1) divisor (Bessel's correction) adjusts for the downward bias that creeps in when the sample mean stands in for the unknown true population mean. This is the default for almost all real-world reporting: lab replicates, survey responses, A/B test cohorts, and stock-return windows are all samples.
s = √( Σ(xᵢ − x̄)² / (n − 1) )
Use the population formula only when you have measured every member of the group with no sampling involved — e.g., the final exam scores of every student in a class, the diameters of every part in a finished batch, or any complete enumeration. Dividing by N (not n − 1) is correct because there is no unobserved population behind the data.
σ = √( Σ(xᵢ − x̄)² / N )
Standard deviation measures how far values typically sit from the mean. Subtract the mean from each value, square the deviation (so positive and negative gaps don't cancel), average those squared deviations to get the variance, then take the square root to return to the original units. The sample formula divides the sum of squared deviations by n − 1 instead of n because the sample mean is itself estimated from the same data — using n alone would systematically underestimate the true spread. That (n − 1) divisor is Bessel's correction, and it's the single most-asked-about detail in introductory statistics. The square root at the end converts variance (squared units) back into the same units as the inputs, which is why people quote standard deviation rather than variance when describing real-world spread.
Find the sample and population standard deviation of the classic dataset [4, 8, 15, 16, 23, 42] — six observations.
Note that sample SD is always larger than population SD for the same data — dividing by n − 1 instead of n inflates the result. The gap narrows quickly as n grows: for n = 100 the two differ by only about 0.5%, but for n = 5 the sample SD is roughly 12% larger than the population SD.
Three ideas are worth pinning down. First, sample vs population: in practice almost every dataset is a sample (you rarely measure an entire population), so the sample formula with (n − 1) is the default. Use the population formula only when you genuinely have the complete set. Second, the 68-95-99.7 rule (the empirical rule): for data that is approximately normally distributed, about 68% of values fall within ±1σ of the mean, about 95% within ±2σ, and about 99.7% within ±3σ. This is a quick mental check for normality and a fast way to spot outliers — any value more than 3σ from the mean is unusual under a normal model. Third, variance vs standard deviation: variance is the average squared deviation (units²); standard deviation is its square root (original units). Variance is the workhorse for statistical theory because it adds cleanly across independent random variables, but SD is the value people quote in reports because its units are interpretable.
Standard deviation measures how spread out a set of numbers is around their mean. A small SD means values cluster tightly near the average; a large SD means they're scattered widely. It's reported in the same units as the data (dollars, inches, °C, etc.), which makes it the most-quoted descriptive statistic for spread alongside the mean.
First compute the mean (sum ÷ count). Then subtract the mean from each value and square the result — this is the squared deviation. Sum all squared deviations, divide by either n (for a population) or n − 1 (for a sample) to get the variance, then take the square root. The square root is the standard deviation; the value before the square root is the variance.
Population SD (σ) divides the sum of squared deviations by N — the total population size — and assumes you've measured every member. Sample SD (s) divides by n − 1, where n is the sample size, and assumes the data is a subset drawn from a larger unknown population. In practice you almost always have a sample, so the sample formula is the default in research, business, and lab work.
The sample mean is itself estimated from the same data used to compute deviations, which makes the squared deviations slightly smaller than they would be against the true unknown population mean. Dividing by n − 1 (Bessel's correction) compensates for that downward bias and produces an unbiased estimator of the population variance. The intuition: with n data points, you've already 'used up' one degree of freedom estimating the mean, leaving only n − 1 independent deviations.
There is no universal good value — it depends entirely on context and on the mean. A SD of 0.1 °C is huge for body-temperature measurements but tiny for daily outdoor temperatures. The coefficient of variation (SD ÷ mean × 100%) is unit-free and lets you compare spread across different scales: CV under 10% is generally considered low, 10–30% moderate, and above 30% high for most natural processes.
Also called the empirical rule, it states that for data that is approximately normally distributed, about 68% of values fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three. It's a quick mental shortcut for spotting outliers and judging how spread out data is — a value more than 3σ from the mean is unusual under a normal model.
Variance is the average squared deviation from the mean; standard deviation is the square root of the variance. They convey the same information about spread, but SD is in the original units of the data while variance is in squared units. Variance is the standard tool for statistical theory (it adds cleanly across independent variables); SD is what gets reported because the units are interpretable.
Standard deviation works well for roughly symmetric, single-peak distributions — most lab measurements, test scores, and physical processes. For skewed data (income, real estate prices, response times) or data with extreme outliers, the interquartile range (Q3 − Q1) and median are more robust because they ignore the tails. A rule of thumb: if the mean and median differ noticeably, prefer the IQR.
Reference: Standard formulas as defined in NIST/SEMATECH e-Handbook of Statistical Methods (§1.3.5.6, descriptive statistics) and Casella & Berger, Statistical Inference (2nd ed.), §7.3.1 (Bessel's correction). The empirical 68-95-99.7 rule is exact for a Gaussian distribution.
Standard deviation is the square root of variance. Both forms share the same numerator — the sum of squared deviations from the mean — and differ only in the divisor:
Where:
The shaded band shows the 68% of values that fall within one standard deviation of the mean under a normal distribution; the wider bracket covers ±2σ (≈95%), and the full curve to ±3σ covers ≈99.7%. This is the 68-95-99.7 empirical rule, the fastest way to translate a standard-deviation number into intuition about spread.
Lab Research — Replicate Measurements
A microbiology lab measures the optical density of five replicate culture plates: 0.412, 0.418, 0.405, 0.421, 0.409. Report the mean and sample standard deviation.
Mean ≈ 0.413 OD, s ≈ 0.0065 OD
Report as 0.413 ± 0.007 OD (1 SD). The five plates are a sample representing the underlying biological replicate distribution, so the sample formula with n − 1 is correct.
Finance — Monthly Return Volatility
A balanced index fund's monthly returns over six months (%): 1.8, −0.6, 2.4, 0.5, −1.2, 1.1. Find the mean return and sample standard deviation as a volatility proxy.
Mean ≈ 0.67%, monthly volatility s ≈ 1.42%
Annualize by multiplying by √12 ≈ 3.46, giving annual volatility ≈ 4.93%. Most managed funds report annualized volatility this way for cross-fund comparison.
Quality Control — Complete-Batch Inspection
Every part in a batch of eight precision washers is measured for thickness (mm): 2.51, 2.49, 2.50, 2.52, 2.48, 2.50, 2.51, 2.49. Because every washer in the batch was measured, treat it as a complete population.
Mean = 2.50 mm, σ ≈ 0.012 mm
For a ±0.05 mm tolerance, the process capability index Cpk = 0.05 / (3σ) ≈ 1.37 — comfortably above the 1.33 threshold considered capable. Use the population formula here because every washer in the batch was measured; there's no larger unobserved population.