# Standard error of the mean

When you take a sample of observations from a population, the mean of the sample is an estimate of the parametric mean, or mean of all of the observations in the population. If your sample size is small, your estimate of the mean won't be as good as an estimate based on a larger sample size. Here are 10 random samples from a simulated data set with a true (parametric) mean of 5. The X's represent the individual observations, the red circles are the sample means, and the blue line is the parametric mean.

 Individual observations (X's) and means (circles) for random samples from a population with a parametric mean of 5 (horizontal line).

As you can see, with a sample size of only 3, some of the sample means aren't very close to the parametric mean. The first sample happened to be three observations that were all greater than 5, so the sample mean is too high. The second sample has three observations that were less than 5, so the sample mean is too low. With 20 observations per sample, the sample means are generally closer to the parametric mean.

You'd often like to give some indication of how close your sample mean is likely to be to the parametric mean. One way to do this is with the standard error of the mean. If you take many random samples from a population, the standard error of the mean is the standard deviation of the different sample means. About two-thirds (68.3%) of the sample means would be within one standard error of the parametric mean, 95.4% would be within two standard errors, and almost all (99.7%) would be within three standard errors.

 Means of 100 random samples (N=3) from a population with a parametric mean of 5 (horizontal line).

Here's a figure illustrating this. I took 100 samples of 3 from a population with a parametric mean of 5 (shown by the blue line). The standard deviation of the 100 means was 0.63, so 0.63 is the standard error of the mean for samples of this size. Of the 100 sample means, 70 are between 4.37 and 5.63 (the parametric mean ±1 standard error).
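The idea of "the standard deviation of many sample means" can be checked with a short simulation. This is only a sketch: the text doesn't say what distribution the simulated population follows, so the normal distribution and the population standard deviation of 1.0 are assumptions, as are the seed and the number of samples.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable

POP_MEAN = 5.0           # parametric mean, as in the text
POP_SD = 1.0             # assumed; the text does not give the population SD
N = 3                    # observations per sample
NUM_SAMPLES = 10_000     # many more than the 100 in the figure, to reduce noise

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(NUM_SAMPLES):
    ys = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(ys))

# The standard error of the mean is the standard deviation of those means.
se_of_mean = statistics.stdev(sample_means)

# Fraction of sample means within one standard error of the parametric mean.
within_one_se = sum(
    abs(m - POP_MEAN) <= se_of_mean for m in sample_means
) / NUM_SAMPLES

print(f"standard error of the mean: {se_of_mean:.3f}")
print(f"fraction within 1 SE:       {within_one_se:.3f}")
```

Under these assumptions, the fraction should come out near two-thirds, and the standard error should be close to the population standard deviation divided by the square root of 3.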

Usually you won't have multiple samples to use in making multiple estimates of the mean. Fortunately, it is possible to estimate the standard error of the mean using the sample size and standard deviation of a single sample of observations. The standard error of the mean is estimated by the standard deviation of the observations divided by the square root of the sample size. For some reason, there's no spreadsheet function for standard error, but you can use =STDEV(Ys)/SQRT(COUNT(Ys)), where Ys is the range of cells containing your data.
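Outside a spreadsheet, the same calculation takes only a few lines. Here is a minimal Python sketch; the function name `standard_error` and the three data values are made up for illustration.

```python
import math
import statistics

def standard_error(ys):
    """Estimate the standard error of the mean from a single sample:
    sample standard deviation divided by the square root of the sample size."""
    if len(ys) < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(ys) / math.sqrt(len(ys))

ys = [4.1, 5.6, 5.3]   # hypothetical observations, not from the text
print(f"standard error of the mean: {standard_error(ys):.3f}")
```

`statistics.stdev` uses the N-1 denominator, which matches the spreadsheet's STDEV function.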

This figure is the same as the one above, only this time I've added error bars indicating ±1 standard error. Because the estimate of the standard error is based on only three observations, it varies a lot from sample to sample.

 Means ±1 standard error of 100 random samples (N=3) from a population with a parametric mean of 5 (horizontal line).

With a sample size of 20, each estimate of the standard error is more accurate. Of the 100 samples in the graph below, 68 include the parametric mean within ±1 standard error of the sample mean.

 Means ±1 standard error of 100 random samples (N=20) from a population with a parametric mean of 5 (horizontal line).
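To see how sample size affects these error bars, you can count how often the sample mean ±1 estimated standard error includes the parametric mean. The sketch below does this for samples of 3 and of 20. The normal population with a standard deviation of 1.0, the seed, and the number of samples are all assumptions, not values from the text.

```python
import math
import random
import statistics

rng = random.Random(3)   # fixed seed so the run is repeatable

POP_MEAN = 5.0           # parametric mean, as in the text
POP_SD = 1.0             # assumed population standard deviation
NUM_SAMPLES = 5_000      # samples drawn for each sample size

def coverage(n):
    """Fraction of samples whose mean ± 1 estimated SE includes POP_MEAN."""
    hits = 0
    for _ in range(NUM_SAMPLES):
        ys = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        se = statistics.stdev(ys) / math.sqrt(n)   # SE estimated from this sample alone
        if abs(statistics.mean(ys) - POP_MEAN) <= se:
            hits += 1
    return hits / NUM_SAMPLES

coverage_n3 = coverage(3)
coverage_n20 = coverage(20)
print(f"N=3:  {coverage_n3:.3f} of the ±1 SE bars include the parametric mean")
print(f"N=20: {coverage_n20:.3f} of the ±1 SE bars include the parametric mean")
```

With only three observations, the standard error is estimated poorly, so the bars tend to include the parametric mean less often than two-thirds of the time. With 20 observations, the fraction is much closer to two-thirds.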

As you increase your sample size, the sample standard deviation will fluctuate, but it will not consistently increase or decrease. It will become a more accurate estimate of the parametric standard deviation of the population. In contrast, the standard error of the mean will become smaller as the sample size increases. With bigger sample sizes, the sample mean becomes a more accurate estimate of the parametric mean, so the standard error of the mean becomes smaller.
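The contrast between the two statistics shows up clearly if you compute both for samples of increasing size. This is a sketch under assumed conditions: a normal population with a mean of 5 and a standard deviation of 1.0, and an arbitrary set of sample sizes.

```python
import math
import random
import statistics

rng = random.Random(2)         # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 5.0, 1.0    # POP_SD is assumed; the text does not give it

results = {}                   # sample size -> (standard deviation, standard error)
for n in (3, 20, 100, 1000, 10000):
    ys = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    sd = statistics.stdev(ys)  # estimates the parametric standard deviation
    se = sd / math.sqrt(n)     # estimates the standard error of the mean
    results[n] = (sd, se)
    print(f"N={n:>5}  SD={sd:.3f}  SE={se:.4f}")
```

The standard deviation column should wander around the assumed value of 1.0 and settle closer to it as N grows, while the standard error column keeps shrinking, roughly in proportion to one over the square root of N.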

"Standard error of the mean" and "standard deviation of the mean" are equivalent terms. "Standard error of the mean" is generally used to avoid confusion with the standard deviation of observations. Sometimes "standard error" is used by itself; this almost certainly indicates the standard error of the mean, but because there are also statistics for standard error of the variance, standard error of the median, etc., you should specify standard error of the mean.

### Similar statistics

Confidence intervals and standard error of the mean serve the same purpose, to express the reliability of an estimate of the mean. In some publications, vertical error bars on data points represent the standard error of the mean, while in other publications they represent 95% confidence intervals. I prefer 95% confidence intervals. When I see a graph with a bunch of points and vertical bars representing means and confidence intervals, I know that most (95%) of the vertical bars include the parametric means. When the vertical bars are standard errors of the mean, only about two-thirds of the bars are expected to include the parametric means; I have to mentally double the bars to get the approximate size of the 95% confidence interval. In addition, for very small sample sizes, the 95% confidence interval is larger than twice the standard error, and the correction factor is even more difficult to do in your head. Whichever statistic you decide to use, be sure to make it clear what the error bars on your graphs represent. I have seen lots of graphs in scientific journals that gave no clue about what the error bars represent, which makes them pretty useless.
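To make the "larger than twice the standard error" point concrete, here is a small sketch that computes a 95% confidence interval as the mean ± t × standard error. The critical values of Student's t are standard two-tailed 5% table values, hardcoded for a handful of degrees of freedom because Python's standard library has no t-distribution. The function name `ci95` and the data are hypothetical.

```python
import math
import statistics

# Two-tailed 5% critical values of Student's t, keyed by degrees of freedom (N - 1).
T_CRIT_95 = {2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093, 99: 1.984}

def ci95(ys):
    """Return (mean, standard error, lower limit, upper limit) for a 95% CI."""
    df = len(ys) - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no tabulated t value for {df} degrees of freedom")
    mean = statistics.mean(ys)
    se = statistics.stdev(ys) / math.sqrt(len(ys))
    half_width = T_CRIT_95[df] * se
    return mean, se, mean - half_width, mean + half_width

ys = [4.1, 5.6, 5.3]   # hypothetical sample of three observations
mean, se, lower, upper = ci95(ys)
print(f"mean={mean:.2f}  SE={se:.3f}  95% CI=({lower:.2f}, {upper:.2f})")
print(f"CI half-width is {(upper - mean) / se:.2f} standard errors")
```

With three observations, the confidence limits are more than four standard errors from the mean, not two. By 20 observations the multiplier has dropped to about 2.1, which is why doubling the error bars works as a rough rule for larger samples.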

Standard deviation and coefficient of variation are used to show how much variation there is among individual observations, while standard error or confidence intervals are used to show how good your estimate of the mean is. The only time you would report standard deviation or coefficient of variation would be if you're actually interested in the amount of variation. For example, if you grew a bunch of soybean plants with two different kinds of fertilizer, your main interest would probably be whether the yield of soybeans was different, so you'd report the mean yield ± either standard error or confidence intervals. If you were going to do artificial selection on the soybeans to breed for better yield, you might be interested in which treatment had the greatest variation (making it easier to pick the fastest-growing soybeans), so then you'd report the standard deviation or coefficient of variation.

There's no point in reporting both standard error of the mean and standard deviation. As long as you report one of them, plus the sample size (N), anyone who needs to can calculate the other one.
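The conversion in either direction is just a multiplication or division by the square root of N. A minimal sketch, with made-up function names and numbers:

```python
import math

def sd_from_se(se, n):
    """Recover the standard deviation from a reported standard error and sample size."""
    return se * math.sqrt(n)

def se_from_sd(sd, n):
    """Compute the standard error from a reported standard deviation and sample size."""
    return sd / math.sqrt(n)

# Hypothetical report: standard deviation of 2.5 with N = 16.
se = se_from_sd(2.5, 16)
print(f"SE = {se}")                       # 2.5 / 4
print(f"SD = {sd_from_se(se, 16)}")       # back to the reported standard deviation
```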

### Example

The standard error of the mean for the blacknose dace data from the central tendency web page is 10.70.

### How to calculate the standard error

The descriptive statistics spreadsheet calculates the standard error of the mean for up to 1000 observations, using the function =STDEV(Ys)/SQRT(COUNT(Ys)).

#### Web pages

Web pages that will calculate standard error of the mean are here, here, and here.

#### SAS

PROC UNIVARIATE will calculate the standard error of the mean. For examples, see the central tendency web page.

