Analysis of covariance
When to use it
Analysis of covariance (ancova) is used when you have two measurement variables and two nominal variables. One of the nominal variables is the "hidden" nominal variable that groups the measurement observations into pairs, and the other nominal variable divides the regressions into two or more sets.
The purpose of ancova is to compare two or more linear regression lines. It is a way of comparing the Y variable among groups while statistically controlling for variation in Y caused by variation in the X variable. For example, let's say you want to know whether Cope's gray treefrog, Hyla chrysoscelis, has a different calling rate than the eastern gray treefrog, Hyla versicolor, which has twice as many chromosomes as H. chrysoscelis but is morphologically identical. As shown on the regression web page, the calling rate of eastern gray treefrogs is correlated with temperature, so you need to control for that. One way to control for temperature would be to bring the two species of frogs into a lab and keep them all at the same temperature, but you'd have to worry about whether their behavior in an artificial lab environment was really the same as in nature. Instead, it would be better to measure the calling rate of each species of frog at a variety of temperatures, then use ancova to see whether the regression line of calling rate on temperature is significantly different between the two species.
Null hypotheses
Two null hypotheses are tested in an ancova. The first is that the slopes of the regression lines are all the same. If this hypothesis is not rejected, the second null hypothesis is tested: that the Y-intercepts of the regression lines are all the same.
Although the most common use of ancova is for comparing two regression lines, it is possible to compare three or more regressions. If their slopes are all the same, it is then possible to do planned or unplanned comparisons of Y-intercepts, similar to the planned or unplanned comparisons of means in an anova. I won't cover that here.
How it works
The first step in performing an ancova is to compute each regression line. In the frog example, there are two values of the species nominal variable, Hyla chrysoscelis and H. versicolor, so the regression line is calculated for calling rate vs. temperature for each species of frog.
Next, the slopes of the regression lines are compared; the null hypothesis that the slopes are the same is tested. The final step of the ancova, comparing the Y-intercepts, cannot be performed if the slopes are significantly different from each other. If the slopes of the regression lines are different, the lines cross each other somewhere, and one group has higher Y values in one part of the graph and lower Y values in another part of the graph. (If the slopes are different, there are techniques for testing the null hypothesis that the regression lines have the same Y-value for a particular X-value, but they're not used very often and I won't consider them here.)
If the slopes are significantly different, the ancova is done, and all you can say is that the slopes are significantly different. If the slopes are not significantly different, the next step in an ancova is to draw a regression line through each group of points, all with the same slope. This common slope is a weighted average of the slopes of the different groups.
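The "weighted average" can be written out explicitly. The formula below is a sketch that assumes the usual least-squares pooling, in which each group's slope is weighted by the sum of squared deviations of its X values; the symbols (g for group, i for observation within a group, b for slope, w for weight) are notation introduced here, not taken from the text above.

```latex
b_{\text{common}}
  = \frac{\sum_{g}\sum_{i}(x_{gi}-\bar{x}_{g})(y_{gi}-\bar{y}_{g})}
         {\sum_{g}\sum_{i}(x_{gi}-\bar{x}_{g})^{2}}
  = \frac{\sum_{g} w_{g}\, b_{g}}{\sum_{g} w_{g}},
\qquad
w_{g} = \sum_{i}(x_{gi}-\bar{x}_{g})^{2}
```

Under this weighting, a group whose X values are spread over a wide range contributes more to the common slope than a group whose X values are bunched together.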
The final test in the ancova is to test the null hypothesis that all of the Y-intercepts of the regression lines with a common slope are the same. Because the lines are parallel, saying that they are significantly different at one point (the Y-intercept) means that the lines are different at any point.
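The steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any particular statistics package: the function names (`group_sums`, `ancova`) and the data are hypothetical, and the F statistics come from comparing residual sums of squares between models (separate lines vs. parallel lines for the slopes, parallel lines vs. a single line for the Y-intercepts). It assumes every group has at least two different X values.

```python
from statistics import fmean


def group_sums(x, y):
    """Return (n, Sxx, Sxy, Syy): sample size and corrected sums of squares/products."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return len(x), sxx, sxy, syy


def ancova(groups):
    """groups maps a label to (x_values, y_values); returns slopes and F statistics."""
    sums = {g: group_sums(x, y) for g, (x, y) in groups.items()}
    k = len(sums)                          # number of regression lines
    n = sum(s[0] for s in sums.values())   # total number of observations

    # Step 1: a separate regression line for each group.
    slopes = {g: s[2] / s[1] for g, s in sums.items()}
    sse_separate = sum(s[3] - s[2] ** 2 / s[1] for s in sums.values())

    # Step 2: parallel lines that share one pooled (common) slope.
    sxx_w = sum(s[1] for s in sums.values())
    sxy_w = sum(s[2] for s in sums.values())
    syy_w = sum(s[3] for s in sums.values())
    common_slope = sxy_w / sxx_w
    sse_parallel = syy_w - sxy_w ** 2 / sxx_w

    # Step 3: a single line through all of the points, ignoring the groups.
    all_x = [a for x, _ in groups.values() for a in x]
    all_y = [b for _, y in groups.values() for b in y]
    _, sxx_t, sxy_t, syy_t = group_sums(all_x, all_y)
    sse_single = syy_t - sxy_t ** 2 / sxx_t

    # F = (extra residual SS explained / its df) / (residual mean square of larger model).
    df_slopes = (k - 1, n - 2 * k)
    df_intercepts = (k - 1, n - k - 1)
    f_slopes = ((sse_parallel - sse_separate) / df_slopes[0]) / (sse_separate / df_slopes[1])
    f_intercepts = ((sse_single - sse_parallel) / df_intercepts[0]) / (sse_parallel / df_intercepts[1])
    return {
        "slopes": slopes,
        "common_slope": common_slope,
        "f_slopes": f_slopes,
        "df_slopes": df_slopes,
        "f_intercepts": f_intercepts,
        "df_intercepts": df_intercepts,
    }


# Hypothetical data, made up for illustration only.
result = ancova({
    "group A": ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]),
    "group B": ([1, 2, 3, 4], [4.0, 6.1, 7.9, 10.2]),
})
print(result)
```

The Y-intercept F statistic is only meaningful when the slope test is not significant. Turning either F statistic into a P value requires an F-distribution function, which the Python standard library does not provide, so that step is left out of the sketch.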
Examples
Figure: Eggs laid vs. female weight in the firefly Photinus ignitus. Filled circles are females that have mated with three males; open circles are females that have mated with one male.
In the firefly species Photinus ignitus, the male transfers a large spermatophore to the female during mating. Rooney and Lewis (2002) wanted to know whether the extra resources from this "nuptial gift" enable the female to produce more offspring. They collected 40 virgin females and mated 20 of them to one male and 20 to three males. They then counted the number of eggs each female laid. Because fecundity varies with the size of the female, they analyzed the data using ancova, with female weight (before mating) as the independent measurement variable and number of eggs laid as the dependent measurement variable. Because the number of males has only two values ("one" or "three"), it is a nominal variable, not a measurement variable.
The slopes of the two regression lines (one for single-mated females and one for triple-mated females) are not significantly different (F=1.1, d.f.=1, 36, P=0.30). The Y-intercepts are significantly different (F=8.8, d.f.=1, 36, P=0.005); females that have mated three times have significantly more offspring than females mated once.
Paleontologists would like to be able to determine the sex of dinosaurs from their fossilized bones. To see whether this is feasible, Prieto-Marquez et al. (2007) measured several characters that are thought to distinguish the sexes in alligators (Alligator mississippiensis), which are among the closest living relatives of dinosaurs. One of the characters was pelvic canal width, which they wanted to standardize using snout-vent length. The raw data are shown in the SAS example below.
The slopes of the regression lines are not significantly different (P=0.9101). The Y-intercepts are significantly different (P=0.0267), indicating that male alligators of a given length have a significantly greater pelvic canal width. However, inspection of the graph shows that there is a lot of overlap between the sexes even after standardizing for snout-vent length, so it would not be possible to reliably determine the sex of a single individual with this character alone.
Figure: Pelvic canal width vs. snout-vent length in the American alligator. Blue circles and line are males; pink X's and line are females.
Graphing the results
Data for an ancova are shown on a scattergraph, with the independent variable on the X-axis and the dependent variable on the Y-axis. A different symbol is used for each value of the nominal variable, as in the firefly graph above, where filled circles are used for the thrice-mated females and open circles are used for the once-mated females. To get this kind of graph in a spreadsheet, you would put all of the X-values in column A, one set of Y-values in column B, the next set of Y-values in column C, and so on.
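The column layout described above can be produced from a list of records with a short script. This is only a sketch: the function name `chart_columns`, the column headers, and the data values are hypothetical, and it assumes that each row holds one observation, with its Y value in the column for its own group and the other Y columns left blank.

```python
import csv
import io


def chart_columns(records, groups):
    """Lay out (group, x, y) records with X first and one Y column per group."""
    header = ["X"] + [f"Y ({g})" for g in groups]
    rows = [header]
    for group, x, y in records:
        row = [x] + ["" for _ in groups]      # blank cell for every group...
        row[1 + groups.index(group)] = y      # ...except the observation's own
        rows.append(row)
    return rows


# Hypothetical observations, made up for illustration only.
records = [
    ("one male", 0.05, 30),
    ("three males", 0.06, 55),
    ("one male", 0.07, 48),
]
rows = chart_columns(records, ["one male", "three males"])

# Write the rows as CSV text that can be pasted or imported into a spreadsheet.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

With the data arranged this way, selecting all of the columns and making a scatter chart should give each group its own symbol.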
Most people plot the individual regression lines for each set of points, as shown in the firefly graph, even if the slopes are not significantly different. This lets people see how similar or different the slopes look. This is easy to do in a spreadsheet; just click on one of the symbols and choose "Add Trendline" from the Chart menu.
Similar tests
One alternative technique that is sometimes possible is to take the ratio of the two measurement variables, then use a one-way anova. For the mussel example I used for testing the homogeneity of means in one-way anova, I standardized the length of the anterior adductor muscle by dividing by the total length. There are technical problems with doing statistics on ratios of two measurement variables (the ratio of two normally distributed variables is not normally distributed), but if you can safely assume that the regression lines all pass through the origin (in this case, that a mussel that was 0 mm long would have an AAM length of 0 mm), this is not an unreasonable thing to do, and it simplifies the statistics. It would be important to graph the association between the variables and analyze it with linear regression to make sure that the relationship is linear and does pass through the origin.
Sometimes the two measurement variables are just the same variable measured at different times or places. For example, if you measured the weights of two groups of individuals, put some on a new weight-loss diet and the others on a control diet, then weighed them again a year later, you could treat the difference between final and initial weights as a single variable, and compare the mean weight loss for the control group to the mean weight loss of the diet group using a one-way anova. The alternative would be to treat final and initial weights as two different variables and analyze using an ancova: you would compare the regression line of final weight vs. initial weight for the control group to the regression line for the diet group. The anova would be simpler, and probably perfectly adequate; the ancova might be better, particularly if you had a wide range of initial weights, because it would allow you to see whether the change in weight depended on the initial weight.
One nonparametric alternative to ancova is to convert the measurement variables to ranks, then do a regular ancova on the ranks; see Conover and Iman (1982) for the details. There are several other versions of nonparametric ancova, but they appear to be less popular, and I don't know the advantages and disadvantages of each.
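The rank conversion itself is simple to sketch. The function below (`average_ranks`, a hypothetical name) replaces each value with its rank and gives tied values the average of the ranks they share. It assumes that each measurement variable is ranked separately across all of the observations, with the groups pooled; check Conover and Iman (1982) before relying on that detail. The data are made up.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value is tied with this one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1   # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


# Hypothetical X and Y measurements, pooled across all groups.
x_values = [18.2, 21.5, 21.5, 25.0]
y_values = [3.1, 2.4, 4.8, 4.0]
x_ranks = average_ranks(x_values)
y_ranks = average_ranks(y_values)
print(x_ranks, y_ranks)
```

The ranks would then take the place of the original measurements in an ordinary ancova, with the nominal variable unchanged.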
How to do the test
Spreadsheet and web pages
Richard Lowry has made web pages that allow you to perform ancova with two, three or four groups, and a downloadable spreadsheet for ancova with more than four groups. You may cut and paste data from a spreadsheet to the web pages. One bug in the web pages is that very small values of P are not represented correctly. If the web page gives you a P value greater than 1, use the FDIST function of Excel along with the F value and degrees of freedom from the web page to calculate the correct P value.
SAS
Here's an illustration of how to do analysis of covariance in SAS, using the data from Prieto-Marquez et al. (2007) on snout-vent length and pelvic canal width in alligators:
data gators;
   input sex $ snoutvent pelvicwidth;
   cards;
male    1.10  7.62
====See the web page for the full data set====
female  1.23  9.23
;
proc glm data=gators;
   class sex;
   model pelvicwidth=snoutvent sex snoutvent*sex;
proc glm data=gators;
   class sex;
   model pelvicwidth=snoutvent sex;
run;
The first time you run PROC GLM, the MODEL statement includes the interaction term (SNOUTVENT*SEX). This tests whether the slopes of the regression lines are significantly different:
Source          DF   Type III SS   Mean Square   F Value   Pr > F
snoutvent        1        33.949        33.949     88.05   <.0001
sex              1         0.079         0.079      0.21   0.6537
snoutvent*sex    1         0.005         0.005      0.01   0.9101   (slope P-value)
If the P-value of the slopes is significant, you'd be done. In this case it isn't, so you look at the output from the second run of PROC GLM. This time, the MODEL statement doesn't include the interaction term, so the model assumes that the slopes of the regression lines are equal. This P-value tells you whether the Y-intercepts are significantly different:
Source          DF   Type III SS   Mean Square   F Value   Pr > F
snoutvent        1        41.388        41.388    110.76   <.0001
sex              1         2.016         2.016      5.39   0.0267   (intercept P-value)
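As a check on how to read this output, each F value is the mean square for that row divided by the error mean square. The error row is not shown above, so the worked arithmetic below recovers the error mean square from the snoutvent row; it is a sketch based on the rounded numbers in the table, and the subscripted symbols are notation introduced here.

```latex
MS_{\text{error}} \approx \frac{MS_{\text{snoutvent}}}{F_{\text{snoutvent}}}
                  = \frac{41.388}{110.76} \approx 0.374,
\qquad
F_{\text{sex}} = \frac{MS_{\text{sex}}}{MS_{\text{error}}}
               \approx \frac{2.016}{0.374} \approx 5.39
```

This agrees with the F value that SAS reports for sex.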
Further reading
Sokal and Rohlf, pp. 499-521.
References
Conover, W.J., and R.L. Iman. 1982. Analysis of covariance using the rank transformation. Biometrics 38: 715-724.
Prieto-Marquez, A., P.M. Gignac, and S. Joshi. 2007. Neontological evaluation of pelvic skeletal attributes purported to reflect sex in extinct non-avian archosaurs. J. Vert. Paleontol. 27: 603-609.
Rooney, J., and S.M. Lewis. 2002. Fitness advantage from nuptial gifts in female fireflies. Ecol. Entom. 27: 373-377.
This page was last revised August 13, 2008. Its address is http://udel.edu/~mcdonald/statancova.html. It may be cited as pp. 211-216 in:
McDonald, J.H. 2008. Handbook of Biological Statistics. Sparky House Publishing, Baltimore, Maryland.
©2008 by John H. McDonald. You can probably do what you want with this content; see the permissions page for details.



