One of the first steps in performing a one-way anova is deciding whether to do a model I or model II anova. The test of homogeneity of means is the same for both models, but the choice of model determines what you do if the means are significantly heterogeneous.
In a model I anova (also known as a fixed-effects model anova), the groups are identified by some characteristic that is repeatable and interesting. If there is a difference among the group means and you repeat the experiment, you would expect to see the same pattern of differences among the means, because you could classify the observations into the same groups. The group labels are meaningful (such as "seawater, glucose solution, mannose solution"). You are interested in the relationship between the way the data are grouped (the "treatments") and the group means. Examples of data for which a model I anova would be appropriate are:
- Time of death of amphipod crustaceans being suffocated in plain seawater, a glucose solution, or a mannose solution. The three different solutions are the treatments, and the question is whether amphipods die more quickly in one solution than another. If you find that they die the fastest in the mannose solution, you would expect them to die the fastest in mannose if you repeated the experiment.
- Amounts of a particular transcript in tissue samples from arm muscle, heart muscle, brain, liver and lung, with multiple samples from each tissue. The tissue type is the treatment, and the question you are interested in is which tissue has the highest amount of transcript. Note that "treatment" is used in a rather broad sense. You didn't "treat" a bunch of cells and turn them into brain cells; you just sampled some brain cells.
- The tastiness of peaches from 10 different peach trees, if you want to use cuttings from the tree or trees with the tastiest peaches to start an orchard.
If you have significant heterogeneity among the means in a model I anova, the next step (if there are more than two groups) is usually to try to determine which means are significantly different from other means. In the amphipod example, if there were significant heterogeneity in time of death among the treatments, the next question would be "Is that because mannose kills amphipods, while glucose has similar effects to plain seawater? Or does either sugar kill amphipods, compared with plain seawater? Or is it glucose that is deadly?" To answer questions like these, you will do either planned comparisons of means (if you decided, before looking at the data, on a limited number of comparisons) or unplanned comparisons of means (if you just looked at the data and picked out interesting comparisons to do).
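As a rough illustration of one comparison of means, the sketch below computes a t statistic for mannose versus plain seawater, using the within-group mean square pooled across all three treatments as the error term. The survival times, the helper names (`within_mean_square`, `comparison_t`) and the choice of this particular statistic are assumptions made for the example, not values or methods taken from a real experiment. Turning the statistic into a P value needs a t distribution from a statistics library, which is left out here.

```python
from math import sqrt
from statistics import mean

# Hypothetical times of death (minutes); invented for illustration only.
groups = {
    "seawater": [50.0, 54.0, 52.0, 56.0],
    "glucose": [49.0, 53.0, 55.0, 51.0],
    "mannose": [40.0, 44.0, 42.0, 46.0],
}


def within_mean_square(groups):
    """Return the pooled within-group mean square and its degrees of freedom."""
    ss_within = sum(
        sum((x - mean(values)) ** 2 for x in values)
        for values in groups.values()
    )
    df_within = sum(len(values) for values in groups.values()) - len(groups)
    return ss_within / df_within, df_within


def comparison_t(groups, a, b):
    """t statistic for mean(a) - mean(b), with the pooled error term."""
    ms_within, df_within = within_mean_square(groups)
    n_a, n_b = len(groups[a]), len(groups[b])
    difference = mean(groups[a]) - mean(groups[b])
    standard_error = sqrt(ms_within * (1 / n_a + 1 / n_b))
    return difference / standard_error, df_within


t_stat, df = comparison_t(groups, "mannose", "seawater")
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

A comparison like this only makes sense because the labels "mannose" and "seawater" mean something; that is what makes it a model I question.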
In a model II anova (also known as a random-effects model anova), the groups are identified by some characteristic that is not interesting; they are just groups chosen from a larger number of possible groups. If there is heterogeneity among the group means and you repeat the experiment, you would expect to see heterogeneity again, but you would not expect to see the same pattern of differences. The group labels are generally arbitrary (such as "family A, family B, family C"). You are interested in the amount of variation among the means, compared with the amount of variation within groups. Examples of data for which a model II anova would be appropriate are:
- Repeated measurements of glycogen levels in each of several pieces of a rat gastrocnemius muscle. If variance among pieces is a relatively small proportion of the total variance, it would suggest that a single piece of muscle would give an adequate measure of glycogen level. If variance among pieces is relatively high, it would suggest that either the sample preparation method needs to be better standardized, or there is heterogeneity in glycogen level among different parts of the muscle.
- Sizes of treehoppers from different sibships, all raised on a single host plant. If the variance among sibships is high relative to the variance within sibships (some families have large treehoppers, some families have small treehoppers), it would indicate that heredity (or maternal effects) plays a large role in determining size.
- The tastiness of peaches from 10 different peach trees, if you want to estimate how much of the variation in peach tastiness is due to the tree, and how much is due to variation among peaches within each tree. If most of the variation in tastiness is among peaches within each tree, then using cuttings from the best tree won't do much to improve the tastiness of the peaches.
If you have significant heterogeneity among the means in a model II anova, the next step is to partition the variance into the proportion due to the treatment effects and the proportion within treatments.
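A minimal sketch of that partitioning, assuming a balanced design (the same number of observations in every group) and the usual estimate in which the among-group variance component is (MS among − MS within) / n. The peach scores and the function name `variance_components` are hypothetical and exist only for this example.

```python
from statistics import mean

# Hypothetical tastiness scores, 4 peaches per tree; invented for illustration.
trees = {
    "A": [6.0, 8.0, 7.0, 7.0],
    "B": [4.0, 5.0, 6.0, 5.0],
    "C": [8.0, 9.0, 9.0, 10.0],
}


def variance_components(groups):
    """Return (among-group variance, within-group variance, proportion among)."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal sample sizes")
    n = sizes.pop()   # observations per group
    k = len(groups)   # number of groups

    grand_mean = mean(x for values in groups.values() for x in values)
    ss_among = sum(n * (mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum(
        sum((x - mean(v)) ** 2 for x in v) for v in groups.values()
    )

    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    # A negative estimate is conventionally reported as zero.
    var_among = max((ms_among - ms_within) / n, 0.0)
    proportion_among = var_among / (var_among + ms_within)
    return var_among, ms_within, proportion_among


var_among, var_within, proportion = variance_components(trees)
print(f"among trees: {var_among:.3f}, within trees: {var_within:.3f}")
print(f"proportion of variance among trees: {proportion:.1%}")
```

With unequal sample sizes the divisor n has to be replaced by an adjusted group size, which this sketch does not attempt.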
How to tell the difference
If you are going to follow up a significant result with planned or unplanned comparisons of means, it's model I; if you are going to follow up by partitioning the variance, it's model II. Sometimes it's not obvious which model to use; I've seen many examples of researchers partitioning the variance after a model I anova, or doing comparisons of means after a model II anova, just because their software outputs both sets of numbers.
I find it helpful to imagine that you've written all the observations for the measurement variable on cards, with one card for each group. At the top of each card you've written the name of the group. For example, imagine you've written the tastiness measurements for 10 peaches from tree A on one card, 10 peaches from tree B on a second card, etc. Then imagine that your scientific arch-enemy sneaks into your lab and erases the tree identification letter (A, B, C, ...) from each card. Now if one of the trees has significantly better tastiness measures than the other trees, you don't know which tree it was. If your experiment is completely ruined, and you have to wait a year until you can go back to the same peach trees and get new tastiness measures, and therefore your scientific arch-enemy is cackling with glee, that's a model I anova. But if your experiment isn't ruined, because your goal was to see how much tree-to-tree variation there was and you can just write new arbitrary tree names on each card and still answer the question, and your arch-enemy is going "Curses! Foiled again!", that's a model II anova.
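The card thought experiment can be sketched in code. In this hypothetical example (the scores, the function name `f_statistic` and the replacement labels are all invented), erasing the tree letters and writing new arbitrary names leaves the anova F statistic exactly as it was, because it depends only on how the numbers are grouped. The only thing lost is the link between the best card and a real tree, which matters only for a model I question.

```python
from statistics import mean

# Hypothetical tastiness scores, one "card" per tree; invented for illustration.
cards = {
    "A": [6.0, 8.0, 7.0, 7.0],
    "B": [4.0, 5.0, 6.0, 5.0],
    "C": [8.0, 9.0, 9.0, 10.0],
}


def f_statistic(groups):
    """One-way anova F: among-group mean square over within-group mean square."""
    values = list(groups.values())
    k = len(values)
    total_n = sum(len(v) for v in values)
    grand_mean = mean(x for v in values for x in v)
    ss_among = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in values)
    ss_within = sum(sum((x - mean(v)) ** 2 for x in v) for v in values)
    return (ss_among / (k - 1)) / (ss_within / (total_n - k))


# The arch-enemy erases the letters; you write new arbitrary names on the cards.
relabelled = {
    f"card {i}": scores for i, scores in enumerate(cards.values(), start=1)
}

f_original = f_statistic(cards)
f_relabelled = f_statistic(relabelled)

# Model I question: which tree is best? Answerable only with the real labels.
best_original = max(cards, key=lambda name: mean(cards[name]))
best_relabelled = max(relabelled, key=lambda name: mean(relabelled[name]))

print(f"F before: {f_original:.2f}, F after: {f_relabelled:.2f}")
print(f"best with labels: tree {best_original}; best without: {best_relabelled}")
```

The arbitrary name on the best card no longer says which tree to take cuttings from, but anything computed from the among-group and within-group variation is untouched.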
Sokal and Rohlf, pp. 201-205.
Zar, pp. 184-185.
This page was last revised August 31, 2009. Its address is http://udel.edu/~mcdonald/statanovamodel.html. It may be cited as pp. 127-129 in: McDonald, J.H. 2009. Handbook of Biological Statistics (2nd ed.). Sparky House Publishing, Baltimore, Maryland.
©2009 by John H. McDonald. You can probably do what you want with this content; see the permissions page for details.