Mantel-Haenszel test of homogeneity of repeated tests of independence

It is often useful to analyze replicated 2x2 tables. For example, you might compare the frequency of smoking vs. non-smoking in teenage boys vs. girls in several different cities. It would be simple to add up the numbers across cities and test the resulting 2x2 table, but before you do that, it is important to make sure the individual frequency tables are homogeneous. If the individual frequency tables are heterogeneous (show significantly different patterns), you should not combine them.

For example, imagine you've taken surveys in Baltimore and Philadelphia. If more boys than girls smoke in Baltimore, while more girls than boys smoke in Philadelphia, it would be misleading to combine the numbers and infer that the rates of smoking are not significantly different between the two sexes. Instead, it would be better to interpret this imaginary example as indicating that boys and girls do smoke in different proportions, but the direction of difference depends on the city.
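To make the danger of pooling concrete, here is a minimal Python sketch of the imaginary two-city example. The counts are invented purely for illustration (they are not survey data) and were chosen so that the two cities show opposite patterns; the helper name odds_ratio is likewise only an illustrative choice.

```python
# Hypothetical counts, invented for illustration only (not real survey data).
# Each table is ((boys_smoking, boys_not), (girls_smoking, girls_not)).
baltimore = ((30, 70), (15, 85))
philadelphia = ((15, 85), (30, 70))


def odds_ratio(table):
    """Odds of smoking in boys divided by odds of smoking in girls."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Pool the two cities by adding the corresponding cells.
pooled = tuple(
    tuple(x + y for x, y in zip(row_b, row_p))
    for row_b, row_p in zip(baltimore, philadelphia)
)

print("Baltimore odds ratio:   ", round(odds_ratio(baltimore), 2))
print("Philadelphia odds ratio:", round(odds_ratio(philadelphia), 2))
print("Pooled odds ratio:      ", round(odds_ratio(pooled), 2))
```

With these made-up numbers the pooled odds ratio is exactly 1, even though each city on its own shows a clear difference between the sexes. This is the situation a test of homogeneity is meant to detect before the tables are combined.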

Example

McDonald and Siebenaller (1989) surveyed allele frequencies at the Lap locus in the mussel Mytilus trossulus on the Oregon coast. At four estuaries, samples were taken from inside the estuary and from a marine habitat outside the estuary. Alleles other than Lap94 were pooled into a "non-94" class.

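For each 2x2 table, the adjusted log-odds ratio and its weight are calculated, with 0.5 added to each frequency as a continuity correction. The weight is the reciprocal of the sum of the reciprocals of the four adjusted frequencies, that is, the reciprocal of the estimated variance of the log-odds ratio. The weighted mean of the log-odds ratios is then calculated, and the weighted squared deviation of each log-odds ratio from that mean is calculated and summed. Under the null hypothesis of homogeneity, this weighted sum of squared deviations is approximately chi-square distributed, with degrees of freedom equal to the number of 2x2 tables minus one.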

| | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| 1 | | allele 94 | non-94 | log-odds ratio | weight | log-odds ratio × weight | weighted squared deviation |
| 2 | Tillamook marine | 56 | 40 | `=ln(((b2+0.5)*(c3+0.5))/((b3+0.5)*(c2+0.5)))` | `=1/(1/(b2+0.5)+1/(b3+0.5)+1/(c2+0.5)+1/(c3+0.5))` | `=d2*e2` | `=e2*(d2-d$14)^2` |
| 3 | Tillamook estuarine | 69 | 77 | | | | |
| 4 | | | | | | | |
| 5 | Yaquina marine | 61 | 57 | `=ln(((b5+0.5)*(c6+0.5))/((b6+0.5)*(c5+0.5)))` | `=1/(1/(b5+0.5)+1/(b6+0.5)+1/(c5+0.5)+1/(c6+0.5))` | `=d5*e5` | `=e5*(d5-d$14)^2` |
| 6 | Yaquina estuarine | 257 | 301 | | | | |
| 7 | | | | | | | |
| 8 | Alsea marine | 73 | 71 | `=ln(((b8+0.5)*(c9+0.5))/((b9+0.5)*(c8+0.5)))` | `=1/(1/(b8+0.5)+1/(b9+0.5)+1/(c8+0.5)+1/(c9+0.5))` | `=d8*e8` | `=e8*(d8-d$14)^2` |
| 9 | Alsea estuarine | 65 | 79 | | | | |
| 10 | | | | | | | |
| 11 | Umpqua marine | 71 | 55 | `=ln(((b11+0.5)*(c12+0.5))/((b12+0.5)*(c11+0.5)))` | `=1/(1/(b11+0.5)+1/(b12+0.5)+1/(c11+0.5)+1/(c12+0.5))` | `=d11*e11` | `=e11*(d11-d$14)^2` |
| 12 | Umpqua estuarine | 48 | 48 | | | | |
| 13 | | | | | | | |
| 14 | | | weighted mean | `=sum(f2:f11)/sum(e2:e11)` | | weighted sum of squares | `=sum(g2:g11)` |
| 15 | | | | | | P | `=chidist(g14,3)` |
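The same calculation can be sketched outside a spreadsheet. The following Python uses only the standard library and the counts from the table above. The function names (log_odds_and_weight, chi2_sf, homogeneity_test) are illustrative choices, and chi2_sf is a small closed-form helper for integer degrees of freedom written for this sketch, not a library routine.

```python
import math

# Counts from the spreadsheet:
# (marine 94, marine non-94, estuarine 94, estuarine non-94)
tables = {
    "Tillamook": (56, 40, 69, 77),
    "Yaquina": (61, 57, 257, 301),
    "Alsea": (73, 71, 65, 79),
    "Umpqua": (71, 55, 48, 48),
}


def log_odds_and_weight(a, b, c, d):
    """Adjusted log-odds ratio and weight for one 2x2 table (0.5 added to each cell)."""
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return math.log((a * d) / (c * b)), 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed-form series)."""
    half = x / 2.0
    if df % 2 == 0:
        term = total = math.exp(-half)
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


def homogeneity_test(tables):
    """Weighted sum of squared deviations of the log-odds ratios, df, and P value."""
    stats = [log_odds_and_weight(*counts) for counts in tables.values()]
    total_weight = sum(w for _, w in stats)
    weighted_mean = sum(lor * w for lor, w in stats) / total_weight
    chi2 = sum(w * (lor - weighted_mean) ** 2 for lor, w in stats)
    df = len(stats) - 1
    return chi2, df, chi2_sf(chi2, df)


chi2, df, p = homogeneity_test(tables)
print(f"weighted sum of squares = {chi2:.2f}, df = {df}, P = {p:.2f}")
```

Run on these counts, the sketch should give a weighted sum of squares of about 0.52 and a P value of about 0.91 with three degrees of freedom.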

If the P value were significant, it would indicate that the repeated tables were heterogeneous: different estuaries would have significantly different patterns of variation in allele frequency between the estuarine and marine habitats. In this example, the weighted sum of squares is 0.52; with three degrees of freedom, the P value is 0.91. There is no significant evidence against the null hypothesis of homogeneity, so it is acceptable to test the overall pattern of difference between estuarine and marine allele frequencies. This can be done either using the Mantel-Haenszel test statistic (see p. 766 of Sokal and Rohlf 1995 for the equation) or by adding the numbers in the four tables to get a single 2x2 table and testing it using a G-test of independence.
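As an illustration of the second option, the following Python sketch pools the four tables into a single 2x2 table and computes a G-test of independence by hand. It assumes that no continuity correction is applied to the G statistic, which the text does not specify; the function name g_test_2x2 is an illustrative choice.

```python
import math

# (marine 94, marine non-94, estuarine 94, estuarine non-94) for each estuary
tables = [
    (56, 40, 69, 77),    # Tillamook
    (61, 57, 257, 301),  # Yaquina
    (73, 71, 65, 79),    # Alsea
    (71, 55, 48, 48),    # Umpqua
]

# Add the corresponding cells across estuaries to get a single 2x2 table.
pooled = [sum(cell) for cell in zip(*tables)]


def g_test_2x2(a, b, c, d):
    """G statistic and P value (1 df) for a 2x2 table, with no continuity correction."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected counts under independence: row total * column total / grand total.
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(g / 2)).
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


g, p = g_test_2x2(*pooled)
print("pooled table:", pooled)
print(f"G = {g:.2f}, P = {p:.4f}")
```

Pooling is only appropriate here because the homogeneity test gave no evidence that the four estuaries differ in their pattern.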

Similar tests

A similar analysis of replicated tables could be done using log-linear models; see pp. 746-755 of Sokal and Rohlf 1995. The Mantel-Haenszel approach is easier to perform and interpret, but it has the disadvantage that it is limited to repeated 2x2 tables.

Textbook reference

Sokal and Rohlf 1995, pp. 764-766.