# Randomization test of independence

### When to use it

The randomization test of independence is used when you have two nominal variables. A data set like this is often called an "R×C table," where R is the number of rows and C is the number of columns. The randomization test is more accurate than the chi-squared test or G-test of independence when the expected numbers are small. See the web page on small sample sizes for further discussion.

Fisher's exact test would be just as good as a randomization test, but there may be situations where the computer program you're using can't handle the calculations required for Fisher's test.

### Null hypothesis

The null hypothesis is that the relative proportions of one variable are independent of the second variable. For example, if you counted the number of male and female mice in two barns, the null hypothesis would be that the proportion of male mice is the same in the two barns.

### How it works

Fisher's exact test works by calculating the probabilities of all possible combinations of numbers in an R×C table, then adding the probabilities of those combinations that are as extreme or more extreme than the observed data. As R and C get larger, and as the total sample size gets larger, the number of possible combinations increases dramatically, to the point where a computer may have a hard time doing all the calculations in a reasonable period of time.

The randomization test works by generating random combinations of numbers in the R×C table, with the probability of generating a particular combination equal to its probability under the null hypothesis. For each combination, Pearson's chi-squared statistic is calculated. The P-value is the proportion of these random combinations whose chi-squared statistic is equal to or greater than that of the observed data.
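As a rough illustration of the procedure just described, here is a minimal Python sketch. It is not the algorithm that SAS or any other package uses; it assumes one simple way of generating random tables, which is to list every individual observation and shuffle which column each one belongs to. That keeps the row and column totals fixed, so each table is drawn with its probability under the null hypothesis given those totals. The function names `chi_square` and `randomization_test` are made up for this example, and the counts are the heron and egret data from the Examples section.

```python
import random

def chi_square(table):
    """Pearson's chi-squared statistic for a table given as a list of rows.
    Assumes no row or column total is zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

def randomization_test(table, replicates=10_000, seed=None):
    """Estimate the P-value by shuffling column labels among observations."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # One row label and one column label per individual observation.
    row_labels = [i for i, row in enumerate(table) for count in row for _ in range(count)]
    col_labels = [j for row in table for j, count in enumerate(row) for _ in range(count)]
    observed_stat = chi_square(table)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(col_labels)  # row and column totals stay fixed
        shuffled = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            shuffled[i][j] += 1
        # Small tolerance so ties are not lost to floating-point rounding.
        if chi_square(shuffled) >= observed_stat - 1e-9:
            hits += 1
    return hits / replicates

# Rows: vegetation, shoreline, water, structures. Columns: heron, egret.
herons = [[15, 8], [20, 5], [14, 7], [6, 1]]
p_value = randomization_test(herons, replicates=10_000, seed=1)
print(f"chi-squared = {chi_square(herons):.3f}, estimated P = {p_value:.3f}")
```

The estimate should land near the P=0.54 reported for these data, moving around a little if you change the seed or the number of replicates.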

Because it is taking a random sample of all possible combinations, the randomization test will give slightly different estimates of the P-value every time you run it. The more replicates you run, the more accurate your estimate of the P-value will be. You might want to start with a small number of replicates, such as 1,000, to be sure everything is working properly, then change the number of replicates to 100,000 or even 1,000,000 for your final result.
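To get a feel for how the number of replicates affects accuracy, you can treat each replicate as an independent yes/no trial, so the estimated P-value has a binomial standard error of sqrt(P(1-P)/n). The sketch below uses that normal approximation, which is an assumption of this example and not a description of what any particular program does; `monte_carlo_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def monte_carlo_interval(p_hat, replicates, confidence=0.99):
    """Approximate confidence interval for a P-value estimated by randomization."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / replicates)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

# Same estimated P-value, increasing numbers of replicates.
for n in (1_000, 100_000, 1_000_000):
    low, high = monte_carlo_interval(0.54, n)
    print(f"{n:>9,} replicates: 99% interval {low:.4f} to {high:.4f}")
```

Because the standard error shrinks with the square root of the number of replicates, going from 1,000 to 100,000 replicates narrows the interval about tenfold.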

### Examples

Custer and Galli (2002) flew a light plane to follow great blue herons (Ardea herodias) and great egrets (Casmerodius albus) from their resting site to their first feeding site at Peltier Lake, Minnesota, and recorded the type of substrate each bird landed on.

```
              Heron   Egret
Vegetation    15      8
Shoreline     20      5
Water         14      7
Structures     6      1
```

A randomization test with 100,000 replicates yields P=0.54, so there is no evidence that the two species of birds use the substrates in different proportions.
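If you want to check how small the expected numbers are for this table, the following sketch computes them in the usual way, as row total times column total divided by the grand total. The helper name `expected_counts` is made up for this example.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

substrates = ["Vegetation", "Shoreline", "Water", "Structures"]
herons = [[15, 8], [20, 5], [14, 7], [6, 1]]  # columns: heron, egret

expected = expected_counts(herons)
print(f"{'':<12}{'Heron':>8}{'Egret':>8}")
for name, row in zip(substrates, expected):
    print(f"{name:<12}{row[0]:8.2f}{row[1]:8.2f}")

smallest = min(min(row) for row in expected)
print(f"Smallest expected count: {smallest:.2f}")
```

The smallest expected count, for egrets on structures, comes out a little under 2.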

 Slippery dick, Halichoeres bivittatus, a common prey item of moray eels.

Young and Winn (2003) counted prey items in the stomachs of the spotted moray eel, Gymnothorax moringa, and the purplemouth moray eel, G. vicinus. They identified each eel they saw, and classified the locations of the sightings into three types: those in grass beds, those in sand and rubble, and those within one meter of the border between grass and sand/rubble. The numbers of prey items are shown in the table:

```
                       G. moringa     G. vicinus
Slippery dick              10             6
Unidentified wrasses        3             7
Moray eels                  1             1
Squirrelfish                1             1
Unidentified fish           6             3
Oval urn crab              31            10
Emerald crab                3             2
Portunus crab spp.          1             0
Arrow crab                  1             0
Unidentified crabs         15             1
Spiny lobster               0             1
Octopus                     3             2
Unidentified                4             1

```

The nominal variables are the species of eel (G. moringa or G. vicinus) and the prey type. The difference in stomach contents between the species is not significant (randomization test with 100,000 replicates, P=0.11).

There are a lot of small numbers in this data set. If you pool the data into fish (the first five species), crustaceans (crabs and lobster), and octopus+unidentified, the P-value from 100,000 randomizations is 0.029; G. moringa eat a higher proportion of crustaceans than G. vicinus. Of course, it would be best to decide to pool the data this way before collecting the data. If you decided to pool the numbers after seeing them, you'd have to make it clear that you did that, writing something like "After seeing that many of the numbers were very small when divided into individual species, we also analyzed the data after pooling into fish, crustaceans, and other/unidentified."
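The pooling described here is just a matter of adding up rows. As a small sketch, assuming the grouping given in the paragraph above (the dictionary layout and the name `pool` are my own choices for illustration):

```python
# Prey counts as (G. moringa, G. vicinus), copied from the table above.
prey = {
    "Slippery dick": (10, 6),
    "Unidentified wrasses": (3, 7),
    "Moray eels": (1, 1),
    "Squirrelfish": (1, 1),
    "Unidentified fish": (6, 3),
    "Oval urn crab": (31, 10),
    "Emerald crab": (3, 2),
    "Portunus crab spp.": (1, 0),
    "Arrow crab": (1, 0),
    "Unidentified crabs": (15, 1),
    "Spiny lobster": (0, 1),
    "Octopus": (3, 2),
    "Unidentified": (4, 1),
}

groups = {
    "Fish": ["Slippery dick", "Unidentified wrasses", "Moray eels",
             "Squirrelfish", "Unidentified fish"],
    "Crustaceans": ["Oval urn crab", "Emerald crab", "Portunus crab spp.",
                    "Arrow crab", "Unidentified crabs", "Spiny lobster"],
    "Octopus+unidentified": ["Octopus", "Unidentified"],
}

def pool(counts, groups):
    """Sum the counts of every prey type in each group, column by column."""
    return {group: [sum(counts[name][k] for name in members) for k in range(2)]
            for group, members in groups.items()}

pooled = pool(prey, groups)
totals = [sum(row[k] for row in pooled.values()) for k in range(2)]
for group, row in pooled.items():
    print(f"{group:<22}{row[0]:>5}{row[1]:>5}")

# Proportion of each species' prey items that were crustaceans.
crustacean_share = [pooled["Crustaceans"][k] / totals[k] for k in range(2)]
print(f"Crustaceans: {crustacean_share[0]:.0%} of G. moringa prey, "
      f"{crustacean_share[1]:.0%} of G. vicinus prey")
```

The pooled 3×2 table can then be analyzed with whatever randomization test you are using.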

### Graphing the results

You plot the results of a randomization test the same way you would for any other test of independence.

### Similar tests

The chi-squared test of independence or the G-test of independence may be used on the same kind of data as a randomization test of independence. When some of the expected values are small, Fisher's exact test or the randomization test is more accurate than the chi-squared or G-test of independence. If all of the expected values are very large, Fisher's exact test and the randomization test become computationally impractical; fortunately, the chi-squared or G-test will then give an accurate result. See the web page on small sample sizes for further discussion.

If the number of rows, number of columns, or total sample size becomes too large, the program you're using may not be able to perform the calculations for Fisher's exact test in a reasonable length of time, or it may fail entirely. I'd try Fisher's test first, then do the randomization test if Fisher's doesn't work.

### How to do the test

I haven't written a spreadsheet for this test.

#### Web pages

I don't know of a web page that will perform this test.

#### SAS

Here is a SAS program that uses PROC FREQ to do the randomization test of independence. The example uses the data on heron and egret substrate use from above. In the statement exact chisq / mc n=100000, "mc" tells SAS to do randomization (also known as Monte Carlo simulation), and "n=100000" tells it how many replicates to run.

```
data birds;
input bird $ substrate $ count;
cards;
heron vegetation 15
heron shoreline  20
heron water      14
heron structures  6
egret vegetation  8
egret shoreline   5
egret water       7
egret structures  1
;
proc freq data=birds;
weight count;
tables bird*substrate / chisq;
exact chisq / mc n=100000;
run;

```

The results of the randomization test are labelled "Pr >= ChiSq"; in this case, P=0.5392.

```
Monte Carlo Estimate for the Exact Test

Pr >= ChiSq                 0.5392
99% Lower Conf Limit        0.5351
99% Upper Conf Limit        0.5432

Number of Samples           100000
Initial Seed             952082114

```

### Power analysis

Unless your numbers are very small, the power analysis described for the chi-squared test of independence should work well enough.

### References

Picture of slippery dick from Dan Greenspan's blog.

Custer, C.M., and J. Galli. 2002. Feeding habitat selection by great blue herons and great egrets nesting in east central Minnesota. Waterbirds 25: 115-124.

Young, R.F., and H.E. Winn. 2003. Activity patterns, diet, and shelter site use for two species of moray eels, Gymnothorax moringa and Gymnothorax vicinus, in Belize. Copeia 2003: 44-55.