# Probability

The basic idea of a statistical test is to identify a null hypothesis, collect some data, then estimate the probability of getting the observed data if the null hypothesis were true. If the probability of getting a result like the observed one is low under the null hypothesis, you conclude that the null hypothesis is probably not true. It is therefore useful to know a little about probability.

One way to think about probability is as the proportion of individuals in a population that have a particular characteristic. (In this case, both "individual" and "population" have somewhat different meanings than they do in biology.) The probability of sampling a particular kind of individual is equal to the proportion of that kind of individual in the population. For example, in fall 2003 there were 21,121 students at the University of Delaware, and 16,428 of them were undergraduates. If a single student were sampled at random, the probability that they would be an undergrad would be 16,428 / 21,121, or 0.778. In other words, 77.8% of students were undergrads, so if you'd picked one student at random, the probability that they were an undergrad would have been 77.8%.
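The proportion calculation above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and the counts are the ones quoted in the text.

```python
# Enrollment counts quoted in the text (University of Delaware, fall 2003).
total_students = 21_121
undergraduates = 16_428

# Probability that one student sampled at random is an undergraduate:
# the proportion of undergraduates in the whole student population.
p_undergrad = undergraduates / total_students

print(f"P(undergrad) = {p_undergrad:.3f}")  # P(undergrad) = 0.778
```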

When dealing with probabilities in biology, you are often working with theoretical expectations, not population samples. For example, in a genetic cross of two individual Drosophila melanogaster that are heterozygous at the white locus, Mendel's theory predicts that the probability of an offspring individual being a recessive homozygote (having white eyes) is one-fourth, or 0.25. This is equivalent to saying that one-fourth of a population of offspring will have white eyes.
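One way to see where the one-fourth comes from is to list the equally likely allele combinations from the two parents. The sketch below assumes, as the text's prediction implies, a simple two-allele locus where each heterozygous parent passes on either allele with equal probability; the allele labels `"W"` and `"w"` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Each heterozygous parent contributes one of two alleles with equal
# probability. "W" (dominant) and "w" (recessive) are illustrative labels.
parent_alleles = ["W", "w"]

# All equally likely (allele from parent 1, allele from parent 2) pairs.
offspring_genotypes = list(product(parent_alleles, repeat=2))

# Only the recessive homozygote ("w", "w") has white eyes.
white_eyed = [g for g in offspring_genotypes if g == ("w", "w")]

p_white_eyes = Fraction(len(white_eyed), len(offspring_genotypes))
print(p_white_eyes)  # 1/4
```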

### Multiplying probabilities

Figure: Drosophila melanogaster with an allele at the Antennapedia locus that causes it to have legs where its antennae should be.

You could take a semester-long course on mathematical probability, but most biologists just need a few basic principles. The probability that an individual has a particular value of one nominal variable AND a particular value of a second nominal variable is estimated by multiplying the probabilities of the two values together. For example, if the probability that a Drosophila in a cross has white eyes is one-fourth, and the probability that it has legs where its antennae should be is three-fourths, the probability that it has white eyes AND leg-antennae is one-fourth times three-fourths, or 0.25 × 0.75 = 0.1875. This estimate assumes that the two values are independent, meaning that the probability of one value is not affected by the other value. In this case, independence would require that the two genetic loci were on different chromosomes, among other things.
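The multiplication rule for the fly example can be written out as a short sketch. It uses Python's `fractions.Fraction` so the arithmetic stays exact, and it assumes independence of the two traits, as the text does; the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities given in the text for the two traits.
p_white_eyes = Fraction(1, 4)
p_leg_antennae = Fraction(3, 4)

# Multiplication rule: valid only if the two traits are independent.
p_white_and_leg = p_white_eyes * p_leg_antennae

print(p_white_and_leg, float(p_white_and_leg))  # 3/16 0.1875
```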

The probability that an individual has one value OR another, MUTUALLY EXCLUSIVE, value is found by adding the probabilities of each value together. "Mutually exclusive" means that one individual could not have both values. For example, if the probability that a flower in a genetic cross is red is one-fourth, the probability that it is pink is one-half, and the probability that it is white is one-fourth, then the probability that it is red OR pink is one-fourth plus one-half, or three-fourths.
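The addition rule for the flower example looks like this in Python. The dictionary name and keys are illustrative; the probabilities are the ones given in the text. Because the three colors are mutually exclusive and cover every outcome, their probabilities should sum to 1, which the sketch checks before adding.

```python
from fractions import Fraction

# Probabilities of each flower color from the text; a flower has exactly
# one color, so the outcomes are mutually exclusive.
flower_color = {
    "red": Fraction(1, 4),
    "pink": Fraction(1, 2),
    "white": Fraction(1, 4),
}

# Sanity check: mutually exclusive outcomes covering all cases sum to 1.
assert sum(flower_color.values()) == 1

# Addition rule: P(red OR pink) = P(red) + P(pink).
p_red_or_pink = flower_color["red"] + flower_color["pink"]

print(p_red_or_pink)  # 3/4
```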

### More complicated situations

When calculating the probability that an individual has one value OR another, and the two values are NOT MUTUALLY EXCLUSIVE, it is important to break things down into combinations that are mutually exclusive. For example, let's say you wanted to estimate the probability that a fly from the cross above had white eyes OR leg-antennae. You could calculate the probability for each of the four kinds of flies: red eyes/normal antennae (0.75 × 0.25 = 0.1875), red eyes/leg-antennae (0.75 × 0.75 = 0.5625), white eyes/normal antennae (0.25 × 0.25 = 0.0625), and white eyes/leg-antennae (0.25 × 0.75 = 0.1875). Then, since the last three kinds of flies are the ones with white eyes or leg-antennae, you'd add those probabilities up (0.5625 + 0.0625 + 0.1875 = 0.8125).
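The breakdown into four mutually exclusive kinds of flies can be sketched as follows. The names are illustrative, and independence of the two traits is assumed, as in the text. As a cross-check that is not in the original, the sketch also computes the answer as 1 minus the probability of the one kind of fly that has neither trait.

```python
from fractions import Fraction
from itertools import product

# Trait probabilities from the text (independence assumed).
eye = {"red": Fraction(3, 4), "white": Fraction(1, 4)}
antenna = {"normal": Fraction(1, 4), "leg": Fraction(3, 4)}

# Probability of each of the four mutually exclusive combinations.
combos = {(e, a): eye[e] * antenna[a] for e, a in product(eye, antenna)}

# Add up every combination with white eyes OR leg-antennae (or both).
p_white_or_leg = sum(
    p for (e, a), p in combos.items() if e == "white" or a == "leg"
)

# Cross-check: everything except red eyes/normal antennae qualifies.
p_via_complement = 1 - combos[("red", "normal")]

print(float(p_white_or_leg))  # 0.8125
```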

### When to calculate probabilities

While some kind of probability calculation underlies every statistical test, you will rarely have to apply the rules listed above yourself. About the only time you'll actually calculate probabilities by adding and multiplying is when figuring out the expected values for a goodness-of-fit test.
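As a rough illustration of that last use, expected counts for a goodness-of-fit test are the expected proportions multiplied by the sample size. The sketch below reuses the 1:2:1 flower proportions from the earlier example; the sample size of 80 offspring is a made-up number for illustration, not data from the text.

```python
from fractions import Fraction

# Expected proportions under the null hypothesis (1:2:1 flower colors).
expected_proportions = {
    "red": Fraction(1, 4),
    "pink": Fraction(1, 2),
    "white": Fraction(1, 4),
}

# Hypothetical sample size, chosen only for illustration.
n_offspring = 80

# Expected count for each color = expected proportion * sample size.
expected_counts = {
    color: float(p * n_offspring)
    for color, p in expected_proportions.items()
}

print(expected_counts)  # {'red': 20.0, 'pink': 40.0, 'white': 20.0}
```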

### Further reading

Sokal and Rohlf, pp. 62-71.

Zar, pp. 48-63.

### References

Picture of Antennapedia from Mutations homéotiques de la drosophile