You must type this and all other homework assignments. Do not e-mail the assignment to me; turn it in early (at 322 Wolf) for a foreseeable absence, or turn it in late after an unexpected absence from class.
1. Get five coins and put them in a container with a lid, so that you can flip all five coins at once by shaking the container and then letting the coins fall to the bottom. If you flip the five coins, and the probability of one coin being "tails" is 0.50, what is the probability of getting "tails" on all five coins?
2. Now you are going to do an experiment to see whether thinking about tails makes tails more likely to appear. You will think about tails, shake your coin container, and see how many tails you get. What is the biological null hypothesis? The biological alternative hypothesis? The statistical null hypothesis? The statistical alternative hypothesis?
Answer: Biological null: I cannot control coin flips with my mind.
Biological alternative: I can control coin flips with my mind.
Statistical null: Half or fewer of the coin flips will be tails. (We almost always do two-tailed tests; the null hypothesis for a two-tailed test would be that half of the flips were tails. However, because we would only consider an excess of tails to be evidence of mind control, we do a one-tailed test and consider an excess of heads to fit the null hypothesis.)
Statistical alternative: More than half of the flips will be tails.
3. Now do the experiment. Think about tails, flip your five coins, and record how many tails (0 to 5) you got.
Answer: When I did it, I got 2 tails. Out of the 71 people in the class, 10 said they got 5 tails on the first flip. Under the null hypothesis, the expected number of people that lucky is 0.03125×71=2.2. Applying the exact test of goodness-of-fit (10 first-flippers, 61 non-first-flippers, expected proportions 0.03125 and 0.96875), the one-tailed P-value is 0.00007. So either some of you really can control coin flipping with your minds, or some of you lied about your results.
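The numbers in this answer can be checked with a few lines of Python. This is only a sketch: it assumes the one-tailed P-value is the binomial probability of 10 or more successes among 71 students, and the function name `binomial_upper_tail` is made up for this illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """One-tailed exact binomial P-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_five_tails = 0.5 ** 5      # probability that all five coins land tails
n_students = 71
expected = p_five_tails * n_students

# 10 students reported five tails on their first flip
p_value = binomial_upper_tail(10, n_students, p_five_tails)

print(f"P(five tails on one flip) = {p_five_tails}")
print(f"expected number of students = {expected:.1f}")
print(f"one-tailed P-value = {p_value:.5f}")
```

The first line printed also answers question 1: 0.5 multiplied by itself five times is 0.03125.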
4. If you got five tails in question 3, skip question 4 and just answer question 5. If you didn't get five tails, maybe you just weren't doing the experiment right. So work your way down this list, record the number of tails for each experiment, and keep going until you get five tails, then stop and go to question 5. Turn in a table of your results, with rows labelled "a" through "o" and one column for each condition (if you have to shake longer, etc.). Also turn in your drawings, if you get that far.
Answer: Here are the results I got:
a 3 3
b 3 3
c 2 3
d 2 5
Out of 71 students, there should have been 22.5 who had 5 tails before getting to condition "L", where you had to draw a coin. Instead, there were 50 students. The P-value from the exact test of goodness-of-fit is 3×10^-11. So it looks like over one-third of the class either can control coins with their minds, or lied about their data.
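Here is a rough sketch of where an expected count like 22.5 can come from. It assumes 12 sets of five flips before condition "L" (the flip in question 3 plus conditions "a" through "k"); that count is an assumption, chosen because it gives an expected value that rounds to 22.5. The P-value is computed as a one-tailed binomial probability, which is also an assumption about how the exact test was run.

```python
from math import comb

p_set = 0.5 ** 5        # chance that one set of five flips is all tails
sets_before_L = 12      # ASSUMPTION: question 3 plus conditions a through k
n_students = 71
n_reported = 50         # students who reported five tails before condition L

# Probability of at least one all-tails set somewhere in the first 12 sets
p_by_L = 1 - (1 - p_set) ** sets_before_L
expected = n_students * p_by_L

# One-tailed exact binomial P-value: 50 or more of 71 students
p_value = sum(
    comb(n_students, k) * p_by_L**k * (1 - p_by_L)**(n_students - k)
    for k in range(n_reported, n_students + 1)
)

print(f"P(five tails at least once in {sets_before_L} sets) = {p_by_L:.4f}")
print(f"expected number of students = {expected:.1f}")
print(f"one-tailed P-value = {p_value:.1e}")
```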
5. Based on the results of just your last experiment (the one that gave you five tails), what would a stupid or evil biologist conclude?
Answer: I would ignore the first 20 times I flipped the coins and conclude that there is statistically significant (P=0.03) evidence that if I think about the word "TAILS" in all caps while flipping nickels, I can control coin flips with my mind.
6. You're not stupid or evil (I hope), so what do you conclude?
Answer: Knowing that it took 21 sets of 5 flips to get one that was all tails, I would conclude that the data fit the null hypothesis quite well, and the 5 tails while thinking about the word "TAILS" and flipping nickels was just due to chance. Even if I'd gotten 5 tails on the first flip, I would have known that it's very unlikely that I can really control coins with my mind, so I would have demanded a P-value much smaller than 0.05 to reject the null hypothesis.
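To see why 21 sets of flips fits the null hypothesis, it helps to work out how often pure chance gives at least one all-tails set in that many tries. This is a minimal sketch that assumes every set of five flips is independent and fair.

```python
p_set = 0.5 ** 5      # chance that one set of five flips is all tails
n_sets = 21           # sets of flips it took in the answer above

# Chance of seeing five tails at least once somewhere in the 21 sets
p_at_least_one = 1 - (1 - p_set) ** n_sets

print(f"P(at least one all-tails set in {n_sets} tries) = {p_at_least_one:.2f}")
```

The result is about 0.49, so getting five tails somewhere in 21 tries is close to a coin flip itself, not evidence of mind control.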
7. Download the balance data set, which has the data everyone in the class collected for homework 1. Pick two of the nominal variables that have two values (sex, arm on top, thumb on top, stood on left or right foot, left or right handed). Write down the statistical null hypothesis in terms of the two variables. Then analyze just the data on eight people that you collected, using each of the three tests of independence that you've learned. Give your raw data (the numbers of each of the four combinations of your two variables) and the P-values for the three tests in a nice little table (do NOT just copy over the spreadsheets).
Answer: Using sex and arm folding as an example, the statistical null hypothesis is "The proportion of males who fold with their right arm on top is equal to the proportion of females who fold with their right arm on top." You could also put it the other way around: "The proportion of right-on-top folders who are female is equal to the proportion of left-on-top folders who are female." The null hypothesis is NOT that the proportion is 50%. If you were just looking at arm folding, using a test of goodness-of-fit to test the null hypothesis that the proportion of right-on-top was 50% would be interesting. But when you're using a test of independence to analyze two nominal variables, you're testing the null hypothesis that the proportion for one variable is the same for the different values of the other variable.
Answer: Here are the results for one student's data on sex and arm folding:
       F  M  Fisher's  chi    G
arm L  0  2  0.464     0.206  0.132
arm R  3  3
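If you want to check these three P-values without the spreadsheets, here is a sketch in plain Python. It assumes a two-tailed Fisher's exact test and chi-square and G tests with no continuity correction, which is what reproduces the values in the table. All function names are made up for this illustration.

```python
from math import comb, erfc, log, sqrt

def fisher_exact_two_tailed(table):
    """Two-tailed Fisher's exact test for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Add up every table that is no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def expected_counts(table):
    """Expected counts under independence: row total x column total / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def upper_tail_1df(stat):
    """Upper-tail probability of a chi-square distribution with 1 d.f."""
    return erfc(sqrt(max(stat, 0.0) / 2))

def chi_square_p(table):
    exp = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))
    return upper_tail_1df(stat)

def g_test_p(table):
    exp = expected_counts(table)
    # Cells with an observed count of 0 contribute nothing to G
    stat = 2 * sum(o * log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row) if o > 0)
    return upper_tail_1df(stat)

# Rows: arm L, arm R; columns: F, M
table = [[0, 2], [3, 3]]
print(f"Fisher's: {fisher_exact_two_tailed(table):.3f}")
print(f"chi:      {chi_square_p(table):.3f}")
print(f"G:        {g_test_p(table):.3f}")
```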
8. Do the same three tests of independence on the same two variables as in question 7, only use the entire data set from everyone in the class. Present the raw data and P-values in another nice little table.
Answer: Here are the results for all of the combinations:
         F    M    Fisher's  chi    G
arm L    132  124  0.208     0.186  0.187
arm R    186  140

         F    M
thumb L  188  154  0.866     0.848  0.848
thumb R  130  110

         F    M
foot L   122  118  0.129     0.122  0.122
foot R   196  146

         F    M
hand L   24   29   0.193     0.155  0.156
hand R   293  235

         arm L  arm R
thumb L  162    180    0.052  0.050  0.049
thumb R  94     146

         arm L  arm R
foot L   107    133    0.865  0.808  0.808
foot R   149    193

         arm L  arm R
hand L   17     36     0.081  0.065  0.062
hand R   239    289

         thumb L  thumb R
foot L   142      98       0.932  0.868  0.868
foot R   200      142

         thumb L  thumb R
hand L   28       25       0.381  0.349  0.352
hand R   314      214

         foot L  foot R
hand L   21      32      0.884  0.814  0.814
hand R   218     310
9. Write a few sentences about how similar the three P-values are in question 7 to each other, how similar the three P-values are in question 8 to each other, whether you got significance in any of the tests, and if so, what you think that means biologically.
Answer: The P-values in question 8 are not identical, but they are very similar to each other. This illustrates that with large sample sizes, it doesn't really matter whether you use the chi-square, G, or Fisher's exact test of independence; they will all give you about the same result. With the very small sample size in question 7, the P-values are very different; this illustrates why it's important to use the more accurate Fisher's exact test for small sample sizes. For question 8, none of the results were statistically significant at the P<0.05 level when using Fisher's exact test. For each pair of variables, you could then say that "There's no statistically significant evidence that variable X is related to variable Y." For example, you could say that "There's no statistically significant evidence that which arm people fold on top is related to sex." To be strictly correct, you shouldn't say "Which arm people fold on top is not related to sex," just that there's "no statistically significant evidence" that they're related. This is because it's possible that they're related, but the difference in proportions was too small to be detected with this sample size.
The relationship between arm folding and hand clasping is right at the borderline of significance; P is slightly above 0.05 using Fisher's test, but slightly below it using the G test. 63% of the left-arm people are left-thumb, while only 55% of the right-arm people are left-thumb. With this sort of result, you would say that although it's not quite statistically significant, it's worth further research.
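The two percentages come straight from the arm-by-thumb counts in the class data. Here is a small sketch of the arithmetic; the variable names are made up for this illustration.

```python
# Counts from the class data: thumb on top (rows) by arm on top (columns)
thumb_left = {"arm L": 162, "arm R": 180}
thumb_right = {"arm L": 94, "arm R": 146}

shares = {}
for arm in ("arm L", "arm R"):
    total = thumb_left[arm] + thumb_right[arm]
    # Proportion of this arm-folding group that clasps with the left thumb on top
    shares[arm] = thumb_left[arm] / total
    print(f"{arm}: {thumb_left[arm]}/{total} = {shares[arm]:.0%} left-thumb")
```

This is the difference in proportions that the tests of independence are evaluating: 63% versus 55%.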