Biological Data Analysis: Exam 1 answers

Here are the answers to exam 1. For some of the questions, I have provided explanatory material in regular type, and the answer in bold; all you need to write down is the answer. If you don't understand why your answer was wrong, you may e-mail me, talk to me before or after class, or set up a time to talk to me in my office. The exam was worth 15 points and had 18 questions, so each question was worth about 0.83 points.

1. Two nominal variables (hair whorl direction, sexual preference), null hypothesis is the same proportion of clockwise vs. counterclockwise whorls in each preference, total sample size is greater than 1000: chi-square or G-test of independence. This kind of research has actually been done, although as far as I know it didn't include bisexual men.

2. Two nominal variables (music vs. no music, arrived home vs. didn't arrive home), null hypothesis is same proportions of pigeons arriving home with or without music, total sample size (79+79=158) is less than 1000: Fisher's exact test of independence.
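If you want to see what Fisher's exact test actually computes, here is a minimal Python sketch using only the standard library. The function name and the cell counts are made up for illustration; the exam question only fixes the group sizes (79 pigeons each), not how many arrived home. It uses the common two-tailed rule of summing the probabilities of every table that is no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is as unlikely as, or less likely than, the observed one
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts (NOT from the exam): rows are music / no music,
# columns are arrived home / didn't arrive home; each row sums to 79.
p_value = fisher_exact_two_tailed(61, 18, 70, 9)
print(f"P = {p_value:.4f}")
```

The small tolerance factor (1e-9) is only there so that tables with the same probability as the observed one are not dropped because of floating-point rounding.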

3. I thought the obvious answer here was the presence or absence of an mp3 player and headphones is a possible confounding variable; it could be controlled by making all of the pigeons wear the mp3 player and headphones, and just not turning it on for half the pigeons. Most of you picked other confounding variables, such as the age of the pigeons, and said you would control it by using pigeons of the same age. I accepted this.

4. One nominal variable (mated vs. unmated male), one measurement variable (dung ball rolling speed), null hypothesis is the mean dung-ball rolling speed is the same for mated and unmated males: two-sample t-test; one-way anova would also be acceptable if you see a question like this on future exams. You got points off if you just said "t-test"; you need to specify the two-sample t-test or one-sample t-test. And I accepted "two-tailed t-test" and other variants as long as they had "two" and "t-test," but really I should have taken some points off; I was in a very generous mood when I graded.

6. (oops, out of order on the exam) One measurement variable (distance moved), null hypothesis is that the mean distance is zero: one-sample t-test.

5. glasses vs. no glasses (nominal), sex of housefly (nominal), position of fly in vomit string (ranked). I'd be very surprised if frogs vomit in such an orderly fashion, but it's hard to think of biological ranked variables.

7. Three nominal variables: pheromone trap vs. light trap, ladybug vs. stink bug, which yard the traps are in; null hypothesis is that the mean difference in proportion of ladybugs between pheromone trap and lighted trap is zero across the 21 yards: Cochran-Mantel-Haenszel test.
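As a rough illustration of the Cochran-Mantel-Haenszel test for repeated 2x2 tables, here is a minimal Python sketch. It uses the usual continuity-corrected statistic with 1 degree of freedom, and gets the P value from the fact that a chi-square with 1 d.f. has upper-tail probability erfc(sqrt(x/2)). The function name and the three yards of counts are hypothetical; the exam question had 21 yards and I'm not reproducing its data.

```python
from math import erfc, sqrt

def cmh_test(tables):
    """Cochran-Mantel-Haenszel chi-square (1 d.f.) for a list of 2x2 tables.

    Each table is a tuple (a, b, c, d) meaning [[a, b], [c, d]].
    Returns (chi_square, p_value).
    """
    diff_sum = 0.0
    var_sum = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        # Observed minus expected count in the top-left cell
        diff_sum += a - (a + b) * (a + c) / n
        # Variance of the top-left cell with the margins fixed
        var_sum += (a + b) * (a + c) * (b + d) * (c + d) / (n ** 2 * (n - 1))
    # Continuity correction of 0.5, floored at zero
    chi2 = max(0.0, abs(diff_sum) - 0.5) ** 2 / var_sum
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical yards (NOT the exam data). Each tuple is
# (ladybugs in pheromone trap, stink bugs in pheromone trap,
#  ladybugs in light trap,     stink bugs in light trap).
yards = [(12, 8, 7, 13), (9, 11, 6, 14), (15, 5, 10, 10)]
chi2, p = cmh_test(yards)
print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```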

8. One nominal variable (seed vs. feces), null hypothesis is that the beetle buries the seed in half of the trials, total sample size is less than 1000: exact test of goodness-of-fit. I'm not making up these feces-imitating seeds.
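Here is a minimal sketch of the exact test of goodness-of-fit for this kind of design, assuming a two-tailed test against an expected proportion of 0.5. The function name and the counts are hypothetical, not taken from the exam question. It adds up the binomial probabilities of every outcome that is no more likely than the observed one, which is also a direct illustration of what a P value is.

```python
from math import comb

def exact_binomial_two_tailed(successes, trials, expected=0.5):
    """Two-tailed exact binomial P value (method of small P values)."""
    def prob(k):
        # Binomial probability of exactly k successes under the null
        return comb(trials, k) * expected ** k * (1 - expected) ** (trials - k)

    p_obs = prob(successes)
    # Sum every outcome that deviates from the null at least as much,
    # in the sense of being no more probable than what was observed
    p = sum(prob(k) for k in range(trials + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts (NOT from the exam): the beetle buries the seed
# in 14 of 20 trials.
p_value = exact_binomial_two_tailed(14, 20)
print(f"P = {p_value:.4f}")
```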

9. Singing vs. no singing (nominal), wart vs. no wart (nominal). Note that the question says you count the number of people with warts (and could then figure out the number without warts); the question did not say that you count the number of warts on each person, which would have been a measurement variable. Also note that I made this up, I don't really think singing to your warts will work. If you want to try it anyway, please don't sing in class.

10. One nominal variable (TV vs. no TV), one measurement variable (weight): two-sample t-test. One-way anova will also be correct if you see a question like this on future exams.

11. Two nominal variables, cilantro soapy vs. not soapy, men vs. women; null hypothesis is same proportion of soapy taste in men and women; total sample size (15+21+62+57=155) less than 1000: Fisher's exact test of independence. You could have kept the data from the three days separate, treating "day" as a third nominal variable, so I also accepted Cochran-Mantel-Haenszel test.

12. One nominal variable (right or left), null hypothesis is that half of the mountings are from the right, total sample size is greater than 1000: chi-square or G-test of goodness-of-fit.
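With a sample this large, the chi-square and G-tests give nearly the same answer. The sketch below computes both for a two-category goodness-of-fit problem, again using erfc(sqrt(x/2)) for the 1 d.f. upper-tail probability. The function names and the counts are made up for illustration; the only thing taken from the question is that the total is above 1000 and the expected proportions are 1:1.

```python
from math import erfc, log, sqrt

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square distribution with 1 d.f."""
    return erfc(sqrt(max(x, 0.0) / 2))

def gof_two_categories(observed, expected_props=(0.5, 0.5)):
    """Chi-square and G statistics (with P values) for two categories."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Categories with zero observations contribute nothing to G
    g = 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    return chi2, chi2_sf_1df(chi2), g, chi2_sf_1df(g)

# Hypothetical counts (NOT from the exam): 560 mountings from the right,
# 490 from the left, for a total of 1050.
chi2, p_chi2, g, p_g = gof_two_categories((560, 490))
print(f"chi-square = {chi2:.3f} (P = {p_chi2:.4f}), G = {g:.3f} (P = {p_g:.4f})")
```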

13. Two nominal variables (east or west half of tank, magnetic shielding on or off), null hypothesis is equal proportions of salmon in the east half when shielded vs. not shielded, total sample size less than 1000: Fisher's exact test. Note that when a possible answer is a pun, it's not always the correct answer, so be careful on future questions involving fish, tea, or G-spots.

14. Because balance time is a measurement variable, you need four numbers: alpha (also known as the significance level); beta or power; effect size; standard deviation. Note that you shouldn't put down both beta and power, as they're just two different ways of measuring the same thing (beta is 1 minus power).
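To show how those four numbers combine, here is a rough Python sketch of a sample-size calculation. It assumes a two-group comparison, a two-tailed test, and the normal approximation n = 2((z_alpha + z_power) * sd / effect)^2 per group, which tends to come out slightly smaller than a calculation based on the t distribution. The function name and the input values are hypothetical, and the exam question may have had a different design in mind.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(alpha, power, effect_size, sd):
    """Approximate sample size per group for a two-sample comparison of means.

    alpha: significance level (two-tailed); power: 1 minus beta;
    effect_size: difference in means you want to detect, in the same
    units as sd. Normal approximation, so treat the result as a rough guide.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect_size) ** 2)

# Hypothetical inputs: detect a 5-second difference in balance time
# when the standard deviation is 10 seconds.
print(n_per_group(alpha=0.05, power=0.80, effect_size=5.0, sd=10.0))
```

Because power is just 1 minus beta, the function takes only one of the two, which is the same reason you shouldn't list both on the exam.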

15. The data are highly skewed, with one or a few extreme values in one direction; or The measurement variable is something you wait for, like lifespan, and you don't want to wait until the end. Either answer is acceptable; you don't get extra credit for putting down both.

16. Amount of coffee (measurement), number of tomatoes (measurement), variety of tomato (nominal). The amount of coffee is a measurement variable because there are 8 different values for the amount, from 0 to 70 ml per day in increments of 10.

17. ...of getting a deviation from the null hypothesis as big as you observed, or bigger, if the null hypothesis were true. I'm not real big on memorization, but you might as well memorize the definition of P-value; you'll probably see a question requiring you to know it on every exam.

18. A higher standard error of the mean growth rate near the road means that the estimate of the mean is likely to be further from the true mean. Note that you got some points off for saying that the mean with the higher standard error is further from the true mean. It might, by chance, be very close to the true mean, and closer than the mean with the smaller standard error. You also got points off if you said the higher standard error meant that there was a higher standard deviation near the road; that might be the reason for the higher standard error, or it might be that the sample size was smaller near the road.
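The last point is easy to check numerically, since the standard error of the mean is the standard deviation divided by the square root of the sample size. In this small sketch the growth-rate values are invented for illustration; the second sample simply repeats the first one four times, so the spread is about the same but the standard error is much smaller.

```python
from math import sqrt
from statistics import stdev

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

# Hypothetical growth rates (NOT real data)
near_road = [1.0, 2.0, 3.0, 4.0, 5.0]        # n = 5
far_from_road = near_road * 4                 # n = 20, similar spread

print(f"near road:     sd = {stdev(near_road):.3f}, SE = {standard_error(near_road):.3f}")
print(f"far from road: sd = {stdev(far_from_road):.3f}, SE = {standard_error(far_from_road):.3f}")
```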

This page was last revised October 14, 2015.