### Biological Data Analysis: First exam study guide

This is the study guide for the first exam in Biological Data Analysis, fall 2013. The exam will be on Thursday, October 3. You may not use your notes or books during the exam; if English is your second language, you may use a dictionary. You will not need to use a calculator.

Study your lecture notes and the web pages (available at the class syllabus page) on all the topics covered before the exam.

There will be four kinds of questions:

1. Identify the variables. I will describe an experiment and ask you to identify the variables and say what kind they are (nominal, measurement, or ranked). Do not include variables that aren't mentioned in the question; for example, if I say "I had 20 people eat spaghetti and 20 people eat steak for dinner, then timed them running 5 kilometers the next morning," the only variables are dinner type (nominal) and running time (measurement). Age, sex, height, weight, running experience, etc. would be important variables if you were really doing this experiment, but they are not mentioned in the question, so don't include them in your answer.

2. Questions about probability, hypothesis testing, and power analysis. You should know what a P-value means and how to interpret the results of statistical tests. You should understand why we do power analysis, what is needed to do one, and how to interpret the results.

3. Questions about measures of central tendency (mean and median); measures of dispersion (variance and standard deviation); measures of confidence (standard error and confidence intervals); normality, homoscedasticity, and data transformations.

4. Choose a statistical test. I will describe an experiment, and you should say which test would be most appropriate. At this point, your choices are exact test of goodness-of-fit, chi-square test of goodness-of-fit, G-test of goodness-of-fit, Fisher's exact test, chi-square test of independence, G-test of independence, repeated G-tests of goodness-of-fit, Cochran-Mantel-Haenszel test, and two-sample t-test. I will not ask questions where a one-sample t-test is the correct answer.
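
As a quick reference for the third kind of question, here is a minimal Python sketch of how the measures of central tendency, dispersion, and confidence relate to each other. The data values are made up for illustration and do not come from the course; the critical t value is hard-coded for this sample size and is an assumption you would have to look up again for a different one.

```python
import math
import statistics

# Hypothetical measurements, invented for this illustration.
values = [4.0, 5.0, 5.0, 6.0, 10.0]

n = len(values)
mean = statistics.mean(values)          # central tendency
median = statistics.median(values)      # central tendency, less affected by outliers
variance = statistics.variance(values)  # sample variance (divides by n - 1)
sd = statistics.stdev(values)           # sample standard deviation = sqrt(variance)
sem = sd / math.sqrt(n)                 # standard error of the mean

# 95% confidence interval for the mean. 2.776 is the two-tailed critical
# t value for n - 1 = 4 degrees of freedom; it changes with sample size.
t_crit = 2.776
ci_low = mean - t_crit * sem
ci_high = mean + t_crit * sem

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {sd:.2f}")
print(f"standard error = {sem:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Note that the single large value (10.0) pulls the mean above the median, which is one reason to look at both.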

In class, we mostly talked about the exact binomial test of goodness-of-fit (only two values of the nominal variable), but the exams may include exact tests of goodness-of-fit with more than two values. We mostly talked about Fisher's exact test for 2x2 tables, but the exams may include Fisher's exact tests for larger tables. Remember the rule of thumb for this class: if the total N is less than 1000 for a test of goodness-of-fit or independence, you should use an exact test.

You must give only one answer, even if more than one would be correct; don't say "chi-square test of independence or G-test of independence," pick one or the other. You must give the full name of the test; don't just say "chi-square test."
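
The rule of thumb above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `choose_nominal_test` and its `design` labels are hypothetical, it covers only the plain goodness-of-fit and independence designs (not repeated G-tests, the Cochran-Mantel-Haenszel test, or the two-sample t-test), and for large samples it arbitrarily returns the chi-square test even though the G-test would be equally acceptable.

```python
def choose_nominal_test(design: str, total_n: int) -> str:
    """Return one full test name for a design with only nominal variables.

    design:  "goodness-of-fit" (one nominal variable compared with an
             expected ratio) or "independence" (two nominal variables).
    total_n: total number of observations in the experiment.
    """
    if design not in ("goodness-of-fit", "independence"):
        raise ValueError(f"unknown design: {design!r}")

    small_sample = total_n < 1000  # the rule of thumb for this class

    if design == "goodness-of-fit":
        if small_sample:
            return "exact test of goodness-of-fit"
        # Arbitrary pick: the G-test of goodness-of-fit would also be fine.
        return "chi-square test of goodness-of-fit"

    if small_sample:
        return "Fisher's exact test"
    # Arbitrary pick: the G-test of independence would also be fine.
    return "chi-square test of independence"


# Hypothetical sample sizes, not taken from the practice questions.
print(choose_nominal_test("goodness-of-fit", 40))
print(choose_nominal_test("independence", 2500))
```

The function always returns exactly one full test name, which matches what the exam expects in an answer.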

If a question uses biological words or concepts that are unfamiliar to you (for example, you don't know what a "promoter element" is), please ask me for help. You are being tested on your knowledge of methods used to analyze biological experiments, not on your knowledge of biology.

The exam will consist of about 15 to 20 short-answer questions. Here are some examples.

1. You are interested in the effects of fertilizer on mitosis in onion root tips. In an onion root tip grown without fertilizer, you count 700 cells in interphase, 283 cells in prophase, 27 cells in metaphase, 53 cells in anaphase, and 109 cells in telophase. In an onion root tip grown with fertilizer, you count 941 cells in interphase, 573 cells in prophase, 68 cells in metaphase, 113 cells in anaphase, and 275 cells in telophase. What statistical test would you use to analyze these data? What would your statistical null hypothesis be? What are the variables in this experiment?
2. You want to know what determines handedness in humans. Out of 216 children of women who smoked during pregnancy, 37 are left-handed. Out of 587 children of women who did not smoke during pregnancy, 43 are left-handed. You do the appropriate statistical test, and the P-value is 0.001. How would you interpret this result?
3. You want to know whether your 6-month-old niece exhibits a preference for one hand (right-handed or left-handed), so you hold out your finger. She grabs it with her right hand 19 times and her left hand 13 times. What test would you use?
4. Male fireflies fly around and flash their lights as a mating signal; females sit on the ground and flash in response when they see an attractive male. You put 200 female fireflies on the floor of a cage, put one male firefly in the cage, and see how many of the 200 females respond by flashing at that male. You test 10 male fireflies this way, and you want to know whether there is significant variation among them in their attractiveness to the females. What test would you use?
5. You are trying to find promoter elements involved in turning on the expression of genes involved in the budding process in yeast. You have identified 570 genes that are highly expressed only in budding yeast, and have randomly chosen 612 other genes whose expression does not increase during budding. Of the first set, 357 have the hypothetical Scratchy element (sequence CATCATCAT) within 300 basepairs of the start, while 243 of the second set of genes have the Scratchy element. What test should you use?
6. You want to know whether the gene that codes for mannose-6-phosphate isomerase (MPI) is expressed differently in liver tumors than in normal livers. You take one biopsy from each of 17 normal livers and 32 cancerous livers and measure the amount of MPI mRNA in each one. Which test should you use?
7. How could you make the preceding experiment (on livers) less sensitive to possible deviations from the assumptions of the test?
8. When asked what we could learn about the mind of God from observing nature, J.B.S. Haldane replied that God must have an "inordinate fondness for beetles." Of the 24 women taking this class in 2006, 9 selected a beetle as their favorite insect. Two out of 14 men gave some kind of beetle as their favorite insect. What test should you use to determine whether men are significantly less likely to prefer beetles than women?
9. You have data on the favorite insect of students in this class, as described in the above question, for five different years. What test should you use to determine whether men are significantly less likely to prefer beetles than women?
10. You want to know how smart 5-year-olds are, so you observe a birthday party at a duckpin bowling alley. Of 19 children, 12 put the left shoe on their left foot and right shoe on their right foot, while the other 7 put their shoes on the wrong foot. What test should you use?
11. Glacier-Waterton International Park is in Montana and Alberta. While backpacking through the park, you see 18 black bears and 0 grizzly bears in the Montana side of the park; after crossing the border into Canada, you see 24 black bears and 6 grizzly bears in the Alberta side of the park. What test would you use to test whether there is a difference between the two parts of the park in the relative abundance of the two bear species?
12. Because of the long tail feathers, male swallows mount the females from either the right or the left. You want to know whether they have a preference for one side, so you observe 172 pairs of mating swallows. 78 males mount from the right side, while 94 mount from the left. What test would you use?
13. You are planning to do experiments on chicken feed with different ratios of corn meal to soybean meal. To prepare for these experiments, you buy 20 bags of corn meal and 14 bags of soybean meal and put them in a cool, dry place. A few weeks later, when you finally decide to start mixing up chicken feed, you notice that 12 bags of corn meal have moth holes, while 2 bags of soybean meal have moth holes. What are the variables, and what type are they?
14. You are planning an experiment to test whether extra-thick socks help prevent injuries in serious amateur runners. You'll have two groups of runners, one wearing extra-thick socks and one wearing regular socks. At the end of three months, you'll record how many of the people in each group had running-related injuries. What information do you need to decide how many runners you need for the study?
15. You are trying to see whether the genes Jam-1 and Pax-6 are genetically linked in zebrafish. You breed two individuals who are heterozygous for visible, dominant mutations at both genes, and you get 160 offspring. If the two genes are unlinked, you'd expect 10 fish that were normal/normal, 30 that were normal at Jam-1 and mutant at Pax-6, 30 that were mutant at Jam-1 and normal at Pax-6, and 90 that were mutant/mutant. What test would you use?
16. Two amphipod crustaceans live high on sandy beaches in Delaware, Talorchestia longicornis and Talorchestia megalophthalma. You want to know whether the proportion of each species is different on different beaches, so you collect 400 amphipods at Rehoboth Beach, 350 at Dewey Beach, 435 at Fenwick Island, and 372 at Cape Henlopen, and you count the number of individuals of each species at each beach. What test would you use?
17. You want to know whether the presence of the malaria parasite (Plasmodium) in mosquitoes affects the West Nile virus. You collect 1200 mosquitoes. Half of them contain Plasmodium and one-third contain West Nile virus. If the probabilities of carrying Plasmodium and West Nile are independent, what is the probability that a mosquito will carry both?
18. You want to know whether mice can see colors. Twenty times a day for two weeks, you put a piece of mouse food in a small red box and put it in a cage with one mouse. The mouse can tip the box over and get the food out. At the same time, you also put mouse food in a green box; it looks and smells the same as the red box, but is glued shut so the mouse can't get the food out. At the end of the two weeks, you put the two boxes in with the mouse for 20 more times. The mouse pushes over the red box first 12 times and the green box 8 times. What test would you use?
19. When beaches are replenished by dumping new sand on them, the beach animals get covered up and may die. You want to know whether the type of sand makes a difference. You put 200 snails (Ilyanassa obsoleta) at the bottom of each of two large containers, then you put 20 cm of fine sand in one container, and 20 cm of coarse sand in the other container. After one hour, you count the number of snails that have crawled to the surface in each container. What test would you use?
20. When a click beetle is on its back, it rapidly flexes its body with an audible "click," flipping itself into the air and hopefully landing right-side-up. You want to know whether this flipping is random or whether the beetles tend to land on their feet. You catch a click beetle, put it on its back, and watch it click. You repeat this 125 times. The beetle lands on its feet 75 times and on its back 50 times. What test would you use?
21. You want to know whether aspirin taken during pregnancy has an effect on the sex of offspring. You ask 1072 new mothers whether they took aspirin during the first three months of their pregnancy, and you also ask them whether they had a boy or a girl. What test would you use?
22. You want to know whether aspirin taken during pregnancy has an effect on the sex of offspring. You ask 1072 new mothers how many times they took aspirin during the first three months of their pregnancy, and you also ask them whether they had a boy or a girl. What are the variables in this experiment, and what kind are they?
23. When beaches are replenished by dumping new sand on them, the beach animals get covered up and may die. You want to know whether the size of the animal makes a difference. You put 20 mole crabs (Emerita talpoida) at the bottom of each of two large containers, then you put 20 cm of fine sand in one container, and 20 cm of coarse sand in the other container. When the first crab appears at the surface, you weigh it and write down the weight; you continue recording weights, in order, until all of the crabs have appeared at the surface. What are the variables in this experiment, and what kind are they?
24. When a click beetle is on its back, it rapidly flexes its body with an audible "click," flipping itself into the air and hopefully landing right-side-up. You want to know whether this flipping is random or whether the beetles tend to land on their feet. You catch a click beetle, put it on its back, and watch it click. You repeat this 12 times. The beetle lands on its feet 8 times and on its back 4 times. You analyze this using the appropriate statistical test, and you get a P-value of 0.11. What does this P-value mean?
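
Two of the practice questions involve simple arithmetic that can be checked by hand or with a few lines of code. The Python sketch below works through the multiplication rule for independent events (question 17) and the expected counts for the zebrafish cross (question 15), assuming the standard 9:3:3:1 ratio for two unlinked dominant mutations. The variable names are chosen for illustration only.

```python
from fractions import Fraction

# Question 17: if carrying Plasmodium and carrying West Nile virus are
# independent, the probability of carrying both is the product of the
# two individual probabilities.
p_plasmodium = Fraction(1, 2)
p_west_nile = Fraction(1, 3)
p_both = p_plasmodium * p_west_nile
expected_both = p_both * 1200  # expected count among the 1200 mosquitoes

# Question 15: expected counts among 160 offspring under a 9:3:3:1 ratio
# (assumed here as the ratio for two unlinked genes with dominant mutations).
ratio = {
    "mutant/mutant": 9,
    "mutant Jam-1, normal Pax-6": 3,
    "normal Jam-1, mutant Pax-6": 3,
    "normal/normal": 1,
}
total_parts = sum(ratio.values())
expected = {
    phenotype: Fraction(parts, total_parts) * 160
    for phenotype, parts in ratio.items()
}

print(f"P(both) = {p_both}, expected mosquitoes with both = {expected_both}")
for phenotype, count in expected.items():
    print(f"{phenotype}: {count}")
```

The expected counts it produces match the numbers stated in question 15, which is a useful check that you have read the ratio correctly.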
