### Biological Data Analysis: Second exam study guide

This is the study guide for the second exam in Biological Data Analysis, fall 2018. The exam will be on Thursday, Oct. 25. You may not use your notes or textbook during the exam; if English is your second language, you may use a dictionary. You will not need a calculator.

The exam is cumulative; several of the questions will be about material covered in the first part of the semester. You should look at the first exam and the first study guide again.

You should primarily study your lecture notes, the web pages on different topics (linked from the syllabus), and the homework assignments. In addition to the topics covered on the first exam, you should be familiar with:

• Mean and median
• Standard deviation, standard error, and confidence intervals
• One-sample t-test
• Student's two-sample t-test
• Welch's two-sample t-test
• Fisher's one-way anova
• Welch's one-way anova
• Partitioning of variance components
• Tukey-Kramer method
• Assumptions of anova
• Data transformations
• Kruskal-Wallis test

For Welch's two-sample t-test vs. Student's two-sample t-test, you must answer Welch's if the question implies that the data are heteroscedastic and unbalanced. (A question probably won't say "The data are heteroscedastic"; it will say something like "You notice that the standard deviations of the groups are very different" or "the individual observations are much more spread out in one group than in the other.") If the question does not imply that the data are both heteroscedastic and unbalanced, you may say either Welch's or Student's two-sample t-test. You must know when both are appropriate and when only Welch's is appropriate. Likewise, you must answer Welch's anova if an anova experiment is heteroscedastic and unbalanced; you may say either Welch's or Fisher's one-way anova when the question does not imply that the data are both heteroscedastic and unbalanced.
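One way to see why the combination of heteroscedastic and unbalanced data matters is to compute both t statistics from summary statistics. The sketch below is only an illustration, not something you will need on the exam. The function names and the means, standard deviations, and sample sizes are made up for this example. With equal sample sizes the two t statistics are identical (only the degrees of freedom differ), but with very different standard deviations and very different sample sizes they can disagree substantially.

```python
import math

def student_t(m1, s1, n1, m2, s2, n2):
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df

# Hypothetical summary statistics: (mean1, sd1, n1, mean2, sd2, n2).
# Both designs are heteroscedastic (sd 1 vs. sd 5); only the second is unbalanced.
balanced = (10.0, 1.0, 20, 12.0, 5.0, 20)
unbalanced = (10.0, 1.0, 50, 12.0, 5.0, 5)

for label, data in (("balanced", balanced), ("unbalanced", unbalanced)):
    t_s, df_s = student_t(*data)
    t_w, df_w = welch_t(*data)
    print(f"{label}: Student t = {t_s:.3f} (df = {df_s}), "
          f"Welch t = {t_w:.3f} (df = {df_w:.1f})")
```

In the unbalanced case, the pooled variance is dominated by the large, low-variance group, so Student's test understates the standard error of the difference between the means.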

When there is one nominal variable with two values and one measurement variable, you may say either two-sample t-test (Student's or Welch's, as appropriate) or Fisher's or Welch's one-way anova. If the nominal variable has more than two values, you must say Fisher's or Welch's one-way anova.
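The rules in this section for one nominal variable and one measurement variable can be summarized as a small decision function. This is a sketch of the grading rules as stated in this guide, not a general statistical recommendation. The function name `acceptable_tests` and its arguments are hypothetical names chosen for this example, and it does not cover the other tests on the topic list (for example, the Kruskal-Wallis test).

```python
def acceptable_tests(n_groups, heteroscedastic, unbalanced):
    """Return the acceptable exam answers for one nominal variable with
    n_groups values and one measurement variable, following this guide's rules."""
    if n_groups < 2:
        raise ValueError("the nominal variable needs at least two values")

    # Welch's version is required only when the data are BOTH
    # heteroscedastic and unbalanced.
    must_use_welch = heteroscedastic and unbalanced

    answers = []
    if n_groups == 2:
        # With exactly two groups, a two-sample t-test is also acceptable.
        answers.append("Welch's two-sample t-test")
        if not must_use_welch:
            answers.append("Student's two-sample t-test")

    # A one-way anova is acceptable for any number of groups (two or more).
    answers.append("Welch's one-way anova")
    if not must_use_welch:
        answers.append("Fisher's one-way anova")
    return answers

# Example: three groups, very different standard deviations, unequal sample sizes.
print(acceptable_tests(3, heteroscedastic=True, unbalanced=True))
# Example: two groups, heteroscedastic but balanced.
print(acceptable_tests(2, heteroscedastic=True, unbalanced=False))
```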

The exam will consist of about 15 to 20 short-answer questions. Most of them will consist of me describing an experiment, then asking what statistical test is appropriate.

On this exam, I will not ask you to list the variables in an experiment and say whether they are measurement, nominal or ranked. That is a good way to help you decide on the appropriate statistical test, however.

Here are some example questions:

1. You are interested in the effects of fertilizer on mitosis in onion root tips. In an onion root tip grown without fertilizer, you count 701 cells in interphase, 283 cells in prophase, 29 cells in metaphase, 56 cells in anaphase, and 100 cells in telophase. In an onion root tip grown with fertilizer, you count 942 cells in interphase, 576 cells in prophase, 97 cells in metaphase, 115 cells in anaphase, and 273 cells in telophase. What statistical test would you use to analyze these data?
2. You want to know whether the gene that codes for mannose-6-phosphate isomerase (MPI) is expressed differently in livers with liver cancer, livers with cirrhosis, and normal livers. You take one biopsy from each of 17 cancerous livers, 12 cirrhotic livers, and 32 normal livers and measure the amount of MPI mRNA in each one. According to what you've learned so far in this class, what are all the tests that you might be able to use to test the hypothesis that the three means are equal? What would tell you that you shouldn't use one or more of these tests?
3. You are planning to do experiments on chicken feed with different ratios of corn meal to soybean meal. To prepare for these experiments, you buy 20 bags of corn meal and 14 bags of soybean meal and put them in a cool, dry place. A few weeks later, when you finally decide to start mixing up chicken feed, you notice that 12 bags of corn meal have moth holes, while 2 bags of soybean meal have moth holes. You want to know whether moths prefer corn meal; which test should you use?
4. You want to test the effects of anabolic steroids on the muscle strength of elderly people. You put 150 old people on steroids, 200 old people on placebo, and 350 old people on no treatment. After one month, you measure the arm strength of each person. The standard deviation of arm strength for people on steroids is much higher than for the other two groups. Which test should you use?
5. You are trying to see whether the genes Jam-1 and Pax-6 are genetically linked in zebrafish. You breed two individuals who are heterozygous for visible, dominant mutations at both genes, and you get 1600 offspring. If the two genes are unlinked, you'd expect 100 fish that were normal/normal, 300 that were normal at Jam-1 and mutant at Pax-6, 300 that were mutant at Jam-1 and normal at Pax-6, and 900 that were mutant/mutant. Which test should you use?
6. Two amphipod crustaceans live high on sandy beaches in Delaware, Talorchestia longicornis and Talorchestia megalophthalma. You want to know whether the proportion of each species is different on different beaches, so you go to Rehoboth Beach, Dewey Beach, Fenwick Island, and Cape Henlopen, collect about 500 amphipods from each beach, and count the number of individuals of each species at each beach. Which test should you use?
7. You want to know the effect of light source on pumpkins. You grow 10 pumpkin plants under natural sunlight, 10 pumpkin plants under fluorescent light, and 10 pumpkin plants under incandescent light. You remove excess flowers, so each plant will have only one pumpkin. When the pumpkins are three months old, you measure the diameter of the pumpkins. Which test should you use?
8. You want to know whether keeping sheep in indoor cages affects the weight of their offspring. You weigh 30 newborn lambs from ewes kept full-time in cages, 30 lambs from ewes caged at night only, and 30 lambs from ewes kept outdoors. What should you do next?
9. You want to breed miniature schnauzers that don't bark so much, but you don't know whether there is any genetic variation among families for barkiness. You obtain 7 litters of miniature schnauzers, raise them under similar conditions, then record how many times each dog barks when a stranger approaches it. You do this once for each dog. Which test should you use?
10. You want to know whether mice can see colors. Twenty times a day for two weeks, you put a piece of mouse food in a small red box and put it in a cage with one mouse. The mouse can tip the box over and get the food out. At the same time, you also put mouse food in a green box; it looks and smells the same as the red box, but is glued shut so the mouse can't get the food out. At the end of the two weeks, you put the two boxes in with the mouse 10 more times. The mouse pushes over the red box first eight times and the green box first two times. Which test should you use?
11. You have been observing a large troop of monkeys in the Philadelphia zoo. Some of the monkeys were born there, and the other monkeys were brought there from other zoos. By careful observation of their social interactions, you have put the monkeys in order from most dominant to least dominant: which monkey is dominant over all, which monkey submits only to the most dominant, etc., all the way down to the poor monkey that submits to every other monkey. You want to know whether monkeys born at the Philadelphia zoo tend to be more dominant compared with monkeys brought from other zoos. Which test should you use?
12. You want to know whether aspirin taken during pregnancy has an effect on the sex of offspring. You ask 1072 new mothers whether they took aspirin during the first three months of their pregnancy, and you also ask them whether they had a boy or a girl. Which test should you use?