Biological Data Analysis: Homework 7

Due Thursday, May 10

You must type this and all other homework assignments. Do not e-mail the assignment to me; turn it in early (at 322 Wolf) for a foreseeable absence, or turn it in late after an unexpected absence from class.

1. Write three exam questions, in the same style as on the study guides and the first two exams. Each question must describe an experiment, then ask "Which test should you use?" The answers to your first two questions are given below, next to your birthdate; for example, if you were born on the 6th of a month, you must write one question whose answer is "exact test of goodness-of-fit" and one whose answer is "Welch's one-way anova." Your third question should have your very favorite statistical test as the answer. Be sure to include the answer to each question. If you're writing about your area of research, be sure to give enough detail that all of your classmates would understand the question.

Birthdate  Answers
      1   Simple logistic regression; Kruskal-Wallis test
      2   Kruskal-Wallis test; Chi-square test or G-test of goodness-of-fit
      3   Chi-square test or G-test of goodness-of-fit; Fisher's or Welch's one-way anova
      4   Fisher's or Welch's one-way anova; Linear regression/correlation
      5   Linear regression/correlation; Exact test of goodness-of-fit
      6   Exact test of goodness-of-fit; Welch's one-way anova
      7   Welch's one-way anova; Chi-square test or G-test of independence
      8   Chi-square test or G-test of independence; Spearman rank correlation
      9   Spearman rank correlation; Nested anova
     10   Nested anova; Linear regression/correlation
     11   Linear regression/correlation; Ancova
     12   Ancova; Welch's one-way anova
     13   Welch's one-way anova; Fisher's or Welch's one-way anova
     14   Fisher's or Welch's one-way anova; Multiple regression
     15   Multiple regression; Fisher's exact test
     16   Fisher's exact test; Two-way anova with replication
     17   Two-way anova with replication; Exact test of goodness-of-fit
     18   Exact test of goodness-of-fit; Nested anova
     19   Nested anova; Spearman rank correlation
     20   Spearman rank correlation; Ancova
     21   Ancova; Two-way anova without replication
     22   Two-way anova without replication; Simple logistic regression
     23   Simple logistic regression; Chi-square test or G-test of goodness-of-fit
     24   Chi-square test or G-test of goodness-of-fit; Multiple logistic regression
     25   Multiple logistic regression; Fisher's or Welch's one-way anova
     26   Fisher's or Welch's one-way anova; Fisher's exact test
     27   Fisher's exact test; Chi-square test or G-test of independence
     28   Chi-square test or G-test of independence; Two-way anova with replication
     29   Two-way anova with replication; Nested anova
     30   Nested anova; Two-way anova without replication
     31   Two-way anova without replication; Simple logistic regression

2. Find a recent paper in your field of interest. If you are a grad student or an undergrad doing research in someone's lab, it should be a paper from your lab; if you're not in a lab, it can be anything that interests you. It can be one of the papers you used for a previous homework, but it doesn't have to be. The paper must describe original experiments and results; it cannot be a review paper or theoretical paper.

Identify one experiment that is described in the paper and is analyzed using one of the statistical tests we've talked about this semester. Give the citation information for the paper (authors, year, article title, journal title, volume, pages) in any format you like. Write a couple of sentences describing the experiment in terms your classmates would understand, and say what statistical test the authors used. Also say how the authors presented the results of the experiment: a table, a graph, a P-value given in the text, etc.

Here's an example of what I'm looking for:

McDonald, J.H. 2013. Geographic variation in Megalorchestia californiana allele frequencies may be caused by winter rather than summer temperatures. Marine Ecology Progress Series 488: 201-207.

McDonald (2013) counted allele frequencies for the glucose-6-phosphate isomerase (Gpi) gene in the crustacean Megalorchestia californiana at multiple locations on the Pacific coast of the United States. He used simple logistic regression with January mean temperature at each location as the measurement variable and allele frequency as the nominal variable. He presented the P-values in the text, the pseudo-R² values in a table, and a graph of allele frequency vs. temperature. The graph did not have any error bars.

Next, find four other papers that do the same kind of experiment. Each paper should have a different set of authors, so you will end up with papers from five different labs (including the first one). The easiest way to find more papers with the same kind of experiment is to look at the Introduction and Discussion sections of the first paper and see whom the authors cite as doing the same kind of experiment. For the example, the discussion section of McDonald (2013) says "There have been numerous biochemical studies of Gpi polymorphisms with allele frequencies associated with temperature (Watt 1977, Hoffmann 1981, Watt 1983, Hall 1985, Zera 1987, Van Beneden and Powers 1989, Patarnello and Battaglia 1992, Dahlhoff and Rank 2000)," so those references would be a good place to look for other experiments that compare allele frequency and temperature. Note that the experiments just have to be the same kind of experiment, not exactly the same. For my example, I could look for papers correlating allele frequencies at other genes besides Gpi, with other measurement variables besides temperature.

Give the citation information and a short description of the experiment and statistical test for each of the papers you've found. It's possible that some of these papers will use a statistical test that we haven't talked about in class; that's okay.

Finally, write a few sentences about what you've learned. Did everyone use the same statistical test for similar experiments? Based on what you've learned this semester, does it seem like the best test to use? If different authors used different tests, did they seem to have good reasons for choosing different tests, or were some authors using the wrong test? Did everyone present their results the same way?
