Here are the answers to exam 1. For some of the questions, I have provided explanatory material in regular type, and the answer in **bold**; all you need to write down is the answer. If you don't understand why your answer was wrong, you may e-mail me, talk to me before or after class, or set up a time to talk to me in my office. The exam was worth 25 points, so each question was worth 1.67 points.

**1.** **Standard deviation would show how spread out the individual observations were; standard error would show how accurate each estimate of the mean is.**
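To make the distinction concrete, here is a minimal Python sketch. The observations are made-up numbers for illustration only, not data from the exam. The standard deviation describes the spread of the individual values, while the standard error (standard deviation divided by the square root of the sample size) describes the precision of the estimated mean.

```python
import math
import statistics

# Hypothetical measurements, invented purely for illustration.
observations = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]

n = len(observations)
mean = statistics.mean(observations)
sd = statistics.stdev(observations)  # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)               # standard error of the mean

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```

With a larger sample from the same population, the standard deviation would stay about the same, but the standard error would shrink.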

**2.** One nominal variable, drug vs. placebo, with two values; one measurement variable, minutes of snoring: **two-sample t-test** or **one-way anova**.

**3.** One nominal variable, kind of corn, with more than two values; one measurement variable, number of larvae per stalk: **one-way anova**.

**4.** One nominal variable, what happened to the turtle, with more than two values (eaten by bird, eaten by fox, eaten by raccoon, didn't get eaten); one measurement variable, time when turtle pokes head above sand: **one-way anova**.

**5.** Three nominal variables, morning vs. evening, standing on one leg vs. two legs, Wednesday vs. Thursday vs. Friday: **Cochran-Mantel-Haenszel test**.

**6.** One nominal variable, the robin finds a worm vs. doesn't find a worm; theoretical expectation if null is true (should find worm 7/20 of the time); total sample size less than 1000: **exact test of goodness-of-fit**.

**7.** One nominal variable, the isopod goes in the green or the blue side; theoretical expectation if null is true (half should go in green side); total sample size less than 1000: **exact test of goodness-of-fit**.
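For questions 6 and 7, a rough sketch of what an exact test of goodness-of-fit computes might look like the following. The counts are hypothetical (the exam gives no data), the function name `exact_binomial_p` is my own, and the two-tailed rule used here (add up every outcome that is no more likely than the observed one) is only one of several conventions in use.

```python
from math import comb

def exact_binomial_p(successes, n, p_null):
    """Two-tailed exact binomial P-value: sum the probabilities of all
    outcomes that are no more likely than the observed one under the null."""
    probs = [comb(n, k) * p_null**k * (1 - p_null)**(n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Small relative tolerance guards against floating-point ties.
    return sum(pr for pr in probs if pr <= observed * (1 + 1e-7))

# Hypothetical counts: 14 of 20 isopods choose the green side; null proportion is 0.5.
p_value = exact_binomial_p(14, 20, 0.5)
print(f"P = {p_value:.4f}")
```

For the robin in question 6, the same function would be called with `p_null=7/20` and whatever counts were actually observed.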

**8.** Two nominal variables: habitat (eelgrass vs. oyster vs. mud vs. sand), juvenile vs. adult; total sample size less than 1000: **Fisher's exact test**.

**9.** One nominal variable with more than two values, species of firefly; one measurement variable, flashes per minute; experiment is unbalanced (sample sizes range from 5 to 58) and data are heteroscedastic ("there is much more variation among individuals for some species than for others"): **Welch's anova**.

**10.** One nominal variable, which corn field it is; one measurement variable, phosphate concentration: **one-way anova**.

**11.** One nominal variable, predator (coyote, badger, or roadrunner); one ranked variable, most intense to least intense: **Kruskal-Wallis test**.

**12.** Two nominal variables, type of fungicide, mustard plant sprouted or didn't; total sample size less than 1000: **Fisher's exact test**. Because you know that you planted 50 seeds in each plot, "the number of mustard plant in each plot" gives you a nominal variable; if you plant 50 seeds and see 36 plants two weeks later, you know that you had 36 plants that sprouted and 14 that didn't.

**13.** **After your significant one-way anova, you should do the Tukey-Kramer test, to see which pairs of cat types are significantly different in purring time.** Note that you should have checked the assumptions of normality and homoscedasticity *before* doing the one-way anova, so you got points off for saying that should come next. And you should graph your data and publish the results *after* you've done all the statistics.

**14.** **In 50 experiments testing null hypotheses that are all really true, you will expect about 2.5 experiments to give you significant (P<0.05) results. If you are gullible, you will conclude that no-touch reiki (which really is a thing; Google it) works on a few diseases. If you are smart, you will conclude that a few significant P-values are false positives and require a much smaller P-value to convince you that no-touch reiki works.**

When I said "based on the results of your statistical tests," I expected you to know that you'd get a couple of P-values less than 0.05, just by chance. So you got points off if you said you'd conclude that no-touch reiki doesn't work because none of the results would be significant. Sample size has nothing to do with it; you'll get P<0.05 about 5 percent of the time, no matter what the sample size, so you got points off for saying that the sample sizes were too small. And you got points off for saying you should do Tukey-Kramer, which isn't even appropriate for this experiment (you've done 50 statistical tests, not one big one-way anova).
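The arithmetic behind the 2.5 figure can be checked with a short Python sketch. Two things here are my assumptions rather than part of the question: the probability of at least one false positive treats the 50 tests as independent, and the Bonferroni threshold is just one common way to "require a much smaller P-value," not necessarily the only acceptable one.

```python
n_tests = 50  # number of experiments, from the question
alpha = 0.05  # significance level

# Expected number of significant results when every null hypothesis is true.
expected_false_positives = n_tests * alpha

# Chance of at least one false positive, ASSUMING the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Bonferroni-corrected per-test threshold (one possible stricter criterion).
bonferroni_alpha = alpha / n_tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive): {p_at_least_one:.3f}")
print(f"Bonferroni threshold: {bonferroni_alpha:.4f}")
```

None of these quantities depends on the sample size of the individual experiments, which is why "the sample sizes were too small" is not a relevant objection.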

**15.** **alpha (significance level): get this from the literature in your field (almost always 0.05)**

**power or beta: get this from the literature in your field, or just pick a nice round number**

**effect size: pick the minimum effect size you hope to detect based on the effect sizes other people have found for similar experiments, or on some criterion for what would be an uninterestingly small effect, or out of your butt**

**standard deviation: either from prior literature or a pilot experiment**
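To show how these four ingredients combine, here is a minimal sketch of a sample-size calculation for comparing two group means. It uses the usual normal approximation, n per group = 2((z for alpha + z for power) × SD / effect size)², which slightly underestimates what a t-based calculation would give. The function name `n_per_group` and all the input values are hypothetical, chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(alpha, power, effect_size, sd):
    """Approximate sample size per group for a two-tailed, two-sample
    comparison of means, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-tailed alpha
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / effect_size) ** 2)

# Hypothetical inputs: alpha 0.05, power 80%, detect a difference of 5 units
# when the standard deviation is 10 units.
n = n_per_group(alpha=0.05, power=0.80, effect_size=5.0, sd=10.0)
print(f"about {n} observations per group")
```

Halving the effect size you want to detect roughly quadruples the required sample size, which is why the choice of minimum effect size matters so much.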