This is the first part of the study guide for the final exam in Biological Data Analysis, fall 2014. There are also three sets of practice questions: practice exam 1, practice exam 2, and practice exam 3. I recommend that you spend some time studying, then try to answer the practice questions under test conditions (no book or notes, timed, in a room full of people who are eerily quiet).

The final will be on Thursday, Dec. 11, from 10:30 a.m. to 12:30 p.m. in 217 Gore Hall (the usual classroom). If you can't take the exam at that time, let me know as soon as possible so we can set up an alternate time. Having other exams on the same day is NOT a legitimate reason for changing the day that you take the exam.

You may not use your notes or textbook during the exam; if English is your second language, you may use a dictionary. You will not need a calculator.

The exam will consist of 36 questions, each worth 1.25 points. About 30 will be of the format you've seen on the previous exams: I will describe a data set, and you will say what the best statistical test to use would be. Your answers must be specific. For chi-squared or G-tests, you must specify whether it is a goodness-of-fit test or a test of independence. For t-tests, you must specify one-sample or two-sample. For anovas, you must specify one-way, two-way, or nested. **If the answer is a two-way anova, you must specify with or without replication**. For regression, you must specify linear regression, polynomial regression, multiple regression, simple logistic regression, or multiple logistic regression. If there are two equally appropriate tests (such as G-test or chi-squared test, t-test or one-way anova), you must put down only one. For the purposes of this exam, "correlation" and "linear regression" are considered equivalent; you may write down one or the other, or write "correlation/linear regression."

Unless something in the question makes it clear that an assumption is violated, you should assume that all data meet the parametric assumptions (normality and homoscedasticity), that all correlations/regressions are linear, and that the observations are independent.

About 6 questions will be on other material. You should know the assumptions of the different tests, how you tell whether those assumptions are met, and what to do if they're not met. You should be familiar with the different descriptive statistics, what they mean and what they're useful for. You should understand data transformation, multiple comparisons, and meta-analysis. You should be able to interpret the results of the different tests; for example, you should be able to explain what a significant interaction term means in a two-way anova.
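As a reminder of what an interaction term captures, here is a minimal Python sketch. The factor names and cell means are made up purely for illustration (they are not from any exam question or real data set). An interaction means the effect of one factor depends on the level of the other, so the sketch compares the diet effect within each genotype.

```python
# Hypothetical cell means for a 2x2 design (made-up numbers, illustration only).
# Factor A: genotype (wild type, mutant); factor B: diet (control, high fat).
cell_means = {
    ("wild type", "control"): 10.0,
    ("wild type", "high fat"): 12.0,
    ("mutant", "control"): 10.0,
    ("mutant", "high fat"): 18.0,
}


def diet_effect(genotype):
    """Difference in mean response between diets within one genotype."""
    return cell_means[(genotype, "high fat")] - cell_means[(genotype, "control")]


effect_wt = diet_effect("wild type")   # diet effect in wild type
effect_mut = diet_effect("mutant")     # diet effect in mutant

# If there were no interaction, the two diet effects would be equal
# and this contrast would be zero.
interaction_contrast = effect_mut - effect_wt

print(effect_wt, effect_mut, interaction_contrast)
```

In these made-up numbers the diet effect is larger in the mutant than in the wild type, which is the pattern an interaction describes. The sketch only computes the contrast; it does not test it. A two-way anova with replication is what tells you whether a difference like this is larger than expected by chance.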

