You must type this and all other homework assignments. Do not e-mail the assignment to me; turn it in early (at 322 Wolf) for a foreseeable absence, or turn it in late after an unexpected absence from class.
For many of the remaining homeworks, you're going to use the data set the class collected. I have converted heights to inches and converted shoe sizes to centimeters. I have combined the shoe types into a small number of categories; I included "socks" in the "bare feet" category, for example. And I have combined the least-favorite high-school classes into a small number of categories (I included "music" in the "art" category, for example).
1. Download the balance data set. Pick two of the nominal variables with two values (sex, folded arm on top, clasped thumb on top, shoes vs. barefoot, hate math vs. hate something else). Analyze the two nominal variables using all three tests of independence. Report the raw data (the counts of each combination) and the P-value for each test. Write one sentence in which you compare the three P-values, and one sentence in which you summarize the results of this experiment.
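As a rough illustration of question 1, the sketch below runs three tests of independence on a 2×2 table of counts. It assumes the three tests meant are the chi-square test, the G-test, and Fisher's exact test. The counts are made up (they are not the class data), the function names are hypothetical, and no continuity correction is applied. It uses only the Python standard library, which is enough here because a 2×2 table has one degree of freedom.

```python
import math

def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def p_from_chi2_1df(stat):
    """Upper-tail P-value of a chi-square statistic with 1 degree of freedom."""
    # max() guards against a tiny negative statistic from rounding error.
    return math.erfc(math.sqrt(max(stat, 0.0) / 2))

def chi_square_test(table):
    """Pearson chi-square test of independence for a 2x2 table (no correction)."""
    exp = expected_counts(table)
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))
    return stat, p_from_chi2_1df(stat)

def g_test(table):
    """G-test of independence for a 2x2 table (no correction)."""
    exp = expected_counts(table)
    stat = 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, exp)
                   for o, e in zip(obs_row, exp_row)
                   if o > 0)  # a cell with zero count contributes nothing
    return stat, p_from_chi2_1df(stat)

def fisher_exact_two_tailed(table):
    """Two-tailed Fisher's exact test: sum the probabilities of all tables
    (with the same margins) that are no more likely than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)
    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    return sum(p for p in probs if p <= p_obs * (1 + 1e-9))

# Made-up counts: rows could be two values of one nominal variable,
# columns two values of the other.
counts = [[12, 8],
          [6, 14]]

chi2_stat, chi2_p = chi_square_test(counts)
g_stat, g_p = g_test(counts)
fisher_p = fisher_exact_two_tailed(counts)

print(f"chi-square: statistic = {chi2_stat:.3f}, P = {chi2_p:.4f}")
print(f"G-test:     statistic = {g_stat:.3f}, P = {g_p:.4f}")
print(f"Fisher's exact (two-tailed): P = {fisher_p:.4f}")
```

Printing the three P-values side by side makes the comparison sentence easier to write; statistical software such as a spreadsheet or a stats package should give you the same kind of output.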
2. Pick one of the nominal variables with more than two values ("Shoes worn during balance test" or "Most disliked class"), and one of the nominal variables with two values. Do either a chi-square test of independence or a G-test of independence on the two variables, whichever you think is best. Report the raw data (the counts of each combination) and the P-value. Write one sentence in which you summarize the results of the experiment.
3. Pick one of the measurement variables (length of first name, length of last name, age, height, shoe size, or balance time). Then pick one of the nominal variables with more than two values ("Shoes worn during balance test" or "Most disliked class"). For each value of the nominal variable you've chosen, calculate the mean, standard deviation, standard error, and number of observations for the measurement variable you've chosen. For example, if you chose "length of last name" and "most disliked class," you'd calculate these summary statistics for everyone who disliked algebra, for everyone who disliked biology, and so on. Present the results in a nice table.
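The summary statistics in question 3 can be sketched as follows. This is only an illustration under stated assumptions: the observations are made up (not the class data), the `summarize` function is a hypothetical helper, the standard deviation is the sample standard deviation (n − 1 in the denominator), and the standard error is taken to be the standard deviation divided by the square root of the number of observations.

```python
import math
import statistics
from collections import defaultdict

# Made-up (nominal value, measurement) pairs; NOT the class data.
observations = [
    ("algebra", 5), ("algebra", 7), ("algebra", 6), ("algebra", 8),
    ("biology", 4), ("biology", 9), ("biology", 6),
    ("art", 7), ("art", 5),
]

def summarize(observations):
    """Return n, mean, SD, and SE of the measurement for each nominal value."""
    groups = defaultdict(list)
    for group, value in observations:
        groups[group].append(value)

    summary = {}
    for group, values in groups.items():
        n = len(values)
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD; needs at least 2 observations
        se = sd / math.sqrt(n)         # standard error of the mean
        summary[group] = {"n": n, "mean": mean, "sd": sd, "se": se}
    return summary

summary = summarize(observations)

# Print a plain-text table, one row per value of the nominal variable.
print(f"{'group':<10}{'n':>4}{'mean':>8}{'SD':>8}{'SE':>8}")
for group, s in summary.items():
    print(f"{group:<10}{s['n']:>4}{s['mean']:>8.2f}{s['sd']:>8.2f}{s['se']:>8.2f}")
```

The same means, standard deviations, and standard errors are what the bars and error bars in questions 4 and 5 display.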
4. Draw a computer-generated graph with vertical bars showing the mean values of the measurement variable you chose in question 3, with a separate bar for each value of the nominal variable. Add "error bars" showing plus-or-minus one standard deviation.
5. Draw the same graph as in question 4, except this one should have error bars showing plus-or-minus one standard error.
6. Look at the article you used in homework 1. If it has a graph in it with error bars, look at the figure caption. Do the error bars represent standard error, standard deviation, 95% confidence interval, or something else, or does the figure caption not say what the error bars are? If the paper you used for homework 1 doesn't have a graph with error bars, look at other papers in the same journal until you find one. Just give me the name of the journal and what the error bars you found represent.
This page was last revised Sept. 23, 2016. Its URL is http://udel.edu/~mcdonald/stathw4.html