You must type this and all other homework assignments. Do not e-mail the assignment to me; turn it in early (at 322 Wolf) for a foreseeable absence, or turn it in late after an unexpected absence from class.
1. Collect some data that could be analyzed with correlation and regression. This could be published data; data you found on the web; data from your own research; or something you measure for this assignment. Don't just make up some numbers; the data must be real. The data must be biological, involving living or dead organisms in some way. You must have at least 20 observations.
Write a sentence or two explaining what the data set is and where you got it from. Then say what biological question the data could be used to answer. Turn in the raw data. Look at a graph of the data without any trendlines (you don't need to turn this one in) and consider your biological question, then say which method you think would be most appropriate: linear regression/correlation of the untransformed data, linear regression/correlation of the transformed data, polynomial regression, or Spearman's rank correlation.
Analyze the data using the technique you think is best, and interpret the result. Use both a spreadsheet and SAS. You must show your SAS program and the output from SAS (the .lst file). If you don't get an output from SAS, or if it is obviously wrong, you must show your .log file instead. You don't need to show the .log file if the SAS output shows the same results as the spreadsheet.
Then, just for practice, analyze the data using the other possible techniques (for the regression using transformed data, use the log transformation unless you have a reason to use something else). Give the equation of the regression line, the r², the degrees of freedom, and the P-value for each technique (except Spearman's rank correlation, for which you won't give an equation of a line).
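If you want a rough cross-check on the numbers you get from the spreadsheet and SAS, here is a minimal Python sketch of three of the calculations: linear regression on the untransformed data, linear regression on log-transformed data, and Spearman's rank correlation. It is not a substitute for the spreadsheet and SAS work required above. The function names (`linear_fit`, `ranks`, `spearman_rho`) are made up for this illustration, and the six data points are placeholders, not real data; your own data set must be real and have at least 20 observations. The sketch leaves out polynomial regression and the P-values, which need the t-distribution.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = intercept + slope*x.

    Returns (slope, intercept, r2, df), with df = n - 2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2, n - 2

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    slope, _, r2, _ = linear_fit(ranks(x), ranks(y))
    return math.copysign(math.sqrt(r2), slope)

# Placeholder numbers for illustration only (NOT real data).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.1, 7.9, 16.2, 31.8, 64.5]

slope, intercept, r2, df = linear_fit(x, y)
print("untransformed:", slope, intercept, r2, df)

# Log-transform the Y variable, then fit a line to the transformed values.
log_y = [math.log10(v) for v in y]
slope, intercept, r2, df = linear_fit(x, log_y)
print("log-transformed Y:", slope, intercept, r2, df)

print("Spearman's rho:", spearman_rho(x, y))
```

Comparing the r² from the untransformed and log-transformed fits is one quick way to see whether the transformation straightens out the relationship, but look at the graph too before deciding.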
Print a graph showing the regression line for the technique you think is most appropriate. If you thought Spearman's rank correlation was most appropriate, print the graph for linear regression instead. Compare the results of the four techniques. Do you still think your original choice was the best way to analyze your data?
This page was last revised November 7, 2014. Its URL is http://udel.edu/~mcdonald/stathw9.html