Biological Data Analysis: Homework 9

Due Thursday, Nov. 29



You must type this and all other homework assignments. Do not e-mail the assignment to me. If you know in advance that you will be absent, turn it in early (at 322 Wolf); if you miss class unexpectedly, turn it in late.

1. Arnbom et al. (1994) weighed female southern elephant seals, Mirounga leonina, and recorded the sex of their offspring. The table below shows the weight (rounded to the nearest 20 kg), the number of male pups of mothers in that weight class, and the number of female pups of mothers in that weight class. Perform a logistic regression using SAS, interpret the results, and graph the proportion of males vs. mother's weight. You must print your SAS program and the SAS output.


weight  males  females
300       0       3
340       0       1
360       0       1
380       1       8
400       4       2
420       5       7
440       5       5
460       7       5
480       5       7
500       5       8
520       3       4
540       4       4
560       2       3
580       4       4
600       3       2
620       6       5
640       2       4
660       3       2
680       1       0
700       3       1
720       2       0
740       2       0
760       0       1
780       2       0
800       0       1
820       1       2
980       1       0
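
The assignment requires SAS, so nothing here replaces your SAS program and output. As a rough cross-check only, the following is a minimal sketch in Python (an assumption; the course uses SAS) of what the logistic regression is doing: it fits logit(P(male)) = intercept + slope * weight to the grouped counts in the table by Newton-Raphson. The names SEALS, fit_logistic, and predicted_proportion, and the choice to center and rescale weight before fitting, are illustrative and not part of the assignment.

```python
import math

# (mother's weight in kg, male pups, female pups), copied from the table above
SEALS = [
    (300, 0, 3), (340, 0, 1), (360, 0, 1), (380, 1, 8), (400, 4, 2),
    (420, 5, 7), (440, 5, 5), (460, 7, 5), (480, 5, 7), (500, 5, 8),
    (520, 3, 4), (540, 4, 4), (560, 2, 3), (580, 4, 4), (600, 3, 2),
    (620, 6, 5), (640, 2, 4), (660, 3, 2), (680, 1, 0), (700, 3, 1),
    (720, 2, 0), (740, 2, 0), (760, 0, 1), (780, 2, 0), (800, 0, 1),
    (820, 1, 2), (980, 1, 0),
]


def fit_logistic(rows, iterations=25):
    """Fit logit(P(male)) = intercept + slope * weight on grouped counts."""
    # Center and rescale weight so the Newton steps are well conditioned.
    total = sum(m + f for _, m, f in rows)
    mean_w = sum(w * (m + f) for w, m, f in rows) / total
    a = b = 0.0  # intercept and slope on the centered, per-100-kg scale
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for w, m, f in rows:
            x = (w - mean_w) / 100.0
            n = m + f
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = m - n * p        # observed minus expected males
            v = n * p * (1.0 - p)    # binomial variance for this weight class
            g0 += resid
            g1 += resid * x
            h00 += v
            h01 += v * x
            h11 += v * x * x
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    slope = b / 100.0                  # back to per-kg units
    intercept = a - slope * mean_w     # undo the centering
    return intercept, slope


def predicted_proportion(intercept, slope, weight):
    """Predicted proportion of male pups for a mother of the given weight."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * weight)))
```

Calling `fit_logistic(SEALS)` returns the intercept and slope; passing them to `predicted_proportion` at each weight gives the fitted curve you could overlay on your graph of observed proportions. Your SAS estimates should agree closely with these if both fits are set up the same way.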

2. Download this spreadsheet. Hit the "F9" key (on a Windows computer) or the "command" and "=" keys (on a Mac) once to make it calculate a new set of random data for you. Then copy columns F through P (with the simulated data in them) and paste them to the same location, using the "Paste Special..." command and choosing "Values." This will keep the numbers from changing each time you do something. The spreadsheet simulates an experiment in which you have measured the expression level of 200 genes in five cancer patients and five healthy people. You are trying to find genes that have different expression levels between the two groups.

For each gene, calculate the mean for cancer patients, the mean for healthy people, and the P-value for a t-test comparing the two means. (Note: please enter these formulas in row 3, then copy and paste them into the remaining rows; do not retype them on each row!) Look through the P-values. How many genes have a P-value less than 0.05? If you didn't know anything about multiple comparisons, what would you conclude? (Do not print the whole spreadsheet, please; just describe your results.)

Consider the P-values using the Bonferroni correction for the significance level. What is your new significance level? Now how many genes are significant? What is your new conclusion?
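
The Bonferroni correction divides the overall significance level by the number of tests, and a test counts as significant only if its P-value is below that corrected level. Here is a minimal sketch of the arithmetic in Python (an assumption; you can do the same thing in the spreadsheet). The function names and the ten P-values are made up for illustration and are not your data.

```python
def bonferroni_level(alpha, n_tests):
    """Corrected per-test significance level: alpha divided by the number of tests."""
    return alpha / n_tests


def count_significant(p_values, level):
    """Number of P-values strictly below the given significance level."""
    return sum(1 for p in p_values if p < level)


# Hypothetical P-values from ten tests (illustration only).
example_p = [0.205, 0.001, 0.042, 0.074, 0.008, 0.216, 0.039, 0.065, 0.041, 0.212]

uncorrected = count_significant(example_p, 0.05)
corrected = count_significant(example_p, bonferroni_level(0.05, len(example_p)))
print(uncorrected, corrected)
```

In this made-up example the corrected level is 0.05 / 10 = 0.005, so five P-values are below 0.05 but only one survives the correction.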

Finally, analyze the P-values using the Benjamini-Hochberg procedure, with a false discovery rate of 10 percent. Now how many genes are significant? Now what do you conclude?
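
The Benjamini-Hochberg procedure ranks the P-values from smallest to largest, compares each one to (rank / number of tests) * false discovery rate, finds the largest P-value that is at or below its own critical value, and declares that test and every test with a smaller P-value significant. Here is a minimal sketch in Python (an assumption; the spreadsheet works just as well). The function name and the ten P-values are made up for illustration and are not your data.

```python
def benjamini_hochberg(p_values, fdr):
    """Return a list of booleans, True where the test is significant
    under the Benjamini-Hochberg procedure at the given false discovery rate."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose P-value is at or below (rank / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Everything ranked at or above the cutoff is significant,
    # even if it exceeded its own critical value.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


# Hypothetical P-values from ten tests (illustration only).
example_p = [0.205, 0.001, 0.042, 0.074, 0.008, 0.216, 0.039, 0.065, 0.041, 0.212]
flags = benjamini_hochberg(example_p, 0.10)
print(sum(flags))
```

In this made-up example, 0.042 is ranked fifth and is below its critical value of (5 / 10) * 0.10 = 0.05, so all five of the smallest P-values are significant, including 0.039 and 0.041, which are above their own critical values of 0.03 and 0.04.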

References

Arnbom, T., M.A. Fedak, and P. Rothery. 1994. Offspring sex ratio in relation to female size in southern elephant seals, Mirounga leonina. Behav. Ecol. Sociobiol. 35: 373-378.




This page was last revised November 27, 2012. Its URL is http://udel.edu/~mcdonald/stathw9.html