I'm a programmer in a biology lab who is starting to use R to automate some of the statistical analysis we use to determine growth rates. But I'm running into some problems as I re-code.

1) Hypotheses concerning slope similarity/difference

I'm using R's anova(lm()) methods to analyse a model which looks like this:

    growth.metric ~ time * test.tube

I understand that testing the interaction between time and tube (time:test.tube) will tell us if the growth rates (for the last three test tubes) are significantly different from one another (Ho = slopes are the same). The purpose of this test is to be certain that our cultures have fully acclimated to the treatment and aren't going to change much if we stop measuring. This is an important cost-saving practice too, as measurements can go on for years. Yet I'm worried that our null and alternative hypotheses should be swapped so that our test is more conservative (Ho = slopes are different, i.e. still acclimating). Is there a way to specify my model that flips these hypotheses? Should I be using a different method? Is this even appropriate?

2) Growth rate is confounded with variance of growth rate

I'm also worried about the fact that rates for cultures with faster growth are calculated using fewer data points (assuming similar sampling times between treatments). The result is that growth ~ var(growth), i.e. the growth rate and its variance are correlated. Not only does this put a wrinkle in my analysis between treatments, but it also biases the ANCOVA test above that we use to determine acclimation. Faster-growing cultures will usually pass the "no significant difference between slopes" test more easily because there are fewer points with which to reject Ho. Is there a way to control for this? Perhaps I could include the number of points in my model?

3) Statistical validity of using subsets of growth.metric measurements within a test tube

There are some lab members who insist that we can throw out the beginning and end of our log-transformed growth.metric measurements because they are outliers in determining maximum growth.
I've proposed looping through all possible combinations of 3 or more points within the growth curve and using the highest or best-fitting (best R-squared) slope. But this idea has been rejected by our PI as not being a valid thing to do. Ideas here?

Thank you.
Dave
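
P.S. To make questions 1 and 3 concrete, here is a minimal, self-contained sketch on simulated data rather than our lab measurements. The data frame name dat, the seed, the number of points per tube and the noise level are all placeholders chosen for illustration. In the second part I read "all possible combinations of 3 or more points" as contiguous windows of the growth curve, which is also an assumption.

```r
# Simulated data: every name and value here is made up for illustration.
set.seed(1)
n.tubes  <- 3   # the last three test tubes
n.points <- 6   # measurements per tube
time      <- rep(seq_len(n.points), times = n.tubes)
test.tube <- factor(rep(paste0("tube", seq_len(n.tubes)), each = n.points))
# Log-transformed growth metric: common true slope of 0.5 plus noise
growth.metric <- 1 + 0.5 * time + rnorm(n.tubes * n.points, sd = 0.1)
dat <- data.frame(growth.metric, time, test.tube)

# Question 1: ANCOVA; the time:test.tube row tests Ho = slopes are the same
fit <- lm(growth.metric ~ time * test.tube, data = dat)
tab <- anova(fit)
print(tab)
p.interaction <- tab["time:test.tube", "Pr(>F)"]

# Question 3: slope of every contiguous window of >= 3 points in one tube
one.tube <- subset(dat, test.tube == "tube1")
windows  <- expand.grid(start = seq_len(n.points), end = seq_len(n.points))
windows  <- windows[windows$end - windows$start >= 2, ]
windows$slope <- mapply(function(s, e) {
  coef(lm(growth.metric ~ time, data = one.tube[s:e, ]))[["time"]]
}, windows$start, windows$end)
best <- windows[which.max(windows$slope), ]   # window with the steepest slope
print(best)
```

The first part is the test as we currently run it; the second is the search our PI objected to, shown only so that the procedure in question is unambiguous.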