Greetings,

I have "inherited" a cDNA macroarray dataset that is structured as follows. Three different stressors were tested. For each stressor, there are two treatments (control and stressed). For each treatment, there are two biological replicates, and these are paired (i.e., for a given colony, such as colony A, there is both a stressed array and a control array). For one of these samples, duplicate arrays were run (technical replicates). This works out to 18 different arrays corresponding to 12 independent biological replicates. Counting only the biological replicates, however, each stressor has just n=2 stressed arrays and n=2 control arrays.
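
To make the layout concrete, here is a rough sketch in R of the 12 biological samples as a design table. The stressor names and colony labels are placeholders, and I am assuming that different colonies were used for each stressor. The 6 remaining arrays (18 minus 12) are the technical duplicates and are left out of the table.

```r
# Sketch of the biological-replicate layout only.
# Stressor names and colony labels are hypothetical placeholders.
design <- expand.grid(
  treatment = c("control", "stressed"),
  colony    = c("A", "B"),
  stressor  = c("stressor1", "stressor2", "stressor3"),
  stringsAsFactors = FALSE
)

# Assumption: colonies are not shared between stressors, so the pairing
# ID combines stressor and colony.
design$pair <- paste(design$stressor, design$colony, sep = "_")

n_biological <- nrow(design)       # number of biological samples
n_technical  <- 18 - n_biological  # arrays that are technical duplicates

# Replication per stressor and treatment
table(design$stressor, design$treatment)
```

Each cell of the final table holds the count of biological replicates for one stressor and treatment combination, which is the n=2 described above.
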
I am pretty well versed in the analysis of array data using R, but this dataset obviously presents a real challenge because of the low replication. For logistical reasons, increasing the sample size is not a possibility. My main goal is to salvage whatever valid findings I can from the existing data, but I don't want to go too far in claiming significance for an expression pattern if there isn't really any statistical support for it.

My questions are:
(1) Is it even possible to statistically compare the effects of these stressors on gene expression?
(2) If so, what are folks' recommendations?
(3) Obviously low sample size means low statistical power, but I have always been told that calculating variance for n=2 and doing stats on that basis is not even mathematically valid. Can anyone confirm or refute this?

Thank you for any advice you may have to offer,

--
Eli Meyer
Postdoctoral Fellow
Department of Integrative Biology
University of Texas at Austin
Austin, TX 78712
office: (512) 475-6424
cell: (310) 618-4483