[R] permutation approach to know effect of one factor (factor A) to another (factor B) in each level of factor A (although it is slow)

Kazunari Nozue Fri, 24 Feb 2012 10:26:58 -0800

Hi all,
 
At first I will explain my linear mixed effects model;
lme1 <- lme(leaf_length ~ 
Treatment*Genotype*leaf,random=~Treatment|Set,data=data)


or
lmer1 <- lmer(leaf~Treatment*Genotype*leaf+(Treatment|Set),data=data)

 
Treatment factor has two conditions: mock (=Mo) and treatment(=Tr)
Leaf is position of leaf: 3rd leaf, 4th leaf, ....
Genotype is type of plant: wt, mut1, mut2, ....., mut72
Set is experimental replicates: A to J.
I would like to ask how to compare leaf length under condition A and one under 
condition B in each Genotype, i.e. know effect of one factor (factor A, 
“Genotype”) to another (factor B, “Treatment’) in each level of factor A (eg. 
“wt”, “mut1”, “mut2”, ...).
Since my data is unbalanced data, I could not use TukeyHSD() multiple 
comparison (see its help).
My understanding is that pvalue in summary(lme1)$tTable is controversial and 
pvals.fnc(lmer1) # in languageR package
gave me an error (due to (Treatment|Set) because (1|Set) did not gave me an 
error, but (Treatment\SET) is essential in this model.).

As seen below, I would like to know my permutation method is OK or not.
 
I started from a simple permutation approach from Maindonald and Braun (2003) 
"Data Analysis and Graphics Using R." pg 98. (see scripts below) and I combined 
my mixed model and this approach with simulated data (omitting leaf factor to 
simplify, see scripts below). Disadvantage of this method is running time could 
be long in large data sets with many permutation (eg. 10000), I would like to 
know my method is OK. If I could have better methods (probably by using 
multcomp package, such as glht() function), I would really appreciate them.

Sorry for long message.
 
Thank you,
 
Kazu
##############################
### Maindonald nad Braun pg. 98
##############################
library(DAAG)
data(two65) # from DAAG package
x1 <- two65$ambient;x2<-two65$heated;x<-c(x1,x2)
n1<-length(x1);n2<-length(x2);n<-n1+n2
dbar<-mean(x2) - mean(x1)
z<-array(,2000)

for(i in 1:2000) {
mn<-sample(n,n2,replace=FALSE)
dbardash<-mean(x[mn])-mean(x[-mn])
z[i]<-dbardash
}

pval<-(sum(z > abs(dbar)) + sum(z< -abs(dbar)))/2000
plot(density(z),yaxs="i")
abline(v=dbar)
abline(v=-dbar,lty=2)
###############################
### my example with unbalanced data
###############################
library(lme4)
#simulate data. leaf length in Tr is longer than Mo. wt shows more dramatic 
response to Tr than mut. setA plants were longer than setB plants (set factor 
is significant in this model).
set.seed(1234)
data <- 
data.frame(Treatment=rep(c("Tr","Mo"),c(9,11)),leaf=c(rnorm(9,11),rnorm(11,10)),Set=rep(c("A","B"),times=10))
data$Genotype <- factor(rep(c("mut","wt"),each=5,length.out=20))
#add setA specific effect for shade and then for sun
data$leaf[data$Treatment=="Tr" & data$Set=="A"] <- 
data$leaf[data$Treatment=="Tr" & data$Set=="A"] + 
rnorm(length(data$leaf[data$Treatment=="Tr" & data$Set=="A"]),1)
data$leaf[data$Treatment=="Tr" & data$Genotype=="mut"] <- 
data$leaf[data$Treatment=="Tr" & data$Genotype=="mut"] + 
rnorm(length(data$leaf[data$Treatment=="Tr" & data$Genotype=="mut"]),-0.5)
data$leaf[data$Treatment=="Mo" & data$Set=="A"] <- 
data$leaf[data$Treatment=="Mo" & data$Set=="A"] + 
rnorm(length(data$leaf[data$Treatment=="Mo" & data$Set=="A"]),-0.25)
data
# mean from observed data
mean.table.obs<-tapply(data$leaf, list(data$Treatment,data$Genotype),mean)
# permutation
mean.table.PER<-list()
nreps<-1000
z<-list() # mean differences

for(i in 1:nreps) {
        new.data<-data.frame()
        
new.wt<-sample(data[data$Genotype=="wt",]$leaf,sum(data$Genotype=="wt",na.rm=TRUE),replace=FALSE)
 # null hypothesis: leaf in Mo = Tr in wt
        
new.mut<-sample(data[data$Genotype=="mut",]$leaf,sum(data$Genotype=="mut",na.rm=TRUE),replace=FALSE)
        
new.data<-data.frame(leaf=c(new.wt,new.mut),Genotype=data$Genotype,Treatment=data$Treatment,Set=data$Set)
        # mixed effect model
        lmer.temp<- lmer(leaf~Treatment*Genotype+(Treatment|Set),data=new.data)
        #calculate mean of each group
        mean.table.PER[[i]]<-tapply(fitted(lmer.temp), 
list(new.data$Treatment,new.data$Genotype),mean)
        z[[i]]<-mean.table.PER[[i]][2,] - mean.table.PER[[i]][1,]
}

# calculate p value
TF1<-list()
TF2<-list()
p.value<-vector()

for(i in 1:length(levels(data$Genotype))) {
        for(n in 1:length(z)) {
             TF1[[n]]<-          (z[[n]][i] > abs(mean.table.obs[2,i] - 
mean.table.obs[1,i]))*1
             TF2[[n]]<-      (z[[n]][i] < -abs(mean.table.obs[2,i] - 
mean.table.obs[1,i]))*1
          }    
        p.value[i]<-(sum(as.numeric(TF1) + as.numeric(TF2)))/length(z)
}

names(p.value)<-levels(data$Genotype)
p.value
p.adjust<-p.adjust(p=p.value,method=”fdr”)
p.adjust
# graph
par(mfcol=c(2,1))
hist(unlist(z)[names(unlist(z))=="mut"],breaks=seq(-5,5,0.2))
abline(v=abs(mean.table.obs[2,1] - mean.table.obs[1,1]))
abline(v=-abs(mean.table.obs[2,1] - mean.table.obs[1,1]))

hist(unlist(z)[names(unlist(z))=="wt"],breaks=seq(-5,5,0.2))
abline(v=abs(mean.table.obs[2,2] - mean.table.obs[1,2]))
abline(v=-abs(mean.table.obs[2,2] - mean.table.obs[1,2]))
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] permutation approach to know effect of one factor (factor A) to another (factor B) in each level of factor A (although it is slow)

Reply via email to