Dear All,
I am quite new to R and am having a problem running a linear model with
random effects, and separately a regression. In both cases R reports that
my variable lengths differ, and the models refuse to compute any further.
The code I have been using is as follows:
vc<-read.table("P:\\R\\Testvcomp10.txt",header=T)
attach(vc)
family<-factor(family)
colms<-(vc)[,4:13] ## assign the 10 columns containing marker data to a new
## variable, as the column names are not in any recognisable sequence
vcdf<-data.frame(family,peg.no,ec.length,syll.length,colms)
library(lme4)
for (c in levels(family))
{ for (i in 1:length(colms))
  { fit<-lmer(peg.no~1 + (1|c/i), vcdf)
  }
  summ<-summary(fit)
  av<-anova(fit)
  print(summ)
  print(av)
}
This gives me:
Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 + :
variable lengths differ (found for 'c')
I posted a similar message on the R mixed model list a few days ago about
my basic approach, and Jerome Goudet kindly referred me to an alternative:
obtain residuals from a random effects model in lmer(), and then run
regressions with those residuals as the dependent variable and each of my
marker data columns as the independent variable.
The code for that is as follows:
vc<-read.table("P:\\R\\Text Files\\Testvcomp10.txt",header=T,sep="",dec=".",na.strings=NA,strip.white=T)
attach(vc)
family<-factor(family)
colms<-(vc)[,4:13]
names(vc)
 [1] "male.parent"  "family"       "offspring.id" "P1L55"        "P1L73"
 [6] "P1L74"        "P1L77"        "P1L91"        "P1L96"        "P1L98"
[11] "P1L100"       "P1L114"       "P1L118"       "peg.no"       "ec.length"
[16] "syll.length"
vcdf<-data.frame(family, colms, peg.no, ec.length, syll.length)
library(lme4)
famfit<-lmer(peg.no~1 + (1|family), na.action=na.omit, vcdf)
resfam<-residuals(famfit)
for( i in 1:length(colms))
{
  print ("Marker", i)
  regfam<-abline(lm(resfam~i))
  print(regfam)
}
This again gives me the error:
[1] "Marker"
Error in model.frame.default(formula = resfam ~ i, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'i')
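
The same message can be reproduced on made-up data, which suggests (as an assumption, not a confirmed diagnosis of the real file) that inside the loop 'i' is just the number 1, 2, ... and not the i-th marker column. A minimal sketch follows; 'toy', 'res', 'm1' and 'm2' are hypothetical names standing in for the real data and residuals:

```r
# Toy data (hypothetical; not the real marker file)
set.seed(1)
toy <- data.frame(m1 = c(0, 1, 2, 1, 0, 2),
                  m2 = c(1, 1, 0, 2, 2, 0))
res <- rnorm(nrow(toy))   # stand-in for residuals from a fitted model

i <- 1
# Regressing on the loop index itself: 'i' has length 1, 'res' has length 6,
# so model.frame() stops with an error; tryCatch() captures its message.
err <- tryCatch(lm(res ~ i), error = function(e) conditionMessage(e))
print(err)

# Regressing on the column that the index refers to: lengths now agree.
fits <- lapply(names(toy), function(nm) lm(res ~ toy[[nm]]))
names(fits) <- names(toy)
print(sapply(fits, function(f) length(residuals(f))))
```

Looping over the column names rather than over 1:length(colms) also makes it easy to label each fit with its marker.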
My variables all have missing values somewhere. The missing values are not
consistent across individuals, i.e. some individuals have some values
missing and others have different ones. As much as I have tried to use
na.action=na.omit and na.rm=T, the differing variable length problem
persists.
I also tried to isolate the residuals, save them in a new variable (called
'resfam' here), and add that variable to the data frame 'vcdf' that I had
created earlier.
The problem was that when the residuals were computed, lmer() ignored the
missing data in 'peg.no' with respect to 'family', which is obviously not
the same data that is missing for another variable, e.g. 'ec.length',
leading again to an inconsistency in variable lengths. data.frame() would
then not accept the residuals as an addition to the existing set.
This was fairly obvious right from the start, but I decided to try it
anyway. It didn't work.
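
The length mismatch described above can be illustrated with base R's lm() on made-up data. This is only a sketch: 'toy', 'y' and 'g' are hypothetical, and whether lmer() pads residuals in the same way under na.action=na.exclude is an assumption that depends on the lme4 version installed.

```r
# Toy data with two missing responses (hypothetical values)
toy <- data.frame(y = c(1.2, NA, 3.1, 2.8, NA, 4.0),
                  g = c(1, 2, 3, 4, 5, 6))

# na.omit drops incomplete rows, so the residual vector is shorter than the data
fit_omit <- lm(y ~ g, data = toy, na.action = na.omit)
n_omit <- length(residuals(fit_omit))

# na.exclude also drops them for fitting, but residuals() pads the result
# with NA so that it lines up with the original rows
fit_excl <- lm(y ~ g, data = toy, na.action = na.exclude)
n_excl <- length(residuals(fit_excl))

print(c(rows = nrow(toy), na.omit = n_omit, na.exclude = n_excl))

# The padded residuals can be stored next to the original columns
toy$res <- residuals(fit_excl)
print(toy)
```

With the padded version, each later regression can drop its own incomplete rows without the residual column falling out of step with the markers.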
I apologise if the solution to working with these different variable
lengths is obvious; I don't know R that well at all.
My data files can be downloaded at the following location:
<http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616173be71ab> (Excel .xlsx file)
<http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616174a76e9e> (.txt file)
Any pointers would be greatly appreciated, as this is holding me up loads.
Thanks a ton for your help,
Aditi
----------------------
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol