Hi,

This is exactly what fortune("surgery") is about.

Anyway, you can save yourself a lot of headaches if you start using
lists for your objects. Lists can be used easily in loops:

for (i in seq_len(n)) {
  # use [[ ]] so that you work with the list elements themselves
  some.list[[i]] <- some.function(some.other.list[[i]])
}

The lapply/sapply functions can also be useful:

sapply(sp1.loc1, scale)

will give you the scaled columns as a matrix, provided all columns of
sp1.loc1 are numeric.

Regards
Petr

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Gustavo Vieira
> Sent: Thursday, February 28, 2013 10:53 AM
> To: r-help@r-project.org
> Subject: [R] help for an R automated procedures
>
>
> Dear all, I would like to post the following question to the r-help on
> Nabble (thanks in advance for the attention, Gustavo Vieira):
> Hi there.
> I have a data set on hand with 5,220 cases and I'd like to automate
> some procedures (but I have almost no programming knowledge). The data
> have some continuous variables that are grouped by 2 others: the name
> of the species and the locality where they were collected. So, the
> samples are defined as 'each species at each locality'. For every
> sample I'd like to do multiple imputation (when applicable), test for
> the presence of outliers, standardize the variables, correct some
> species abundances, save individual samples to a tab-delimited text
> file, and assemble each individual sample (now without NAs and
> outliers, with corrected abundances and the new standardized
> variables) into a single data set. That task is pretty complex for me,
> since my programming knowledge is poor (and my free time to learn R
> programming is scarce). Could someone help me with that (I could
> provide you the data set and the script I have written to do that,
> sample by sample [ouch!])?
> Thanks in advance for your attention and all the best
> (g...@hotmail.com).
>
> [Below is an example of the code I've used to accomplish my goals,
> sample by sample, which illustrates the complexity of the procedures:
>
> #Subsetting the data (v1-v11 are continuous "predictors"): species 1 at
> #locality 1 (all data [5520 cases] are in a data frame called 'morfo')
> #getting only the observations of sp1 (species 1) at loc1 (locality 1)
> sp1.loc1<-morfo[which(spps=="sp1" & taxoc=="loc1"),]
> #abundance -> 19 cases and the abundance variable ('abund') says 18...
> str(sp1.loc1)
> sp1.loc1$abund<-rep(19,19)
> #missing values present; abundance for sp1 at loc1 corrected
> summary(sp1.loc1)
> attach(sp1.loc1)
>
> #Dealing with NAs:
> install.packages("mice", dependencies = T) #ok (R at: home & work)
> library(mice)
> imp <- mice(sp1.loc1)
> sp1.loc1 <- complete(imp)
> summary(sp1.loc1) #just checking... No more NAs!
> attach(sp1.loc1)
>
>
> #Detecting univariate outliers
> z.crit <- qnorm(0.9999)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v1)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v2)) > z.crit)
> morfo[47,6]
> #the nearest observation close to 32.00 is 25.10
> sort(v2[taxoc=="loc1"])
> sp1.loc1[,6][sp1.loc1[,6]==32.00]<-25.10
> #Rechecking for outliers (now, it's ok)
> subset(sp1.loc1, select = id, subset = abs(scale(v2)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v3)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v4)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v5)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v6)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v7)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v8)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v9)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v10)) > z.crit)
>
> subset(sp1.loc1, select = id, subset = abs(scale(v11)) > z.crit)
>
> #Standardizing variables
> v1.std<-with(sp1.loc1,(scale(v1)))
> v1.pad<-v1.std[,1]
>
> v2.std<-with(sp1.loc1,(scale(v2)))
> v2.pad<-v2.std[,1]
>
> v3.std<-with(sp1.loc1,(scale(v3)))
> v3.pad<-v3.std[,1]
>
> v4.std<-with(sp1.loc1,(scale(v4)))
> v4.pad<-v4.std[,1]
>
> v5.std<-with(sp1.loc1,(scale(v5)))
> v5.pad<-v5.std[,1]
>
> v6.std<-with(sp1.loc1,(scale(v6)))
> v6.pad<-v6.std[,1]
>
> v7.std<-with(sp1.loc1,(scale(v7)))
> v7.pad<-v7.std[,1]
>
> v8.std<-with(sp1.loc1,(scale(v8)))
> v8.pad<-v8.std[,1]
>
> v9.std<-with(sp1.loc1,(scale(v9)))
> v9.pad<-v9.std[,1]
>
> v10.std<-with(sp1.loc1,(scale(v10)))
> v10.pad<-v10.std[,1]
>
> v11.std<-with(sp1.loc1,(scale(v11)))
> v11.pad<-v11.std[,1]
>
>
> #Joining the new standardized variables to the sp1.loc1 data set
>
> sp1.loc1<-data.frame(sp1.loc1,v1.pad,v2.pad,v3.pad,v4.pad,v5.pad,v6.pad,
>                      v7.pad,v8.pad,v9.pad,v10.pad,v11.pad)
>
> attach(sp1.loc1)
>
> write.table(sp1.loc1,"sp1.at.loc1.txt",quote=F,row.names=F,
>             col.names=T,sep="\t")
>
> detach(sp1.loc1)
>
> #Subsetting the data (v1-v11 are continuous "predictors"): species 2 at
> #locality 1...]
> --
> "Time will tell"
> --
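
For what it is worth, the list-based approach suggested in the reply
could look roughly like the sketch below. It is only an illustration
under several assumptions, not a drop-in replacement for the quoted
script: 'morfo' here is a small made-up data frame, 'spps' and 'taxoc'
are assumed to be columns of it, only v1 and v2 are used instead of
v1-v11, and process.sample() is a hypothetical helper name. Outliers
are only flagged, not replaced, and the imputation step is left out so
that the sketch runs with base R alone; on the real data the
mice()/complete() calls from the quoted script could go at the top of
the helper.

```r
# Toy stand-in for 'morfo' (assumption: spps and taxoc are columns of it).
set.seed(1)
morfo <- data.frame(
  id    = 1:12,
  spps  = rep(c("sp1", "sp2"), each = 6),
  taxoc = rep(c("loc1", "loc2"), times = 6),
  v1    = rnorm(12, mean = 10),
  v2    = rnorm(12, mean = 20)
)
vars <- c("v1", "v2")  # on the real data this might be paste0("v", 1:11)

# One list element per species-by-locality sample, named e.g. "sp1.loc1".
samples <- split(morfo, list(morfo$spps, morfo$taxoc), drop = TRUE)

# Hypothetical helper: the per-sample steps, written once.
process.sample <- function(d, vars, z.crit = qnorm(0.9999)) {
  d$abund <- nrow(d)                    # abundance = number of cases
  z <- scale(d[vars])                   # column-wise z-scores
  colnames(z) <- paste0(vars, ".pad")
  # Flag rows with any |z| above the cutoff; values are not modified.
  d$outlier <- rowSums(abs(z) > z.crit, na.rm = TRUE) > 0
  cbind(d, z)                           # append the standardized columns
}

processed <- lapply(samples, process.sample, vars = vars)

# Save each sample to its own tab-delimited file (here in a temp folder).
out.dir <- tempdir()
for (nm in names(processed)) {
  write.table(processed[[nm]], file.path(out.dir, paste0(nm, ".txt")),
              quote = FALSE, row.names = FALSE, sep = "\t")
}

# Assemble all processed samples into a single data set again.
morfo.new <- do.call(rbind, processed)
```

Because every sample goes through the same function, a copy-and-paste
slip in one of the eleven standardization blocks cannot happen, and a
new species or locality needs no new code.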