Hi,

Thanks for the help. What I actually ended up doing was writing a couple of for loops, and I ended up with something that works.

Thanks,
Chris
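For the archive, here is a minimal loop-free sketch of one way to do this. It is not necessarily what either poster used. It assumes "20% of the data" means 20% of all cells in the data frame (4 of 20 here, which matches the examples in the original question), and `drop_cells` is a hypothetical helper name, not a function from base R or any package.

```r
set.seed(6245)
data1 <- round(data.frame(x1 = rnorm(5), x2 = rnorm(5),
                          x3 = rnorm(5), x4 = rnorm(5)), digits = 3)

# Hypothetical helper (not from the thread): set a fixed share of ALL cells
# to NA, drawing the cells only from the columns named in `cols`.
drop_cells <- function(df, cols, prop = 0.2) {
  n_drop <- round(prop * nrow(df) * ncol(df))  # e.g. 20% of 20 cells = 4
  n_pool <- nrow(df) * length(cols)            # cells eligible for dropping
  stopifnot(n_drop <= n_pool)                  # cannot drop more than the pool
  m <- as.matrix(df[cols])                     # eligible block as a matrix
  m[sample.int(n_pool, n_drop)] <- NA          # pick cells by linear index
  df[cols] <- as.data.frame(m)                 # write the block back
  df
}

data2 <- drop_cells(data1, c("x3", "x4"))
sum(is.na(data2))  # 4 cells are NA, all of them in x3 or x4
```

Sampling cell positions without replacement, rather than matching on sampled values, gives exactly the requested number of NAs on every run, even when the columns contain duplicate values.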
On Fri, Aug 16, 2013 at 4:34 PM, arun <smartpink...@yahoo.com> wrote:

> Hi,
> May be this helps:
> #data1 (changed `data` to `data1`)
> set.seed(6245)
> data1 <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
> data1 <- round(data1,digits=3)
>
> data2 <- data1
>
> data1[,3:4] <- lapply(data1[,3:4], function(x) {
>   x1 <- match(x, sample(unlist(data1[,3:4]),
>                         round(0.8*length(unlist(data1[,3:4])))))
>   x[is.na(x1)] <- NA
>   x
> })
> data1
> #      x1     x2     x3     x4
> #1  0.482  1.320     NA -0.142
> #2 -0.753 -0.041 -0.063  0.886
> #3  0.028 -0.256 -0.069  0.354
> #4 -0.086  0.475  0.244  0.781
> #5  0.690 -0.181  1.274  1.633
>
>
> #or
> data2[,3:4] <- lapply(data2[,3:4], function(x) {
>   x1 <- match(x, sample(unlist(data2[,3:4]),
>                         round(0.8*length(unlist(data2[,3:4])))))
>   x[is.na(x1)] <- NA
>   x
> })
> data2
> #      x1     x2     x3     x4
> #1  0.482  1.320 -0.859 -0.142
> #2 -0.753 -0.041     NA     NA
> #3  0.028 -0.256 -0.069  0.354
> #4 -0.086  0.475  0.244  0.781
> #5  0.690 -0.181  1.274  1.633
> A.K.
>
>
>
> ----- Original Message -----
> From: Christopher Desjardins <cddesjard...@gmail.com>
> To: "r-help@r-project.org" <r-help@r-project.org>
> Cc:
> Sent: Friday, August 16, 2013 3:02 PM
> Subject: [R] Randomly drop a percent of data from a data.frame
>
> Hi,
> I have the following data.
>
> > set.seed(6245)
> > data <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
> > round(data,digits=3)
>       x1     x2     x3     x4
> 1  0.482  1.320 -0.859 -0.142
> 2 -0.753 -0.041 -0.063  0.886
> 3  0.028 -0.256 -0.069  0.354
> 4 -0.086  0.475  0.244  0.781
> 5  0.690 -0.181  1.274  1.633
>
> What I would like to do is drop 20% of the data. But I want this 20% to
> only come from dropping data from x3 and x4. It doesn't have to be evenly,
> i.e. I don't care to drop 2 from x3 and 2 from x4 or make sure only one
> observation has missing data on only one variable. I just want to drop 20%
> of the data through x3 and x4 only.
> In other words,
>
>       x1     x2     x3     x4
> 1  0.482  1.320 -0.859     NA
> 2 -0.753 -0.041 -0.063  0.886
> 3  0.028 -0.256     NA  0.354
> 4 -0.086  0.475     NA  0.781
> 5  0.690 -0.181     NA  1.633
>
> OR
>
>       x1     x2     x3     x4
> 1  0.482  1.320     NA -0.142
> 2 -0.753 -0.041 -0.063  0.886
> 3  0.028 -0.256     NA     NA
> 4 -0.086  0.475  0.244     NA
> 5  0.690 -0.181  1.274  1.633
>
> OR
>
>       x1     x2     x3     x4
> 1  0.482  1.320 -0.859 -0.142
> 2 -0.753 -0.041 -0.063     NA
> 3  0.028 -0.256 -0.069     NA
> 4 -0.086  0.475  0.244     NA
> 5  0.690 -0.181  1.274     NA
>
> ETC. are all fine.
>
> Any ideas how I can do this?
> Chris
>
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.