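Hi,

Maybe this helps:

dat1<-read.table(text="
   V1 V2
1   5 10
2   6  3
3   8  4
4   9 20
5  15 30
6  25 40
7   2  4
8   3  1
9   1  5
10  8 10
",header=TRUE)
# draw 70% of the rows at random, without replacement
dat2<-dat1[sample(NROW(dat1),NROW(dat1)*(1-0.3)),] #70% of data
# flag the sampled rows so they can be told apart after the merge
dat2$newcol<-TRUE
dat1$newcol1<-TRUE
# rows of dat1 missing from dat2 get NA in newcol (assumes no duplicated V1/V2 rows)
dat4<-merge(dat1,dat2,by=c("V1","V2"),all=TRUE)
dat5<-dat4[is.na(dat4$newcol),][,1:2] #remaining 30%
dat5
#  V1 V2
#2  2  4
#4  5 10
#8  9 20

A.K.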
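The same 70/30 split can probably also be done by sampling row positions and dropping them with a negative index, which avoids merge() and should still work when the data contain duplicated rows. What follows is only a sketch of that idea: the names dat, train_idx, train and test and the seed value are made up for illustration and are not from the code above.

```r
# Sketch: split by row position instead of merging on the values.
set.seed(42)  # arbitrary seed (an assumption), only so the split is repeatable

# Same ten rows as the example data
dat <- data.frame(V1 = c(5, 6, 8, 9, 15, 25, 2, 3, 1, 8),
                  V2 = c(10, 3, 4, 20, 30, 40, 4, 1, 5, 10))

n_train   <- round(0.7 * nrow(dat))      # number of rows in the 70% part
train_idx <- sample(nrow(dat), n_train)  # row positions, drawn without replacement

train <- dat[train_idx, ]   # 70% of the rows, for fitting the model
test  <- dat[-train_idx, ]  # the remaining 30%, for testing it
```

With 1000 rows this would give 700 rows in train and 300 in test, and no row lands in both.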
----- Original Message -----
From: Eddie Smith <eddie...@gmail.com>
To: r-help@r-project.org
Cc:
Sent: Monday, November 19, 2012 12:16 PM
Subject: [R] How to subset my data and at the same time keep the balance?

Hi guys,

I have 1000 rows of a dataset. In my analysis, I need 70% of the data, run my analysis and then use the remaining 30% to test my model.

Could anybody kindly help me on this?

Cheers

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.