I don't know what's wrong there (unless you're using the Eclipse R plugin on a Mac like me and the window for choosing the download site doesn't pop up. Did it? ^^)
Anyway, you could just split your data into two datasets, one with all the rows labeled 0 and one with all the rows labeled 1, then take a random 80% of each, combine those into the 80% set, and combine the rest to form the 20% set. Since I don't know your data, here's an example:

> M<-cbind(c(rep(0,10),rep(1,10)),1:20)
> M
      [,1] [,2]
 [1,]    0    1
 [2,]    0    2
 [3,]    0    3
 [4,]    0    4
 [5,]    0    5
 [6,]    0    6
 [7,]    0    7
 [8,]    0    8
 [9,]    0    9
[10,]    0   10
[11,]    1   11
[12,]    1   12
[13,]    1   13
[14,]    1   14
[15,]    1   15
[16,]    1   16
[17,]    1   17
[18,]    1   18
[19,]    1   19
[20,]    1   20
> index1<-which(M[,1]==1)
> index1
 [1] 11 12 13 14 15 16 17 18 19 20
> M1<-M[index1,]
> M1
      [,1] [,2]
 [1,]    1   11
 [2,]    1   12
 [3,]    1   13
 [4,]    1   14
 [5,]    1   15
 [6,]    1   16
 [7,]    1   17
 [8,]    1   18
 [9,]    1   19
[10,]    1   20
> M0<-M[-index1,]
> M0
      [,1] [,2]
 [1,]    0    1
 [2,]    0    2
 [3,]    0    3
 [4,]    0    4
 [5,]    0    5
 [6,]    0    6
 [7,]    0    7
 [8,]    0    8
 [9,]    0    9
[10,]    0   10
> s1<-sample(1:dim(M1)[1],0.8*dim(M1)[1])
> s1
[1] 10  3  5  9  2  6  8  7
> s0<-sample(1:dim(M0)[1],0.8*dim(M0)[1])
> s0
[1]  8 10  9  3  7  4  2  1
> data80<-rbind(M1[s1,],M0[s0,])
> data80
      [,1] [,2]
 [1,]    1   20
 [2,]    1   13
 [3,]    1   15
 [4,]    1   19
 [5,]    1   12
 [6,]    1   16
 [7,]    1   18
 [8,]    1   17
 [9,]    0    8
[10,]    0   10
[11,]    0    9
[12,]    0    3
[13,]    0    7
[14,]    0    4
[15,]    0    2
[16,]    0    1
> data20<-rbind(M1[-s1,],M0[-s0,])
> data20
     [,1] [,2]
[1,]    1   11
[2,]    1   14
[3,]    0    5
[4,]    0    6

This is probably not the most efficient way to do it, but it should work.

Greetings,
Jessi

On 25.04.2012 at 17:18, Dwaipayan Dasgupta wrote:

> Thank you so much for replying. I tried what you said, but it still throws the
> same error, i.e. could not find function "sample.split".
> This might be because of the version of R I am running (R version 2.12.2). I do not
> have admin rights to upgrade to the newest version.
> Is there anything else I can try? I'm trying to split my data 80:20,
> keeping the ratio of 0 and 1 in the (binary) Y variable constant.
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jessica Streicher
> Sent: Wednesday, April 25, 2012 7:17 PM
> To: r-help@r-project.org
> Subject: Re: [R] Splitting data into test and train (80:20) kepping attributes similar
>
> Well, it throws an error, because there is no such function in default R. A
> bit of googling showed it might be the one in the caTools package.
>
> execute this:
> install.packages("caTools")
> library(caTools)
>
> before executing your code
>
>
> On 25.04.2012 at 12:39, Dwaipayan Dasgupta wrote:
>
>> Hi,
>> Could someone help me with this please? I'm trying to use
>> Y = Attrition_data[,1] # extract labels from the data
>> msk = sample.split (Y, SplitRatio=3/4)
>> table(Y,msk)
>> to do the splitting, but it keeps throwing an error:
>> Error: could not find function "sample.split"
>> Could you please help?
>>
>> Thanks in advance
>> doy
>>
>>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dwaipayan Dasgupta
>> Sent: Tuesday, April 24, 2012 9:08 PM
>> To: r-help@r-project.org
>> Subject: [R] Splitting data into test and train (80:20) kepping attributes similar
>>
>> Hi,
>> I am trying to do some predictive modeling around attrition and want to
>> split the dataset into test and train (80:20) and keep the ratio of
>> attritees to non-attritees the same.
>> In my dataset the attrition indicator is coded as 0 (for non-attritees) and 1
>> (for attritees), and I want to keep the ratio of 0's to 1's similar.
>> I apologize for this trivial question, but this is my second week with R.
>>
>> Thanks,
>> Doy
>>
>> American Express made the following annotations on Tue Apr 24 2012 08:38:50
>>
>> "This message and any attachments are solely for the intended recipient and
>> may contain confidential or privileged information. If you are not the
>> intended recipient, any disclosure, copying, use, or distribution of the
>> information included in this message and any attachments is prohibited. If
>> you have received this communication in error, please notify us by reply
>> e-mail and immediately and permanently delete this message and any
>> attachments. Thank you."
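
P.S. For what it's worth, the split-by-label steps from the example at the top of this message can probably be written more compactly in base R, without any extra package. The following is only a sketch under a few assumptions: `stratified_split` and its arguments `y` and `frac` are names made up for this illustration (they do not come from caTools or any other package), and `M` is the same toy matrix as in the example, not your real data.

```r
# Hypothetical helper (not from any package): pick row numbers for the
# training part so that each label contributes the same share of its rows.
#   y    : vector of class labels (here 0/1)
#   frac : share of each class that goes into the training part
stratified_split <- function(y, frac = 0.8) {
  # Row numbers grouped by label: one group for 0, one for 1.
  groups <- split(seq_along(y), y)
  # Draw floor(frac * n) rows from each group. Sampling positions with
  # sample.int() avoids the sample() surprise when a group has one row.
  picked <- lapply(groups, function(idx) {
    idx[sample.int(length(idx), floor(frac * length(idx)))]
  })
  sort(unlist(picked, use.names = FALSE))
}

set.seed(1)                                  # only to make the draw repeatable
M <- cbind(c(rep(0, 10), rep(1, 10)), 1:20)  # same toy data as in the example
train <- stratified_split(M[, 1], 0.8)
rest  <- setdiff(seq_len(nrow(M)), train)    # every row not in the 80% part

data80 <- M[train, , drop = FALSE]
data20 <- M[rest, , drop = FALSE]

table(data80[, 1])   # label counts in the 80% part
table(data20[, 1])   # label counts in the 20% part
```

With real data you would pass the label column instead of `M[, 1]`, for example `Attrition_data[, 1]` from the earlier message, assuming that column really holds the 0/1 indicator.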
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.