I don't know what's wrong there (unless you're using the Eclipse R plugin on a 
Mac like me, where the window for choosing the download site doesn't pop up. 
Did it pop up for you? ^^)

Anyway, you could just split your data into two datasets, one with all the rows 
labeled 0 and one with all the rows labeled 1, then take a random 80% of each 
and combine those into the 80% set. The rows left over from both combine to 
form the 20% set.

Since I don't know your data, here's an example:
> M<-cbind(c(rep(0,10),rep(1,10)),1:20)
> M
      [,1] [,2]
 [1,]    0    1
 [2,]    0    2
 [3,]    0    3
 [4,]    0    4
 [5,]    0    5
 [6,]    0    6
 [7,]    0    7
 [8,]    0    8
 [9,]    0    9
[10,]    0   10
[11,]    1   11
[12,]    1   12
[13,]    1   13
[14,]    1   14
[15,]    1   15
[16,]    1   16
[17,]    1   17
[18,]    1   18
[19,]    1   19
[20,]    1   20

> index1<-which(M[,1]==1)
> index1
 [1] 11 12 13 14 15 16 17 18 19 20

> M1<-M[index1,]
> M1
      [,1] [,2]
 [1,]    1   11
 [2,]    1   12
 [3,]    1   13
 [4,]    1   14
 [5,]    1   15
 [6,]    1   16
 [7,]    1   17
 [8,]    1   18
 [9,]    1   19
[10,]    1   20

> M0<-M[-index1,]
> M0
      [,1] [,2]
 [1,]    0    1
 [2,]    0    2
 [3,]    0    3
 [4,]    0    4
 [5,]    0    5
 [6,]    0    6
 [7,]    0    7
 [8,]    0    8
 [9,]    0    9
[10,]    0   10

> s1<-sample(1:dim(M1)[1],0.8*dim(M1)[1])
> s1
[1] 10  3  5  9  2  6  8  7

> s0<-sample(1:dim(M0)[1],0.8*dim(M0)[1])
> s0
[1]  8 10  9  3  7  4  2  1

> data80<-rbind(M1[s1,],M0[s0,])
> data80
      [,1] [,2]
 [1,]    1   20
 [2,]    1   13
 [3,]    1   15
 [4,]    1   19
 [5,]    1   12
 [6,]    1   16
 [7,]    1   18
 [8,]    1   17
 [9,]    0    8
[10,]    0   10
[11,]    0    9
[12,]    0    3
[13,]    0    7
[14,]    0    4
[15,]    0    2
[16,]    0    1

> data20<-rbind(M1[-s1,],M0[-s0,])
> data20
     [,1] [,2]
[1,]    1   11
[2,]    1   14
[3,]    0    5
[4,]    0    6


This is probably not the most efficient way to do it, but it should work.
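
The same steps can be folded into one small helper that needs only base R, so 
no extra package is required. This is just a sketch under assumptions: the name 
stratified_split and its train_frac argument are made up for illustration, and 
it assumes the labels are in the first column, as in the toy matrix M above.

```r
# Stratified split in base R: sample the same fraction within each label.
# stratified_split() is a hypothetical helper name, not part of any package.
stratified_split <- function(labels, train_frac = 0.8) {
  # Row numbers grouped by label value, e.g. list(`0` = 1:10, `1` = 11:20)
  rows_by_label <- split(seq_along(labels), labels)
  train_rows <- unlist(lapply(rows_by_label, function(rows) {
    n_train <- floor(train_frac * length(rows))
    # Sample positions rather than values, so a group with one row is safe
    rows[sample.int(length(rows), n_train)]
  }), use.names = FALSE)
  sort(train_rows)
}

set.seed(42)  # only to make the illustration repeatable
M <- cbind(c(rep(0, 10), rep(1, 10)), 1:20)

train_rows <- stratified_split(M[, 1])
is_train <- seq_len(nrow(M)) %in% train_rows

# Logical indexing avoids the M[-index, ] pitfall when index is empty
data80 <- M[is_train, , drop = FALSE]
data20 <- M[!is_train, , drop = FALSE]

# Check that both parts keep the original 0:1 ratio
table(data80[, 1])
table(data20[, 1])
```

With ten rows per label, each label contributes 8 rows to data80 and 2 rows to 
data20, whichever rows happen to be drawn. For your own data you would pass the 
attrition indicator column instead of M[, 1].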

greetings
Jessi


On 25.04.2012 at 17:18, Dwaipayan Dasgupta wrote:

> Thank you so much for replying. I tried what you said but it still throws the 
> same error, i.e. could not find function "sample.split".
> This might be because of the version of R I am running (R version 2.12.2). I 
> do not have admin rights to upgrade to the newest version.
> Is there anything else I can try? I'm trying to split my data 80:20 while 
> keeping the ratio of 0s to 1s in the (binary) Y variable constant.
> 
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Jessica Streicher
> Sent: Wednesday, April 25, 2012 7:17 PM
> To: r-help@r-project.org
> Subject: Re: [R] Splitting data into test and train (80:20) kepping 
> attributes similar
> 
> Well, it throws an error, because there is no such function in default R. A 
> bit of googling showed it might be the one in the caTools package.
> 
> execute this:
> install.packages("caTools")
> library(caTools)
> 
> before executing your code
> 
> 
> On 25.04.2012 at 12:39, Dwaipayan Dasgupta wrote:
> 
>> Hi,
>> Could someone help me with this, please? I'm trying to use 
>> Y = Attrition_data[,1] # extract labels from the data 
>> msk = sample.split (Y, SplitRatio=3/4)
>> table(Y,msk) 
>> to do the splitting, but it keeps throwing an error: 
>> Error: could not find function "sample.split"
>> Could you please help?
>> 
>> Thanks in advance
>> doy
>> 
>> 
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
>> Behalf Of Dwaipayan Dasgupta
>> Sent: Tuesday, April 24, 2012 9:08 PM
>> To: r-help@r-project.org
>> Subject: [R] Splitting data into test and train (80:20) kepping attributes 
>> similar
>> 
>> Hi,
>> I am trying to do some predictive modeling around attrition and want to 
>> split the dataset into train and test (80:20) while keeping the ratio of 
>> attritees to non-attritees the same.
>> In my dataset the attrition indicator is coded as 0 (for non-attritees) and 
>> 1 (for attritees), and I want to keep the ratio of 0's to 1's similar.
>> I apologize for this trivial question, but this is my second week with R.
>> 
>> Thanks,
>> Doy
>> 
>> 
>> 
>> 
>> 
>> American Express made the following annotations on Tue Apr 24 2012 08:38:50 
>> 
>> ******************************************************************************
>> 
>> "This message and any attachments are solely for the intended recipient and 
>> may contain confidential or privileged information. If you are not the 
>> intended recipient, any disclosure, copying, use, or distribution of the 
>> information included in this message and any attachments is prohibited. If 
>> you have received this communication in error, please notify us by reply 
>> e-mail and immediately and permanently delete this message and any 
>> attachments. Thank you." 
>> 
>> ******************************************************************************
>> -------------------------------------------------------------------------------
>> 
>> 
>> 
>> 
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 



