Keep in mind that s1 and s2 are only the row indices into Ad_0 and Ad_1 for the 
data you want to keep, not the data itself.

Therefore 

traindata <- rbind(s1,s2)

will not work.

You need to take the rows from Ad_0 and Ad_1 for that, similar to what you did 
with s3 and s4.
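
A minimal sketch of that fix, in case it helps. Since I don't have your data, the 
Attrition_data_1 built here is a made-up 20-row stand-in (the column x is 
hypothetical); only the last few lines matter for your case.

```r
# Made-up stand-in for Attrition_data_1: 10 non-attritees (0), 10 attritees (1)
set.seed(1)
Attrition_data_1 <- data.frame(
  Attrition_ind = c(rep(0, 10), rep(1, 10)),
  x = 1:20  # hypothetical predictor column
)

Ad_1 <- subset(Attrition_data_1, Attrition_ind == 1)
Ad_0 <- subset(Attrition_data_1, Attrition_ind == 0)

# s1 and s2 hold row positions (integers), not rows of data
s1 <- sample(seq_len(nrow(Ad_0)), floor(0.8 * nrow(Ad_0)))  # 80% of the non-attritees
s2 <- sample(seq_len(nrow(Ad_1)), floor(0.8 * nrow(Ad_1)))  # 80% of the attritees

# Use the positions to pull the rows out of each subset, then stack them
traindata <- rbind(Ad_0[s1, ], Ad_1[s2, ])
testdata  <- rbind(Ad_0[-s1, ], Ad_1[-s2, ])

# Check that both classes appear in the same proportion in each part
table(traindata$Attrition_ind)
table(testdata$Attrition_ind)
```

rbind(s1,s2) on its own only stacks the two index vectors into a two-row matrix, 
which is why R complains when they have different lengths.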


On 26.04.2012 at 12:56, Dwaipayan Dasgupta wrote:

> Hi ,
> Thanks again for helping me out.
> Here is the code I am using
> Ad_1 <- subset(Attrition_data_1,Attrition_ind=="1")
> Ad_0 <- subset(Attrition_data_1,Attrition_ind=="0")
>  
> s1<-sample(1:dim(Ad_0)[1],0.8*dim(Ad_0)[1]) # 80% of the non-attritees
> s2<-sample(1:dim(Ad_1)[1],0.8*dim(Ad_1)[1]) # 80% of the attritees
>  
> s3<- Ad_0 [-s1,]
> summary(s3)
>  
> s4<- Ad_1 [-s2,]
> summary(s4)
>  
> traindata <- rbind(s1,s2)
> testdata <- rbind(s3,s4)
>  
> This works for the test dataset, but for the training dataset it gives:
> Warning message:
> In rbind(s1, s2) :
>   number of columns of result is not a multiple of vector length (arg 2)
>  
> I understand that I am trying to append vectors of unequal lengths, but I 
> don’t know how to work around this.
> Could you please help?
>  
> Thanks,
> Dwaipayan
>  
> From: Jessica Streicher [mailto:j.streic...@micromata.de] 
> Sent: Wednesday, April 25, 2012 9:25 PM
> To: Dwaipayan Dasgupta
> Cc: r-help@r-project.org
> Subject: Re: [R] Splitting data into test and train (80:20) kepping 
> attributes similar
>  
> I don't know what's wrong there (unless you're using the Eclipse R plugin on 
> a Mac like me and the window for choosing the download site doesn't pop up.  
> Did it?^^)
>  
> Anyway, you could just split all of your data into 2 datasets, one that has 
> all the data labeled 0, the other for all labeled 1, then take a random 80% 
> of both, put them back together into the 80% data, and put the rest back 
> together to form the 20%.
>  
> Since I don't know your data, here's an example:
> M<-cbind(c(rep(0,10),rep(1,10)),1:20)
> > M
>       [,1] [,2]
>  [1,]    0    1
>  [2,]    0    2
>  [3,]    0    3
>  [4,]    0    4
>  [5,]    0    5
>  [6,]    0    6
>  [7,]    0    7
>  [8,]    0    8
>  [9,]    0    9
> [10,]    0   10
> [11,]    1   11
> [12,]    1   12
> [13,]    1   13
> [14,]    1   14
> [15,]    1   15
> [16,]    1   16
> [17,]    1   17
> [18,]    1   18
> [19,]    1   19
> [20,]    1   20
>  
> index1<-which(M[,1]==1)
> > index1
>  [1] 11 12 13 14 15 16 17 18 19 20
>  
> > M1<-M[index1,]
> > M1
>       [,1] [,2]
>  [1,]    1   11
>  [2,]    1   12
>  [3,]    1   13
>  [4,]    1   14
>  [5,]    1   15
>  [6,]    1   16
>  [7,]    1   17
>  [8,]    1   18
>  [9,]    1   19
> [10,]    1   20
>  
> > M0<-M[-index1,]
> > M0
>       [,1] [,2]
>  [1,]    0    1
>  [2,]    0    2
>  [3,]    0    3
>  [4,]    0    4
>  [5,]    0    5
>  [6,]    0    6
>  [7,]    0    7
>  [8,]    0    8
>  [9,]    0    9
> [10,]    0   10
>  
> > s1<-sample(1:dim(M1)[1],0.8*dim(M1)[1])
> > s1
> [1] 10  3  5  9  2  6  8  7
>  
> > s0<-sample(1:dim(M0)[1],0.8*dim(M0)[1])
> > s0
> [1]  8 10  9  3  7  4  2  1
>  
> > data80<-rbind(M1[s1,],M0[s0,])
> > data80
>       [,1] [,2]
>  [1,]    1   20
>  [2,]    1   13
>  [3,]    1   15
>  [4,]    1   19
>  [5,]    1   12
>  [6,]    1   16
>  [7,]    1   18
>  [8,]    1   17
>  [9,]    0    8
> [10,]    0   10
> [11,]    0    9
> [12,]    0    3
> [13,]    0    7
> [14,]    0    4
> [15,]    0    2
> [16,]    0    1
>  
> > data20<-rbind(M1[-s1,],M0[-s0,])
> > data20
>      [,1] [,2]
> [1,]    1   11
> [2,]    1   14
> [3,]    0    5
> [4,]    0    6
>  
>  
> This is probably not the most efficient way to do it, but it should 
> work.
>  
> greetings
> Jessi
>  
>  
> On 25.04.2012 at 17:18, Dwaipayan Dasgupta wrote:
> 
> 
> Thank you so much for replying. I tried what you said but it still throws the 
> same error, i.e. could not find function "sample.split".
> This might be because of the version of R I am running (R version 2.12.2). I do 
> not have admin rights to upgrade to the newest version.
> Is there anything else I can try? I'm trying to split my data 80:20 while 
> keeping the ratio of 0s to 1s in the (binary) Y variable constant.
> 
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Jessica Streicher
> Sent: Wednesday, April 25, 2012 7:17 PM
> To: r-help@r-project.org
> Subject: Re: [R] Splitting data into test and train (80:20) kepping 
> attributes similar
> 
> Well, it throws an error because there is no such function in base R. A 
> bit of googling showed it might be the one in the caTools package.
> 
> execute this:
> install.packages("caTools")
> library(caTools)
> 
> before executing your code
> 
> 
> On 25.04.2012 at 12:39, Dwaipayan Dasgupta wrote:
> 
> 
> Hi,
> Could someone please help me with this? I'm trying to use
> Y = Attrition_data[,1] # extract labels from the data
> msk = sample.split (Y, SplitRatio=3/4)
> table(Y,msk)
> to do the splitting, but it keeps throwing an error:
> Error: could not find function "sample.split"
> Could you please help?
>  
> Thanks in advance
> doy
>  
>  
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Dwaipayan Dasgupta
> Sent: Tuesday, April 24, 2012 9:08 PM
> To: r-help@r-project.org
> Subject: [R] Splitting data into test and train (80:20) kepping attributes 
> similar
>  
> Hi,
> I am trying to do some predictive modeling around attrition and want to split 
> the dataset into train and test (80:20) while keeping the ratio of attritees to 
> non-attritees the same in both.
> In my dataset the attrition indicator is coded as 0 (for non-attritees) and 1 
> (for attritees), and I want to keep the ratio of 0s to 1s similar.
> I apologize for this trivial question, but this is my second week with R.
>  
> Thanks,
> Doy
>  
>  
>  
>  
>  
> American Express made the following annotations on Tue Apr 24 2012 08:38:50
>  
> ******************************************************************************
>  
> "This message and any attachments are solely for the intended recipient and 
> may contain confidential or privileged information. If you are not the 
> intended recipient, any disclosure, copying, use, or distribution of the 
> information included in this message and any attachments is prohibited. If 
> you have received this communication in error, please notify us by reply 
> e-mail and immediately and permanently delete this message and any 
> attachments. Thank you."
>  
> American Express a ajouté le commentaire suivant le Tue Apr 24 2012 08:38:50
>  
> Ce courrier et toute pièce jointe qu'il contient sont réservés au seul 
> destinataire indiqué et peuvent renfermer des renseignements confidentiels et 
> privilégiés. Si vous n'êtes pas le destinataire prévu, toute divulgation, 
> duplication, utilisation ou distribution du courrier ou de toute pièce jointe 
> est interdite. Si vous avez reçu cette communication par erreur, veuillez 
> nous en aviser par courrier et détruire immédiatement le courrier et les 
> pièces jointes. Merci.
>  
> ******************************************************************************
> -------------------------------------------------------------------------------
>  
>  
>             [[alternative HTML version deleted]]
>  
>  
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



