Thanks, that does the trick, and again I've learned a new command. However, any hints regarding the rownames issue?
BR Thorn

> -----Original Message-----
> From: Dimitris Rizopoulos [mailto:d.rizopou...@erasmusmc.nl]
> Sent: Monday, 9 August 2010 11:07
> To: Thaler,Thorn,LAUSANNE,Applied Mathematics
> Cc: r-help@r-project.org
> Subject: Re: [R] Smart Indexing
>
> I think you just need merge(), e.g.
>
> a <- data.frame(id = rep(1:3, each = 3), val = rnorm(9))
> b <- data.frame(id = 1:3, set1 = LETTERS[1:3], set2 = 5:7)
>
> merge(a, b, by = "id")
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
>
> On 8/9/2010 11:01 AM, Thaler, Thorn, LAUSANNE, Applied Mathematics wrote:
> > Hi all,
> >
> > Suppose that I've two data frames, a and b say, both containing a
> > column 'id'. While data frame 'a' contains multiple rows sharing the
> > same id, data frame 'b' contains just one entry per id (i.e. a 1 to n
> > relationship). For the ease of modeling I now want to generate a new
> > data frame c, which is basically a copy of data frame 'a' augmented
> > by the values of b. If I have
> >
> > a <- data.frame(id = rep(1:3, each = 3), val = rnorm(9))
> > b <- data.frame(id = 1:3, set1 = LETTERS[1:3], set2 = 5:7)
> >
> > the resulting data frame should look like:
> >
> > c <- data.frame(id = rep(1:3, each = 3), val = a$val,
> >                 set1 = rep(LETTERS[1:3], each = 3), set2 = rep(5:7, each = 3))
> >
> > While this task is just an application of some 'rep's and 'c's for
> > structured data frames, it is somewhat cumbersome (and error prone) to
> > construct 'c' explicitly for less structured data. Thus, I was
> > thinking of making use of R's smart indexing possibilities to
> > generate an index vector, i.e.:
> >
> > ind <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
> > c.prime <- cbind(a, b[ind, -1])
> > rownames(c.prime) <- NULL
> > all.equal(c.prime, c) # TRUE
> >
> > The way I generate the index vector ind for the moment is
> >
> > tmp <- seq_along(b$id)
> > names(tmp) <- b$id
> > ind <- tmp[a$id]
> >
> > However, I think that there should be a smarter way of doing that
> > without the need of defining a temporary variable. Some combination
> > of match, which, %in% maybe? Any hints?
> >
> > While writing these lines, I think
> >
> > ind <- pmatch(a$id, b$id, duplicates.ok = TRUE)
> >
> > could do the job? Or do I run into trouble regarding the "partial
> > matching" involved in pmatch?
> >
> > BTW, is there a way to prevent R from assigning [row|col]names? In the
> > example above I had to remove the rownames generated by cbind
> > explicitly; is there a one-liner?
> >
> > Thanks for your input + BR
> >
> > Thorn
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> --
> Dimitris Rizopoulos
> Assistant Professor
> Department of Biostatistics
> Erasmus University Medical Center
>
> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> Tel: +31/(0)10/7043478
> Fax: +31/(0)10/7043014
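
On the two questions left open in the thread (the index vector and the row names), here is a minimal sketch rather than a definitive answer. It assumes every id in a$id also occurs in b$id, it swaps cbind() for data.frame() so that row.names = NULL can be passed directly, and the set.seed() call is added only to make the example reproducible. The last two lines illustrate why pmatch() looks risky here: it compares the ids as character strings and accepts unique partial matches.

```r
# Same toy data as in the thread; the seed is an assumption for reproducibility
set.seed(1)
a <- data.frame(id = rep(1:3, each = 3), val = rnorm(9))
b <- data.frame(id = 1:3, set1 = LETTERS[1:3], set2 = 5:7)

# match() gives, for each a$id, the position of its first exact match in b$id,
# so no temporary named vector is needed
ind <- match(a$id, b$id)

# row.names = NULL discards the row names that b[ind, ] would otherwise pass on
c.prime <- data.frame(a, b[ind, -1], row.names = NULL)

# pmatch() allows partial matching on the character representation
pmatch(1, c(10, 2))  # 1, because "1" is a unique partial match of "10"
match(1, c(10, 2))   # NA, because match() only accepts exact matches
```

If some ids in a were missing from b, match() would return NA for them and the corresponding rows of c.prime would be filled with NA, which is roughly what merge(a, b, by = "id", all.x = TRUE) would produce as well.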