Hello everyone,

David, sorry for the confusion. I meant keeping the inner order of column "b" inside the new order of column "a".
Chuck, Don and Greg: each of you gave me a new way to approach this that I hadn't thought of. Thank you very much for your help!

I ran a competition between all the available solutions, and Chuck's code was the fastest, so I will use his. Here are the results (elapsed time in seconds) of running each function a thousand times:

              test replications elapsed
  2 func.chuck(xx)         1000    0.45
  4 func.david(xx)         1000    3.86
  3   func.don(xx)         1000    3.92
  1   func.tal(xx)         1000    6.09

Thanks again to all of you!
Tal

On Thu, Aug 20, 2009 at 8:04 PM, Greg Snow <greg.s...@imail.org> wrote:

> Here is a one liner:
>
> (yy <- do.call( rbind, sample( split(xx, xx$a) ) ))
>
> Basically reading from inside out, it splits the data frame by a (keeping
> the structure of b intact within each data frame) and returns it as a list,
> then that list is randomized, then put back together into a single data
> frame again.
>
> Does this do what you want?
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili
> > Sent: Thursday, August 20, 2009 9:22 AM
> > To: r-help@r-project.org
> > Subject: [R] simple randomization question: How to perform "sample" in chunks
> >
> > Hello dear R-help group.
> >
> > My task looks simple, but I can't seem to find a "smart" (e.g: non loop)
> > solution to it.
> >
> > Task: I wish to randomize a data.frame by one column, while keeping the
> > inner-order in the second column as is.
> >
> > So for example, let's say I have the following data.frame:
> >
> > xx <- data.frame(a = c(1,2,2,3,3,3,4,4,4,4),
> >                  b = c(1,1,2,1,2,3,1,2,3,4))
> >
> > I would like to shuffle it by column "a", while keeping the order in column "b".
> >
> > Here is my "not-smart" way of doing it:
> >
> > # R example
> > xx <- data.frame(a = c(1,2,2,3,3,3,4,4,4,4),
> >                  b = c(1,1,2,1,2,3,1,2,3,4))
> >
> > randomize.by.column.a <- function(xx)
> > {
> >   new.a.order <- sample(unique(xx$a))
> >   new.xx <- NULL
> >   for(i in new.a.order)
> >   {
> >     xx.subset <- xx[ xx$a %in% i ,]
> >     new.xx <- rbind(new.xx , xx.subset)
> >   }
> >
> >   return(new.xx)
> > }
> > randomize.by.column.a(xx)
> > # END of - R example
> >
> > I would love for a better, faster, way of doing it.
> >
> > Thanks,
> > Tal
> >
> > --
> > ----------------------------------------------
> > My contact information:
> > Tal Galili
> > Phone number: 972-50-3373767
> > FaceBook: Tal Galili
> > My Blogs:
> > http://www.r-statistics.com/
> > http://www.talgalili.com
> > http://www.biostatistics.co.il
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
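
For anyone who wants to check that a candidate solution really keeps the inner order of "b", here is a minimal base R sketch built on Greg's one-liner quoted above. The helper names `shuffle_by_a` and `inner_order_kept` are hypothetical and do not come from the thread, and the check assumes "b" is already sorted within each value of "a" in the input, as it is in the example `xx`.

```r
# Example data from the original question.
xx <- data.frame(a = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4),
                 b = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 4))

# Hypothetical wrapper around Greg's one-liner: split by "a", shuffle the
# list of pieces, and bind the pieces back into one data frame.
shuffle_by_a <- function(df) {
  pieces <- split(df, df$a)
  shuffled <- pieces[sample.int(length(pieces))]  # same idea as sample(pieces)
  out <- do.call(rbind, shuffled)
  rownames(out) <- NULL                           # drop the "group.row" names
  out
}

# Hypothetical check (assumes "b" is sorted within each "a" in the input):
# every value of "a" forms one contiguous block, and "b" is still
# non-decreasing inside each block.
inner_order_kept <- function(df) {
  runs <- rle(df$a)
  contiguous <- !any(duplicated(runs$values))
  sorted_within <- all(tapply(df$b, df$a, function(b) !is.unsorted(b)))
  contiguous && sorted_within
}

set.seed(1)                 # only to make the shuffle repeatable
yy <- shuffle_by_a(xx)
inner_order_kept(yy)        # TRUE
```

The same check can be run on the output of any of the other solutions, since it only looks at the resulting data frame and not at how it was produced.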