On 21/03/2009 12:01 PM, Donald Macnaughton wrote:
I have a data frame with roughly 500 rows and 120 variables. I would like
to generate a new data frame that will include one row for each PAIR of
rows in the original data frame and will include all 120 + 120 = 240
variables from the two rows. I need only one row for each pair, not two
rows. Thus the new data frame will contain 500 x 499 / 2 = 124,750 rows.
Is there an easy way to do this with R?
Probably the easiest way is to generate row indices for each pair, e.g.
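n <- nrow(mydata)
row1 <- rep(1:n, n)          # 1, 2, ..., n repeated n times
row2 <- rep(1:n, each=n)     # each of 1, ..., n repeated n times
keep <- row1 < row2          # keep each unordered pair once, drop self-pairs
big <- cbind(mydata[row1[keep],], mydata[row2[keep],])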
With a simple example
> mydata <- data.frame(a=1:3, b=letters[1:3])
> mydata
  a b
1 1 a
2 2 b
3 3 c
this produces
> big
    a b a b
1   1 a 2 b
1.1 1 a 3 c
2   2 b 3 c
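Note that the result above has duplicated column names (a, b, a, b) and row names inherited from the first row of each pair. As a rough sketch of one way to tidy that up, the pair indices can also come from combn() instead of the rep() construction. This is an alternative rather than the approach shown above, and the name big2 and the ".1"/".2" suffixes are just illustrative choices:

```r
# Small stand-in for the real 500 x 120 data frame.
mydata <- data.frame(a = 1:3, b = letters[1:3])
n <- nrow(mydata)

# combn(n, 2) returns a 2 x choose(n, 2) matrix; each column is one pair i < j.
idx <- combn(n, 2)

# First row of each pair side by side with the second row of each pair.
big2 <- cbind(mydata[idx[1, ], , drop = FALSE],
              mydata[idx[2, ], , drop = FALSE])

# Optional tidying (an assumption about what is wanted): unique column
# names, and plain 1..N row names instead of the inherited ones.
names(big2) <- c(paste0(names(mydata), ".1"), paste0(names(mydata), ".2"))
rownames(big2) <- NULL

# One row per unordered pair.
stopifnot(nrow(big2) == choose(n, 2))
```

With 500 rows, combn(500, 2) has 124,750 columns, so big2 would have the 124,750 rows and 240 variables asked for.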