Neat piece of code, Jim, but it still uses a nested loop (one apply inside another). If you order the matrix first, you only need one pass through the whole matrix to find the information you need.
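
To make the order-then-scan idea concrete, here is a rough sketch in R. It is only an illustration and has not been run on Rama's data: the function name match_rows is made up, and it assumes the key columns contain no NA values. It sorts the rows on the key columns, marks in one vectorized pass where a sorted row differs from the row above it, and then maps the resulting groups back to the original row order.

```r
# Hypothetical helper (name is illustrative): for each row of d, return the
# indices of the other rows that agree with it on the columns in vars.
# Assumption: the key columns contain no NA values.
match_rows <- function(d, vars) {
  n <- nrow(d)
  if (n == 0L) return(list())
  key <- d[, vars, drop = FALSE]

  # Order the rows by the key columns; this sort is the O(n log n) step.
  ord <- do.call(order, unname(as.list(key)))
  sorted <- key[ord, , drop = FALSE]

  # Single pass over the sorted rows: a new group starts wherever a row
  # differs from the row just above it in at least one key column.
  differs <- rep(FALSE, n - 1L)
  for (j in seq_along(sorted)) {
    differs <- differs | (sorted[[j]][-1L] != sorted[[j]][-n])
  }
  group_sorted <- cumsum(c(TRUE, differs))

  # Map the group ids back to the original row order.
  group <- integer(n)
  group[ord] <- group_sorted

  # Collect the members of each group, then drop each row from its own list.
  members <- unname(split(seq_len(n), group))
  lapply(seq_len(n), function(i) setdiff(members[[group[i]]], i))
}

# Small made-up example: rows 1, 3 and 5 agree on columns 1 and 2.
d <- data.frame(a = c(1, 2, 1, 2, 1),
                b = c("x", "y", "x", "z", "x"),
                c = c(9, 8, 7, 6, 5))
res <- match_rows(d, c(1, 2))
```

With Jim's test data the call would be match_rows(x, x.col). The sort means this is not strictly linear, but it avoids comparing every row with every other row; the result lists themselves can still be large when many rows share the same key.
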
Of course, I am not counting the cost of the ordering itself. If the ordering algorithm doesn't run in linear time, the single pass matters less, I guess: the ordering becomes the limiting step.

Kind regards
Joris

On Thu, Oct 8, 2009 at 2:24 PM, jim holtman <jholt...@gmail.com> wrote:
> I answered the wrong question. Here is the code to find all the
> matches for each row:
>
> n <- 20
> set.seed(2)
> # create test dataframe
> x <- as.data.frame(matrix(sample(1:2,n*6, TRUE), nrow=n))
> x
> x.col <- c(1,3,5)
>
> # match against all the other rows
> x.match1 <- apply(x[, x.col], 1, function(a){
>     .mat <- which(apply(x[, x.col], 1, function(z){
>         all(a == z)
>     }))
> })
>
> # remove matches to itself
> x.match2 <- lapply(seq(length(x.match1)), function(z){
>     x.match1[[z]][!(x.match1[[z]] %in% z)]
> })
> # x.match2 contains the row indices that match
>
> On Wed, Oct 7, 2009 at 3:52 PM, Rama Ramakrishnan <r...@alum.mit.edu> wrote:
>>
>> Hi Friends,
>>
>> I have a data frame d. Let vars be the column indices for a subset of the
>> columns in d (e.g., vars <- c(1,3,4,8))
>>
>> For each row r in d, I want to collect all the other rows in d that match
>> the values in row r for just the columns in vars.
>>
>> The naive way to do this is to have a for loop stepping through each row in
>> d, and within the loop have another loop going through all the rows again,
>> checking for equality. This is quadratic in the number of rows and takes way
>> too long. Is there a better, "vectorized" way to do this?
>>
>> Thanks in advance!
>>
>> Rama Ramakrishnan
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?