I haven't looked at the size-time relationship, but im2 (below) is faster than your function on at least one example:
intersectMat <- function(mat1, mat2) {
    # mat1 and mat2 are both deduplicated
    nr1 <- nrow(mat1)
    nr2 <- nrow(mat2)
    mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], , drop=FALSE]
}

im2 <- function(mat1, mat2) {
    stopifnot(ncol(mat1)==2, ncol(mat1)==ncol(mat2))
    toChar <- function(twoColMat) paste(sep="\1", twoColMat[,1], twoColMat[,2])
    mat1[match(toChar(mat2), toChar(mat1), nomatch=0), , drop=FALSE]
}

> m1 <- cbind(1:1e7, rep(1:10, len=1e7))
> m2 <- cbind(1:1e7, rep(1:20, len=1e7))
> system.time(r1 <- intersectMat(m1,m2))
   user  system elapsed
 430.37    1.96  433.98
> system.time(r2 <- im2(m1,m2))
   user  system elapsed
  27.89    0.20   28.13
> identical(r1, r2)
[1] TRUE
> dim(r1)
[1] 5000000       2

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of c char
> Sent: Monday, July 29, 2013 4:04 PM
> To: r-help@r-project.org
> Subject: [R] Intersecting two matrices
>
> Dear all,
>
> I am interested to know a faster matrix intersection package for R handles
> intersection of two integer matrices with ncol=2. Currently I am using my
> homemade code adapted from a previous thread:
>
>
> intersectMat <- function(mat1, mat2){#mat1 and mat2 are both deduplicated
>     nr1 <- nrow(mat1)
>     nr2 <- nrow(mat2)
>     mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]}
>
>
> which handles:
> size A= 10578373
> size B= 9519807
> expected intersecting time= 251.2272
> intersecting for corssing MPRs took 409.602 seconds.
>
> scale a little bit worse than linearly but atomic operation is not good.
> Wonder if a super fast C/C++ extension exists for this task. Your ideas are
> appreciated.
>
> Thanks!
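For anyone who wants to check the two functions against each other before trying them on millions of rows, here is a minimal sketch on toy input. The matrices a and b are made-up illustration data, not from the thread. The functions are repeated from the top of this reply so the snippet runs on its own. As in the original post, both inputs are assumed to be deduplicated, two-column integer matrices.

```r
# Same definitions as at the top of this reply.
intersectMat <- function(mat1, mat2) {
    nr1 <- nrow(mat1)
    nr2 <- nrow(mat2)
    mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], , drop=FALSE]
}

im2 <- function(mat1, mat2) {
    stopifnot(ncol(mat1)==2, ncol(mat1)==ncol(mat2))
    # Collapse each row to one string key; "\1" is a separator that
    # cannot appear in the printed form of an integer.
    toChar <- function(twoColMat) paste(sep="\1", twoColMat[,1], twoColMat[,2])
    mat1[match(toChar(mat2), toChar(mat1), nomatch=0), , drop=FALSE]
}

# Hypothetical toy data: rows (1,10) and (3,30) occur in both matrices.
a <- cbind(c(1L, 2L, 3L), c(10L, 20L, 30L))
b <- cbind(c(3L, 1L, 4L), c(30L, 10L, 40L))

res_dup <- intersectMat(a, b)
res_im2 <- im2(a, b)

# Both return the shared rows in the order they appear in the second
# argument: (3,30) first, then (1,10).
print(res_im2)
print(identical(res_dup, res_im2))
```

Note that the result follows the row order of the second matrix in both versions, which is why identical() can be used to compare them directly.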
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.