> -----Original Message----- > From: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Kogan > Sent: Saturday, August 22, 2009 11:45 AM > To: r-help@r-project.org > Subject: [R] Help on comparing two matrices > > Hi, > > I need to compare two matrices with each other. If you can get one of > them out of the other one by resorting the rows and/or the > columns, then > both of them are equal, otherwise they're not. A matrix could > look like > this: > > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] 0 1 1 1 0 1 1 0 > [2,] 1 1 0 0 0 1 0 1 > [3,] 1 0 1 0 0 0 1 1 > [4,] 1 1 0 0 1 0 0 0 > [5,] 1 0 1 1 1 0 0 0 > [6,] 0 1 0 1 1 0 0 0 > [7,] 0 0 0 0 0 1 1 1 > > Note that each matrix consists of ones and zeros, in each row and in > each column there are at least three ones and one zero and > each pair of > rows/columns may have at most two positions where both are > ones (e.g. > for the 1. and 2. rows those positions are 2 and 6). > > I was advised to sort both matrices in the same way and then > to compare > them element by element. But I don't manage to get them sorted... My > approach is as following: > > 1. Sort the rows after the row sums (greater sums first). > 2. Sort the columns after the first column (columns with ones in the > first row go left, columns with zeros go right). > 3. Save the left part (all columns with ones in the first > row) and the > right part in separate matrices. > 4. Repeat steps 2 and 3 with both of the created matrices (now taking > the second row for sorting), repeat until all fragments consist of a > single column. > 5. Compose the columns to a sorted matrix. > > This algorithm has several problems: > > 1. How to make a loop that is branching out in two subloops on each > iteration? > 2. How to organize the intermediate results and compose them without > losing the order? Maybe save them in lists and sublists? > 3. A fundamental problem: If there are rows with equal sums > the result > may depend on which of them is sorted after first. Maybe this > algorithm > won't work at all because of this problem? > > Thanks in advance for your input, > Michael >
Michael, I see that you have received a number of responses to your request, and you may have already solved your problem. It seems to me that the responses so far don't guarantee finding a match if both row and column exchanges are necessary. I put together some code as an R learning task for me. It is only partially automated. Someone with more R programming skills than me could probably totally automate this process. I think your initial plan of sorting by row total was moving in the right direction. I took your original matrix, m, and created a matrix, x, as a random ordering of rows and columns of m. Then sorted m and x by both row and column totals. Next, all possible orderings of the ordered x matrix, x.o, were constructed by permuting rows (and columns) with the same row ( or column) totals and were compared to the ordered m matrix, m.o . I used the permutations function from the gtools package, so you would need to install and load that package. If a match was found, TRUE was printed out. A better R programmer than me could probably finish automating the entire process, wrapping it all up in a nice function, but this at least gives you a framework for a solution. require(gtools) m <- matrix(c( 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1), 7, 8, byrow = TRUE) ##----create comparison matrix by randomly ordering rows and columns (should compare TRUE)----## i <- sample(1:nrow(m)) j <- sample(1:ncol(m)) x <- m[i,j] ##----if row and column totals aren't the same, then matrices can't be "equal"----## # no need to proceed if false identical(sort(apply(m,1,sum)),sort(apply(x,1,sum))) & identical(sort(apply(m,2,sum)),sort(apply(x,2,sum))) ##----Order both arrays by row and column totals----## m.o <- m[order(apply(m,1,sum)),order(apply(m,2,sum))] x.o <- x[order(apply(x,1,sum)),order(apply(x,2,sum))] ##----build row permutation list----## # these are groups of rows that have same sum rowtot <- rle(apply(x.o,1,sum))$lengths rb <- list() begin <- 1 for(i in 1:length(rowtot)){ rb[[i]] <- permutations(rowtot[i],rowtot[i],v=begin:cumsum(rowtot)[i]) begin <- begin + rowtot[i] } # construction of rperm could (should) be automated rperm <- cbind(rb[[1]][rep(1:nrow(rb[[1]]), each = nrow(rb[[2]])), ], rb[[2]][rep(1:nrow(rb[[2]]), nrow(rb[[1]])), ] ) rperm <- cbind(rperm[rep(1:nrow(rperm), each = nrow(rb[[3]])), ], rb[[3]][rep(1:nrow(rb[[3]]), nrow(rperm)), ] ) ##----build column permutation list----## # these are groups of columns that have same sum coltot <- rle(apply(x.o,2,sum))$lengths cb <- list() begin <- 1 for(i in 1:length(coltot)){ cb[[i]] <- permutations(coltot[i],coltot[i],v=begin:cumsum(coltot)[i]) begin <- begin + coltot[i] } # construction of cperm could (should) be automated cperm <- cbind(cb[[1]][rep(1:nrow(cb[[1]]), each = nrow(cb[[2]])), ], cb[[2]][rep(1:nrow(cb[[2]]), nrow(cb[[1]])), ] ) ##----compare m.o with all x.o where rows and columns of x.o with tied totals are permuted----## for(i in 1:nrow(rperm)){ for(j in 1:nrow(cperm)){ if(identical(m.o,x.o[rperm[i,],cperm[j,]])) { cat('TRUE','\n') break } } } Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.