If I have correctly understood what Robin wants, I don't think Gabor's elaborate solution is necessary (though I grant that it may be more general). It's also slow because of its apply() calls. A more straightforward and faster approach is to convert each row to a single character string via paste() and then use table() and match() to get your counts.
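As a minimal sketch of the paste()/table()/match() idea applied to the data from Robin's post: the helper name rowCounts() and its arguments are hypothetical (not from any package), and S is rebuilt here with expand.grid() rather than partitions::compositions() so that the snippet runs without extra packages. It assumes every observed row occurs in S and that S has no duplicate rows.

```r
## Hypothetical helper: count how often each row of S occurs among the rows of obs.
## Assumes S has no duplicate rows (match() only finds the first occurrence).
rowCounts <- function(S, obs, sep = ".") {
  ## Collapse each row of a matrix into one character key, e.g. "1.4.0"
  key <- function(m) do.call(paste, c(as.data.frame(m), sep = sep))
  sKey <- key(S)
  counts <- table(key(obs))            ## one count per distinct observed row
  idx <- match(names(counts), sKey)    ## position of each observed row in S
  if (anyNA(idx)) stop("some observations are not in the sample space")
  d <- integer(length(sKey))           ## start with zero counts
  d[idx] <- as.vector(counts)
  d
}

## Observations from the original question
obs <- matrix(as.integer(c(
  1, 4, 0,
  1, 4, 0,
  2, 0, 3,
  4, 1, 0,
  0, 5, 0,
  0, 1, 4,
  2, 0, 3
)), ncol = 3, byrow = TRUE)

## Sample space: all (y1, y2, y3) with non-negative integer entries summing to 5
g <- expand.grid(y1 = 0:5, y2 = 0:5, y3 = 0:5)
S <- as.matrix(g[rowSums(g) == 5, ])
rownames(S) <- NULL

d <- rowCounts(S, obs)
cbind(S, d)
```

The explicit check on match() is there because an observation missing from S would otherwise produce an NA index in the assignment; whether to stop or silently drop such rows is a design choice that depends on the application.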
## Example data
tbl <- matrix(sample(0:9,24,rep=TRUE),ncol=3)  ## sample space
obs <- tbl[sample(1:6,12,rep=TRUE),]           ## sample

## Now the code
## Use paste to convert each row into a character string
tblRow <- do.call(paste,c(data.frame(tbl),sep="."))
obsRow <- do.call(paste,c(data.frame(obs),sep="."))
d <- rep(0,nrow(tbl))                     ## initialize vector of counts
counts <- table(obsRow)                   ## Let (the fast) table() do the work
d[match(names(counts),tblRow)] <- counts  ## vector of counts

## here are results from a sample run:

> tbl
     [,1] [,2] [,3]
[1,]    6    3    4
[2,]    0    6    0
[3,]    4    2    7
[4,]    0    3    3
[5,]    9    0    2
[6,]    7    8    9
[7,]    7    5    3
[8,]    7    1    8

> obs
      [,1] [,2] [,3]
 [1,]    9    0    2
 [2,]    0    3    3
 [3,]    6    3    4
 [4,]    9    0    2
 [5,]    7    8    9
 [6,]    0    6    0
 [7,]    4    2    7
 [8,]    0    6    0
 [9,]    9    0    2
[10,]    9    0    2
[11,]    7    8    9
[12,]    7    8    9

> d
[1] 1 2 1 1 4 3 0 0

There may well be more elegant ways to do this, too.

Cheers,
Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gabor Grothendieck
Sent: Friday, October 16, 2009 4:27 AM
To: Robin Hankin
Cc: r-help@r-project.org
Subject: Re: [R] generalization of tabulate()

Using the generalized inner product defined in this post:
https://www.stat.math.ethz.ch/pipermail/r-help/2006-July/109311.html
try this:

cbind(S, d = rowSums(inner(S, obs, identical)))

On Fri, Oct 16, 2009 at 4:29 AM, Robin Hankin <rk...@cam.ac.uk> wrote:
> Hi
>
> I want a generalization of tabulate() which works on rows of a matrix.
> Suppose I have an integer matrix 'observation':
>
>> observation
>
>  y1 y2 y3
>   1  4  0
>   1  4  0
>   2  0  3
>   4  1  0
>   0  5  0
>   0  1  4
>   2  0  3
>
> Each row corresponds to a (multivariate) observation. Note that the
> first two rows are identical: this means that data "c(1,4,0)" was
> observed twice.
>
> Now suppose I can list the sample space:
>
>> S
>  [1,] 5 0 0
>  [2,] 4 1 0
>  [3,] 3 2 0
>  [4,] 2 3 0
>  [5,] 1 4 0
>  [6,] 0 5 0
>  [7,] 4 0 1
>  [8,] 3 1 1
>  [9,] 2 2 1
> [10,] 1 3 1
> [11,] 0 4 1
> [12,] 3 0 2
> [13,] 2 1 2
> [14,] 1 2 2
> [15,] 0 3 2
> [16,] 2 0 3
> [17,] 1 1 3
> [18,] 0 2 3
> [19,] 1 0 4
> [20,] 0 1 4
> [21,] 0 0 5
>
> (thus each row corresponds to a point in my sample space).
>
> Now what I need to do is to construct a new matrix, which uses the
> 'observation' matrix above, which is a sort of table:
>
>> desired
>
>       y1 y2 y3 d
>  [1,]  5  0  0 0
>  [2,]  4  1  0 1
>  [3,]  3  2  0 0
>  [4,]  2  3  0 0
>  [5,]  1  4  0 2
>  [6,]  0  5  0 1
>  [7,]  4  0  1 0
>  [8,]  3  1  1 0
>  [9,]  2  2  1 0
> [10,]  1  3  1 0
> [11,]  0  4  1 0
> [12,]  3  0  2 0
> [13,]  2  1  2 0
> [14,]  1  2  2 0
> [15,]  0  3  2 0
> [16,]  2  0  3 2
> [17,]  1  1  3 0
> [18,]  0  2  3 0
> [19,]  1  0  4 0
> [20,]  0  1  4 1
> [21,]  0  0  5 0
>
>
> Thus the 'd' column counts the number of times that each row occurs in
> variable 'observation'. So desired[5,4]=2 because the observation
> corresponding to desired[5,1:3] (viz c(1,4,0)) occurred twice. And
> desired[1,4]=0 because the observation corresponding to desired[1,1:3]
> (viz c(5,0,0)) did not occur once (it was not observed).
>
> In my application I have dim(S) ~= c(5,4e6).
>
> I've tried merge(), stack(), reshape(), but the best I can do
> is the (derisory):
>
> require(partitions)
>
>
> obs <- matrix(as.integer(c(
>   1, 4, 0,
>   1, 4, 0,
>   2, 0, 3,
>   4, 1, 0,
>   0, 5, 0,
>   0, 1, 4,
>   2, 0, 3
>   )),ncol=3,byrow=TRUE)
>
> S <- t(compositions(5,3))
> d <- rep(0,nrow(S))
>
>
> for(i in seq_len(nrow(obs))){
>   for(j in seq_len(nrow(S))){
>     if(all(obs[i,,drop=TRUE] == S[j,,drop=TRUE])){
>       d[j] <- d[j]+1
>     }
>   }
> }
>
> S <- cbind(S,d)
>
>
> Anyone got anything better before I try C?
>
>
> --
> Robin K. S. Hankin
> Uncertainty Analyst
> University of Cambridge
> 19 Silver Street
> Cambridge CB3 9EP
> 01223-764877
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.