Hi,

Thanks for the solution. Unfortunately, the code still takes a long time to run: it has been executing for an hour and has not finished. I understand the delay, because each triplet has to be compared against almost 9000 list elements.
Regards,
Sri

On Wed, Jul 27, 2016 at 9:02 PM, Sarah Goslee <sarah.gos...@gmail.com> wrote:

> Hi,
>
> It's really a good idea to use dput() or some other reproducible way
> to provide data. I had to guess as to what your data looked like.
>
> It appears that order doesn't matter?
>
> Given that, here's one approach:
>
> combs <- structure(list(V1 = c(65L, 77L, 55L, 23L, 34L), V2 = c(23L, 34L,
> 34L, 77L, 65L), V3 = c(77L, 65L, 23L, 34L, 55L)), .Names = c("V1",
> "V2", "V3"), class = "data.frame", row.names = c(NA, -5L))
>
> dat <- list(
> c(77,65,34,23,55),
> c(65,23,77,65,55,34),
> c(77,34,65),
> c(55,78,56),
> c(98,23,77,65,34))
>
> sapply(seq_len(nrow(combs)), function(i)sum(sapply(dat,
> function(j)all(combs[i,] %in% j))))
>
> On a dataset of comparable size to yours, it takes me under a minute and a
> half.
>
> > combs <- combs[rep(1:nrow(combs), length=100), ]
> > dat <- dat[rep(1:length(dat), length=10000)]
> >
> > dim(combs)
> [1] 100 3
> > length(dat)
> [1] 10000
> >
> > system.time(test <- sapply(seq_len(nrow(combs)),
> function(i)sum(sapply(dat, function(j)all(combs[i,] %in% j)))))
>    user  system elapsed
>  86.380   0.006  86.391
>
> On Wed, Jul 27, 2016 at 10:47 AM, sri vathsan <srivib...@gmail.com> wrote:
>
> > Hi,
> >
> > Apologies for the lack of information.
> >
> > Basically, myCombos is a matrix with 3 variables; each row is a triplet
> > drawn from 79 codes. There are around 3 lakh such combinations, and
> > it looks like below.
> >
> > V1 V2 V3
> > 65 23 77
> > 77 34 65
> > 55 34 23
> > 23 77 34
> > 34 65 55
> >
> > Each triplet is compared against a list (myList) having 8177 elements,
> > which looks like below.
> >
> > 77,65,34,23,55
> > 65,23,77,65,55,34
> > 77,34,65
> > 55,78,56
> > 98,23,77,65,34
> >
> > Now I want to count the number of occurrences of each triplet in the
> > above list. I.e., the triplet 65 23 77 is seen 3 times in the list.
> > So my output looks like below
> >
> > V1 V2 V3 Freq
> > 65 23 77 3
> > 77 34 65 4
> > 55 34 23 2
> >
> > I hope I made it clear this time.
> >
> > On Wed, Jul 27, 2016 at 7:00 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> >> Not entirely sure I understand, but match() is already vectorized, so
> >> you should be able to lose the sapply(). This would speed things up a
> >> lot. Please re-read ?match *carefully* .
> >>
> >> Bert
> >>
> >> On Jul 27, 2016 6:15 AM, "sri vathsan" <srivib...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I created a list of 3-number combinations (myCombos, around 3 lakh
> >> combinations) and am counting the occurrences of those combinations in
> >> another list. This comparison list (myList) has around 8000 records.
> >> I am using the following code.
> >>
> >> myCounts <- sapply(1:nrow(myCombos), FUN=function(i) {
> >>   sum(sapply(myList, function(j) {
> >>     sum(!is.na(match(c(myCombos[i,]), j)))})==3)})
> >>
> >> The above code takes a very long time to execute. Is there any other,
> >> more effective method which will reduce the time?
> >> --
> >>
> >> Regards,
> >> Srivathsan.K
> >>
> --
> Regards,
> Srivathsan.K
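
One possible way to reduce the running time discussed in this thread, offered only as a sketch and not as something any poster proposed: record once which codes each list element contains, in a logical incidence matrix, so that each triplet then needs a single vectorised AND over three columns instead of one %in% call per list element. The helper name count_triplets and the variables codes, inc and idx are hypothetical; combs and dat are the example data quoted above. The sketch assumes, as in the thread, that order within a triplet does not matter.

```r
# Example data as reconstructed in the thread.
combs <- data.frame(V1 = c(65L, 77L, 55L, 23L, 34L),
                    V2 = c(23L, 34L, 34L, 77L, 65L),
                    V3 = c(77L, 65L, 23L, 34L, 55L))
dat <- list(c(77, 65, 34, 23, 55),
            c(65, 23, 77, 65, 55, 34),
            c(77, 34, 65),
            c(55, 78, 56),
            c(98, 23, 77, 65, 34))

# Hypothetical helper: for each row of `combs`, count how many elements
# of `dat` contain all three codes of that row, in any order.
count_triplets <- function(combs, dat) {
  # Every code that appears anywhere, so that match() below never returns NA.
  codes <- sort(unique(c(unlist(dat), unlist(combs))))

  # Incidence matrix: one row per list element, one column per code.
  # inc[r, k] is TRUE when dat[[r]] contains codes[k].
  inc <- matrix(FALSE, nrow = length(dat), ncol = length(codes))
  for (r in seq_along(dat)) {
    inc[r, match(dat[[r]], codes)] <- TRUE
  }

  # Translate the codes of every triplet into column positions, once.
  idx <- matrix(match(as.matrix(combs), codes), ncol = 3)

  # Per triplet: AND the three columns and count the TRUE entries.
  vapply(seq_len(nrow(idx)), function(i) {
    sum(inc[, idx[i, 1]] & inc[, idx[i, 2]] & inc[, idx[i, 3]])
  }, integer(1))
}

# Attach the counts to the triplets, as in the desired output table.
result <- cbind(combs, Freq = count_triplets(combs, dat))
print(result)
```

With 79 codes and about 8000 list elements the incidence matrix stays small, so the remaining cost is the loop over triplets. How much time this saves on the full 3 lakh combinations has not been measured here.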