I don't think you answered the OP's query, although I confess that I am not so sure I understand it either (see below). In any case, I believe the R level loop (i.e. apply()) is unnecessary. There is a unique (and a duplicated()) method for data frames, so simply

unique(x)

returns a data frame with all the unique rows of x.

However, I don't think that's what the OP wanted. (S)he appeared to want all unique combinations of 2 columns. If I got that right (??), combn(ncol(x),2) gives that and could be used for indexing.

I'm not sure parallel processing is useful here, but then again, I may have misunderstood the query. If so, my apologies, and feel free to ignore all the above :-( .

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
H. Gilbert Welch

On Wed, Feb 5, 2014 at 3:26 PM, arun <smartpink...@yahoo.com> wrote:
> Hi,
> Try ?duplicated()
> apply(x,2,function(x) {x[duplicated(x)]<-"";x})
> A.K.
>
>
>
> Hi all,
>
> I have a dataset of around a thousand column and a few thousands
> of rows. I'm trying to get all the possible combinations (without
> repetition) of the data columns and process them in parallel. Here's a
> simplification of what my data and my code looks like:
>
> mydata <- structure(list(col1 = c(231L, 8946L, 534L), col2 = c(123L, 2361L,
> 65L), col3 = c(5645L, 45L, 51L), col4 = c(654L, 356L, 32L), col5 = c(21L,
> 1L, 51L), col6 = c(4L, 4515L, 15L), col7 = c(6L, 1L, 535L), col8 = c(894L,
> 20L, 35L), col9 = c(68L, 21L, 123L), col10 = c(46L, 2L, 2L)), .Names =
> c("col1",
> "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9",
> "col10"), class = "data.frame", row.names = c(NA, -3L))
>
> require(foreach)
>
> x <-
>   foreach(m=1:5, .combine='cbind') %:%
>     foreach(j=(m+1):10, .combine='c') %do% {
>       paste(colnames(mydata)[m], colnames(mydata)[j])
>
> }
>
> x
>
>
>
> if you execute the command above in R, you will get this result.
>
>
>
>       result.1     result.2     result.3     result.4     result.5
> [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
> [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
> [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
> [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
> [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"  "col5 col10"
> [6,] "col1 col7"  "col2 col8"  "col3 col9"  "col4 col10" "col5 col6"
> [7,] "col1 col8"  "col2 col9"  "col3 col10" "col4 col5"  "col5 col7"
> [8,] "col1 col9"  "col2 col10" "col3 col4"  "col4 col6"  "col5 col8"
> [9,] "col1 col10" "col2 col3"  "col3 col5"  "col4 col7"  "col5 col9"
>
> notice that first problem I face that in the last row of the
> second column of the "x" matrix says "col2 col3" which is a repetition
> of the first item (which happens also in all succeeding columns). I was
> planning to have unique combinations of all columns, which obviously,
> did not work.
>
> Can somebody please help me with this? My desired output would be
>
>
>
>       result.1     result.2     result.3     result.4     result.5
> [1,] "col1 col2"  "col2 col3"  "col3 col4"  "col4 col5"  "col5 col6"
> [2,] "col1 col3"  "col2 col4"  "col3 col5"  "col4 col6"  "col5 col7"
> [3,] "col1 col4"  "col2 col5"  "col3 col6"  "col4 col7"  "col5 col8"
> [4,] "col1 col5"  "col2 col6"  "col3 col7"  "col4 col8"  "col5 col9"
> [5,] "col1 col6"  "col2 col7"  "col3 col8"  "col4 col9"
> [6,] "col1 col7"  "col2 col8"  "col3 col9"
> [7,] "col1 col8"  "col2 col9"
> [8,] "col1 col9"  "col2 col10"
> [9,] "col1 col10"
>
>
> Many thanks
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
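
For illustration only, here is a minimal sketch of the difference between the data frame methods for unique() and duplicated() and the column-wise apply() call quoted in the thread. The data frame `x` below is made up for the example and is not the OP's data.

```r
# Hypothetical toy data: rows 1 and 2 are identical.
x <- data.frame(a = c(1, 1, 2, 2), b = c("u", "u", "v", "w"))

# Row-wise: the data frame methods compare whole rows, with no explicit loop.
u <- unique(x)        # keeps rows 1, 3 and 4
d <- duplicated(x)    # flags row 2 only

# Column-wise, as in the quoted apply() call: each column is scanned on its
# own, and apply() first coerces the data frame to a character matrix.
m <- apply(x, 2, function(col) { col[duplicated(col)] <- ""; col })

u
d
m
```

The two approaches answer different questions: the first removes repeated rows, the second blanks repeated values within each column, and neither enumerates pairs of columns.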
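
A minimal sketch of the combn() suggestion, assuming the goal really is every unordered pair of columns. The small `mydata` here is a made-up stand-in for the OP's data frame, and the cor() call is only a placeholder for whatever per-pair processing is intended.

```r
# Hypothetical stand-in for the OP's data (4 columns instead of 10).
mydata <- data.frame(col1 = 1:3, col2 = 4:6, col3 = 7:9, col4 = 10:12)

# Each column of `idx` holds one unordered pair of column indices (i < j).
idx <- combn(ncol(mydata), 2)

# Turn the index pairs into labels such as "col1 col2".
pairs <- paste(colnames(mydata)[idx[1, ]], colnames(mydata)[idx[2, ]])

# Group the labels by the first column index, which gives the ragged
# layout shown in the desired output (one list element per first column).
by_first <- split(pairs, idx[1, ])

# The same index matrix can drive the per-pair work; cor() is a placeholder.
res <- apply(idx, 2, function(k) cor(mydata[[k[1]]], mydata[[k[2]]]))

pairs
by_first
```

Because the result is ragged (each first column has one fewer partner than the previous one), a list is a more natural container than a matrix, which would need padding to be rectangular.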