So rows are considered duplicated if they have the same two characters, regardless of which column they're in?
If the B A row came first, is it OK to keep that row, or would you want to keep the A B row? (duplicated() flags only the later occurrences, so whichever row of a pair comes first is the one that is kept.)

This appears to work, at least for this example.

  foo <- t(apply(test, 1, function(x) sort(format(x))))
  test[!duplicated(foo), ]
    a u
  1 A B
  2 A C
  4 B F
  6 D W

Note that the function sorts the formatted value, in case the factor levels are such that they don't sort alphabetically. Notice also that in the result, the second column ('u') is still a factor, and its levels still include 'A', even though 'A' is no longer present in the column. Whether or not that's wanted, I couldn't say.

-Don

-- 
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 11/15/13 1:26 PM, "Hermann Norpois" <hnorp...@gmail.com> wrote:

>Hello,
>
>I am looking for a method to eliminate duplicate rows in a backwards
>manner, for instance:
>I want to keep A B but not B A (see my data.frame test).
>Thanks
>Hermann
>
>> test
>  a u
>1 A B
>2 A C
>3 B A
>4 B F
>5 C A
>6 D W
>> dput (test)
>structure(list(a = structure(c(1L, 1L, 2L, 2L, 3L, 4L), .Label = c("A",
>"B", "C", "D"), class = "factor"), u = structure(c(2L, 3L, 1L,
>4L, 1L, 5L), .Label = c("A", "B", "C", "F", "W"), class = "factor")),
>.Names = c("a",
>"u"), row.names = c(NA, -6L), class = "data.frame")
>
>______________________________________________
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
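
A follow-up sketch on the leftover factor levels mentioned in the reply. This is only one possible way to handle it, assuming the unused levels are not wanted: it rebuilds the example data from the dput output, removes the order-independent duplicates, and then calls droplevels(). The names key and dedup are made up for illustration, and as.character() is used here in place of format() as a simplification.

```r
# Rebuild the example data from the original post; both columns are factors.
test <- data.frame(
  a = factor(c("A", "A", "B", "B", "C", "D")),
  u = factor(c("B", "C", "A", "F", "A", "W"))
)

# Build an order-independent key for each row by sorting its two values
# as character strings, so that "B A" and "A B" give the same key.
key <- t(apply(test, 1, function(x) sort(as.character(x))))

# duplicated() on a matrix compares whole rows and flags later occurrences,
# so the first row of each pair is kept.
dedup <- test[!duplicated(key), ]

# 'A' is still listed as a level of u even though it no longer occurs.
levels(dedup$u)

# Assumption: unused levels are unwanted, so drop them from every factor column.
dedup <- droplevels(dedup)
levels(dedup$u)
```

If the unused levels are harmless for what comes next, the droplevels() step can simply be left out.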