You do have to read a little further on the help page to confirm that it is the later occurrence of a duplicated element that is removed, not the earlier one, which is what tells you that the order is preserved:

"Note that unlike the Unix command uniq this omits duplicated and not
just repeated elements/rows.  That is, an element is omitted if it is
identical to any previous element and not just if it is the same as
the immediately previous one."

This does make it clear that the original order is preserved, since it
is the succeeding elements that are removed.  So from this, I assume
that 'unique(c(x,y))', as used by union, does preserve the original
ordering of the elements.

On Sun, Nov 23, 2008 at 2:36 AM, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> On Sun, 23 Nov 2008, jim holtman wrote:
>
>> You are right.  union used 'unique(c(x,y))' and I am not sure if
>> 'unique' preserves the order, but the help page seems to indicate that
>> "an element is omitted if it is identical to any previous element ";
>> this might mean that the order is preserved.
>
> It says
>
>   'unique' returns a vector, data frame or array like 'x' but with
>   duplicate elements/rows removed.
>
> Although it is a generic function, it is hard to see how that can be
> interpreted to allow the order to be changed.
>
> The claim that union would be more efficiently implemented via sorting is
> made with no evidence: do look up a basic computer science textbook for this
> kind of thing, as well as how R actually does it.  (Also 'efficient' was not
> defined: both speed and memory usage are potentially measures of
> efficiency.)  But for example
>
> > x <- rnorm(1e7)
> > system.time(unique(x))
>    user  system elapsed
>   2.258   0.261   2.523
>
> > system.time(sort(x))
>    user  system elapsed
>   4.102   0.112   4.231
>
> > system.time(sort(x, method="quick"))
>    user  system elapsed
>   1.928   0.109   2.047
>
> will indicate that unique() is comparable in speed to sorting.
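
The ordering can also be checked directly rather than inferred from the help text. The following is only a small sketch with made-up vectors (lx and ly are illustrative names, not data from this thread), and it demonstrates the behaviour on one input rather than proving a documented guarantee:

```r
# Made-up example vectors; not taken from the thread.
lx <- c("b", "a", "c", "a", "b")
ly <- c("d", "a", "e")

# unique() keeps the first occurrence of each value and drops later ones.
u <- unique(lx)

# union() on plain vectors behaves like unique(c(lx, ly)).
un <- union(lx, ly)

print(u)    # "b" "a" "c"
print(un)   # "b" "a" "c" "d" "e"

# The same result can be built explicitly by dropping later duplicates.
stopifnot(identical(u, lx[!duplicated(lx)]))
stopifnot(identical(un, unique(c(lx, ly))))
```

Note that the call has to be unique(c(lx, ly)): in unique(lx, ly) the second argument would be taken as 'incomparables', not as more data.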
>
>>
>> On Sat, Nov 22, 2008 at 11:43 PM, Stavros Macrakis
>> <[EMAIL PROTECTED]> wrote:
>>>
>>> On Sat, Nov 22, 2008 at 10:20 AM, jim holtman <[EMAIL PROTECTED]> wrote:
>>>>
>>>> c.Factor <-
>>>> function (x, y)
>>>> {
>>>>     newlevels = union(levels(x), levels(y))
>>>>     m = match(levels(y), newlevels)
>>>>     ans = c(unclass(x), m[unclass(y)])
>>>>     levels(ans) = newlevels
>>>>     class(ans) = "factor"
>>>>     ans
>>>> }
>>>
>>> This algorithm depends crucially on union preserving the order of the
>>> elements of its arguments.  As far as I can tell, the spec of union
>>> does not require this.  If union were to (for example) sort its
>>> arguments then merge them (generally a more efficient algorithm), this
>>> function would no longer work.
>>>
>>> Fortunately, the fix is simple.  Instead of union, use:
>>>
>>>     newlevels <- c(levels(x),setdiff(levels(y),levels(x)))
>>>
>>> which is guaranteed to preserve the order of levels(x).
>>>
>>>           -s
>>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
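
For reference, here is a minimal sketch of the order-preserving variant suggested in the quoted thread, using c(levels(x), setdiff(levels(y), levels(x))) in place of union. The name combine_factors and the example factors are made up for illustration, and the last step rebuilds the result with factor() instead of assigning levels and class by hand, which is a substitution of mine and not the code as posted:

```r
# Hypothetical helper; the function in the thread was called c.Factor.
combine_factors <- function(x, y) {
  # Levels of x come first, followed by levels of y not already present.
  newlevels <- c(levels(x), setdiff(levels(y), levels(x)))
  # Position of each level of y within the combined level set.
  m <- match(levels(y), newlevels)
  # Integer codes of x as they are, codes of y remapped through m.
  codes <- c(unclass(x), m[unclass(y)])
  # Assumption: rebuild with factor() rather than setting class directly.
  factor(newlevels[codes], levels = newlevels)
}

# Made-up factors whose levels are deliberately not in sorted order.
fx <- factor(c("b", "a", "b"), levels = c("b", "a"))
fy <- factor(c("c", "a"), levels = c("c", "a"))

z <- combine_factors(fx, fy)
print(levels(z))        # "b" "a" "c"
print(as.character(z))  # "b" "a" "b" "c" "a"
```

Because levels(fx) is placed first by c(), its order is kept regardless of how setdiff orders its own result.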