mixedsort() in the gtools package gives the same result as mysort(s), but its result for t differs from mysort(t).
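For anyone who wants to check this on their own machine, here is a minimal sketch. It assumes gtools is installed from CRAN, and the names sorted_s and sorted_t are just illustrative. I have left the printed output out on purpose; compare it with the mysort() results quoted below.

```r
# Example vectors copied from the quoted message below
s <- c("x1b", "x1a", "x02b", "x02a", "x02", "y1a1", "y10a2",
       "y10a10", "y10a1", "y2", "var10a2", "var2", "y10")
t <- c("q10.1.1", "q10.2.1", "q2.1.1", "q10.10.2")

# Assumption: gtools is available; the guard skips the call otherwise
if (requireNamespace("gtools", quietly = TRUE)) {
  sorted_s <- gtools::mixedsort(s)  # reported above to match mysort(s)
  sorted_t <- gtools::mixedsort(t)  # reported above to differ from mysort(t)
  print(sorted_s)
  print(sorted_t)
} else {
  message("gtools is not installed; try install.packages(\"gtools\")")
}
```

If the ordering of t does differ for you, one plausible cause (an assumption on my part, not checked against the gtools source) is that mixedsort() reads "10.1" as a single decimal number rather than as two integer fields separated by a dot.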
On Sun, Dec 21, 2008 at 8:33 PM, John Fox <j...@mcmaster.ca> wrote:
> Dear r-helpers,
>
> I'm looking for a way of sorting variable names in a "natural" order, when
> the names are composed of digits and other characters. I know that this is a
> vague idea, and that sorting character strings is a complex topic, but
> perhaps a couple of examples will clarify what I mean:
>
>> s <- c("x1b", "x1a", "x02b", "x02a", "x02", "y1a1", "y10a2",
> +   "y10a10", "y10a1", "y2", "var10a2", "var2", "y10")
>
>> sort(s)
>  [1] "var10a2" "var2"    "x02"     "x02a"    "x02b"    "x1a"
>  [7] "x1b"     "y10"     "y10a1"   "y10a10"  "y10a2"   "y1a1"
> [13] "y2"
>
>> mysort(s)
>  [1] "var2"    "var10a2" "x1a"     "x1b"     "x02"     "x02a"
>  [7] "x02b"    "y1a1"    "y2"      "y10"     "y10a1"   "y10a2"
> [13] "y10a10"
>
>> t <- c("q10.1.1", "q10.2.1", "q2.1.1", "q10.10.2")
>
>> sort(t)
> [1] "q10.1.1"  "q10.10.2" "q10.2.1"  "q2.1.1"
>
>> mysort(t)
> [1] "q2.1.1"   "q10.1.1"  "q10.2.1"  "q10.10.2"
>
> Here, sort() is the standard R function and mysort() is a replacement, which
> sorts the names into the order that seems natural to me, at least in the
> cases that I've tried:
>
> mysort <- function(x){
>   sort.helper <- function(x){
>     prefix <- strsplit(x, "[0-9]")
>     prefix <- sapply(prefix, "[", 1)
>     prefix[is.na(prefix)] <- ""
>     suffix <- strsplit(x, "[^0-9]")
>     suffix <- as.numeric(sapply(suffix, "[", 2))
>     suffix[is.na(suffix)] <- -Inf
>     remainder <- sub("[^0-9]+", "", x)
>     remainder <- sub("[0-9]+", "", remainder)
>     if (all (remainder == "")) list(prefix, suffix)
>     else c(list(prefix, suffix), Recall(remainder))
>   }
>   ord <- do.call("order", sort.helper(x))
>   x[ord]
> }
>
> I have a couple of applications in mind, one of which is recognizing
> repeated-measures variables in "wide" longitudinal datasets, which often are
> named in the form x1, x2, ... , xn.
>
> mysort(), which works by recursively slicing off pairs of non-digit and
> digit strings, seems more complicated than it should have to be, and I
> wonder whether anyone has a more elegant solution. I don't think that
> efficiency is a serious issue for the applications I'm considering, but of
> course a more efficient solution would be of interest.
>
> Thanks,
> John
>
> ------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.