Another possibility is to use strapply from the gsubfn package, which gives a solution that is non-recursive and shorter:

library(gsubfn)

mysort2 <- function(s) {
   L <- strapply(s, "([0-9]+)|([^0-9]+)",
      ~ if (nchar(x)) sprintf("%9d", as.numeric(x)) else y)
   L2 <- t(do.call(cbind, lapply(L, ts)))
   L3 <- replace(L2, is.na(L2), "")
   ord <- do.call(order, as.data.frame(L3, stringsAsFactors = FALSE))
   s[ord]
}

First strapply breaks up each string into a character vector
of the numeric and non-numeric components. We pad each numeric
component on the left with spaces using sprintf so they are all 9 wide.
The next line turns that into a matrix L2 and then we replace the NAs
giving L3. Finally we order it and apply the ordering, ord, to get
the sorted version.

The gsubfn home page is at: http://gsubfn.googlecode.com

Here is some sample output:

> mysort2(s)
[1] "var2" "var10a2" "x1a" "x1b" "x02" "x02a" "x02b" "y1a1" "y2" "y10" "y10a1" "y10a2" "y10a10"
> mysort(s)
[1] "var2" "var10a2" "x1a" "x1b" "x02" "x02a" "x02b" "y1a1" "y2" "y10" "y10a1" "y10a2" "y10a10"
> mysort2(t)
[1] "q2.1.1" "q10.1.1" "q10.2.1" "q10.10.2"
> mysort(t)
[1] "q2.1.1" "q10.1.1" "q10.2.1" "q10.10.2"

On Sun, Dec 21, 2008 at 9:57 PM, John Fox <j...@mcmaster.ca> wrote:
> Dear Gabor,
>
> Thanks for this -- I was unaware of mixedsort(). As you point out,
> however, mixedsort() doesn't cover all of the cases in which I'm
> interested and which are handled by mysort().
>
> Regards,
> John
>
> On Sun, 21 Dec 2008 20:51:17 -0500
> "Gabor Grothendieck" <ggrothendi...@gmail.com> wrote:
>> mixedsort in gtools will give the same result as mysort(s) but
>> differs in the case of t.
>>
>> On Sun, Dec 21, 2008 at 8:33 PM, John Fox <j...@mcmaster.ca> wrote:
>> > Dear r-helpers,
>> >
>> > I'm looking for a way of sorting variable names in a "natural" order, when
>> > the names are composed of digits and other characters.
>> > I know that this is a
>> > vague idea, and that sorting character strings is a complex topic, but
>> > perhaps a couple of examples will clarify what I mean:
>> >
>> >> s <- c("x1b", "x1a", "x02b", "x02a", "x02", "y1a1", "y10a2",
>> > + "y10a10", "y10a1", "y2", "var10a2", "var2", "y10")
>> >
>> >> sort(s)
>> >  [1] "var10a2" "var2"    "x02"     "x02a"    "x02b"    "x1a"
>> >  [7] "x1b"     "y10"     "y10a1"   "y10a10"  "y10a2"   "y1a1"
>> > [13] "y2"
>> >
>> >> mysort(s)
>> >  [1] "var2"    "var10a2" "x1a"     "x1b"     "x02"     "x02a"
>> >  [7] "x02b"    "y1a1"    "y2"      "y10"     "y10a1"   "y10a2"
>> > [13] "y10a10"
>> >
>> >> t <- c("q10.1.1", "q10.2.1", "q2.1.1", "q10.10.2")
>> >
>> >> sort(t)
>> > [1] "q10.1.1"  "q10.10.2" "q10.2.1"  "q2.1.1"
>> >
>> >> mysort(t)
>> > [1] "q2.1.1"   "q10.1.1"  "q10.2.1"  "q10.10.2"
>> >
>> > Here, sort() is the standard R function and mysort() is a replacement, which
>> > sorts the names into the order that seems natural to me, at least in the
>> > cases that I've tried:
>> >
>> > mysort <- function(x){
>> >   sort.helper <- function(x){
>> >     prefix <- strsplit(x, "[0-9]")
>> >     prefix <- sapply(prefix, "[", 1)
>> >     prefix[is.na(prefix)] <- ""
>> >     suffix <- strsplit(x, "[^0-9]")
>> >     suffix <- as.numeric(sapply(suffix, "[", 2))
>> >     suffix[is.na(suffix)] <- -Inf
>> >     remainder <- sub("[^0-9]+", "", x)
>> >     remainder <- sub("[0-9]+", "", remainder)
>> >     if (all (remainder == "")) list(prefix, suffix)
>> >     else c(list(prefix, suffix), Recall(remainder))
>> >   }
>> >   ord <- do.call("order", sort.helper(x))
>> >   x[ord]
>> > }
>> >
>> > I have a couple of applications in mind, one of which is recognizing
>> > repeated-measures variables in "wide" longitudinal datasets, which often are
>> > named in the form x1, x2, ... , xn.
>> >
>> > mysort(), which works by recursively slicing off pairs of non-digit and
>> > digit strings, seems more complicated than it should have to be, and I
>> > wonder whether anyone has a more elegant solution.
>> > I don't think that
>> > efficiency is a serious issue for the applications I'm considering, but of
>> > course a more efficient solution would be of interest.
>> >
>> > Thanks,
>> > John
>> >
>> > ------------------------------
>> > John Fox, Professor
>> > Department of Sociology
>> > McMaster University
>> > Hamilton, Ontario, Canada
>> > web: socserv.mcmaster.ca/jfox
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>
> --------------------------------
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
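
P.S. For anyone who would rather not depend on gsubfn, the pad-and-order idea behind mysort2 above can probably be sketched in base R as well. This is only a rough sketch under some assumptions: the name natural_sort and its width argument are made up for illustration, every digit run is assumed to fit in an integer and in width characters, the input is assumed to contain no NAs, and order() is called with method = "radix" so that the padding spaces sort before digits regardless of locale.

```r
# Hypothetical helper (not part of gsubfn or gtools); base R only.
natural_sort <- function(s, width = 9) {
  # Split each string into alternating digit and non-digit runs.
  tokens <- regmatches(s, gregexpr("[0-9]+|[^0-9]+", s))
  # Left-pad digit runs with spaces so they compare by numeric value.
  # Assumption: each digit run fits in an integer and in `width` characters.
  tokens <- lapply(tokens, function(tok) {
    is_num <- grepl("^[0-9]+$", tok)
    tok[is_num] <- sprintf(paste0("%", width, "d"), as.integer(tok[is_num]))
    tok
  })
  # Nothing to sort on if there are no tokens at all.
  n <- max(0L, lengths(tokens))
  if (n == 0) return(s)
  # Build one sort key per token position, using "" where a string has
  # fewer tokens than the longest one.
  keys <- lapply(seq_len(n), function(i) {
    vapply(tokens, function(tok) if (i <= length(tok)) tok[i] else "",
           character(1))
  })
  # Order by the first token, then the second, and so on (byte order).
  s[do.call(order, c(keys, list(method = "radix")))]
}
```

With the s and t from the original post, natural_sort(s) and natural_sort(t) should give the same orderings as mysort2. Note that "x02" and "x2" would compare as equal here, since leading zeros are dropped when the digits are converted.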