Thank you all for your input - most appreciated.

Best, Peter
On 21.10.2017 at 07:35, "Rui Barradas" <ruipbarra...@sapo.pt> wrote:

> Hello,
>
> To solve the problem of sorting numbers that have been converted to
> character, there is the stringr package, with its functions str_sort
> and str_order.
>
> library(stringr)
>
> set.seed(2447)
>
> x <- sample(11L)
> sort(as.character(x))
> [1] "1" "10" "11" "2" "3" "4" "5" "6" "7" "8" "9"
>
> str_sort(as.character(x), numeric = TRUE)
> [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11"
>
> str_order(as.character(x), numeric = TRUE)
> #[1] 1 4 11 8 6 5 3 10 9 7 2
>
> i <- str_order(as.character(x), numeric = TRUE)
> as.character(x)[i]
> #[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11"
>
> Unfortunately this does not solve the OP's question: factor(),
> as.factor(), split() and others use the base R sorter, and this can
> only be changed by changing their sources.
>
> Hope this helps,
>
> Rui Barradas
>
> On 21-10-2017 at 00:32, Hervé Pagès wrote:
>
>> Hi,
>>
>> On 10/20/2017 12:53 PM, Peter Meissner wrote:
>>
>>> Thanks for the explanation.
>>>
>>> Still, I think this is surprising behaviour which might be handled
>>> better.
>>
>> Maybe a little surprising, but no more than:
>>
>> > x <- sample(11L)
>>
>> > sort(x)
>> [1] 1 2 3 4 5 6 7 8 9 10 11
>>
>> > sort(as.character(x))
>> [1] "1" "10" "11" "2" "3" "4" "5" "6" "7" "8" "9"
>>
>> The fact that sort(), as.factor(), split() and many other things behave
>> consistently with respect to the underlying order of character vectors
>> avoids other even bigger surprises.
>>
>> Also note that the underlying order of character vectors actually
>> depends on your locale. One way to guarantee consistent results across
>> platforms/locales is by explicitly specifying the levels when making
>> a factor, e.g.
>>
>> f <- factor(x, levels=unique(x))
>> split(1:11, f)
>>
>> This is particularly sensible when writing unit tests.
>>
>> Cheers,
>> H.
>>
>>> Best, Peter
>>>
>>> On 20.10.2017 at 9:49 PM, "Iñaki Úcar" <i.uca...@gmail.com> wrote:
>>>
>>>> Hi Peter,
>>>>
>>>> 2017-10-20 21:33 GMT+02:00 Peter Meissner <retep.meiss...@gmail.com>:
>>>>
>>>>> Hey,
>>>>>
>>>>> I found this - for me - quite surprising and puzzling behaviour of
>>>>> split().
>>>>>
>>>>> split(1:11, as.character(1:11))
>>>>> split(1:11, 1:11)
>>>>>
>>>>> When splitting by numerics everything works as expected - sorting of
>>>>> input == sorting of output -- but when using a character vector
>>>>> everything gets re-sorted alphabetically.
>>>>>
>>>>> Although there are some references in the help files to what happens
>>>>> when using split, I did not find any note on this - for me - rather
>>>>> unexpected behaviour.
>>>>
>>>> As the documentation states,
>>>>
>>>> f: a ‘factor’ in the sense that ‘as.factor(f)’ defines the
>>>>    grouping, or a list of such factors in which case their
>>>>    interaction is used for the grouping.
>>>>
>>>> And, in fact,
>>>>
>>>> > as.factor(1:11)
>>>> [1] 1 2 3 4 5 6 7 8 9 10 11
>>>> Levels: 1 2 3 4 5 6 7 8 9 10 11
>>>>
>>>> > as.factor(as.character(1:11))
>>>> [1] 1 2 3 4 5 6 7 8 9 10 11
>>>> Levels: 1 10 11 2 3 4 5 6 7 8 9
>>>>
>>>> Regards,
>>>> Iñaki
>>>>
>>>>> I would like it best if the sorting of split results stayed the
>>>>> same no matter the input (sorting of input == sorting of output).
>>>>>
>>>>> If that is not possible, a note of caution in the help pages and
>>>>> maybe an example might be valuable.
>>>>>
>>>>> Best, Peter

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
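
For reference, the workaround suggested in the thread (passing explicit levels to factor() before calling split()) can be wrapped up as follows. This is only a minimal sketch, assuming the goal is for groups to appear in the order in which their keys first occur; the helper name split_in_input_order is hypothetical and not part of base R.

```r
# Hypothetical helper (not part of base R): split `x` by `f`, keeping the
# groups in the order in which their keys first appear in `f`.
split_in_input_order <- function(x, f) {
  # Explicit levels stop factor() from sorting the keys.
  split(x, factor(f, levels = unique(f)))
}

keys <- as.character(1:11)

# Default behaviour: the keys are sorted as strings, so "10" comes before "2".
names(split(1:11, keys))

# With explicit levels, the order of the groups follows the input.
names(split_in_input_order(1:11, keys))
```

Because the levels are given explicitly, the result should not depend on the locale, which is the property the thread recommends relying on in unit tests.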