Hi,
On 10/20/2017 12:53 PM, Peter Meissner wrote:
Thanks for the explanation.
Still, I think this is surprising behaviour which might be handled better.
Maybe a little surprising, but no more than:
> x <- sample(11L)
> sort(x)
[1] 1 2 3 4 5 6 7 8 9 10 11
> sort(as.character(x))
[1] "1" "10" "11" "2" "3" "4" "5" "6" "7" "8" "9"
The fact that sort(), as.factor(), split() and many other things behave
consistently with respect to the underlying order of character vectors
avoids other even bigger surprises.
Also note that the underlying order of character vectors actually
depends on your locale. One way to guarantee consistent results across
platforms/locales is by explicitly specifying the levels when making
a factor, e.g.:
f <- factor(x, levels=unique(x))
split(1:11, f)
This is particularly sensible when writing unit tests.
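To make the effect of explicit levels concrete, here is a minimal sketch. The vector g and the names f_seen and f_num are illustrative only and do not come from the thread; the numeric variant assumes every label parses as a number.

```r
# Illustrative grouping values, deliberately not in sorted order.
g <- as.character(c(3, 10, 1, 2, 10, 3))

# Default: levels come from sorting the character values, so the
# result depends on the collation rules of the current locale.
names(split(seq_along(g), g))

# Levels in order of first appearance: output order follows input order.
f_seen <- factor(g, levels = unique(g))
names(split(seq_along(g), f_seen))
# [1] "3"  "10" "1"  "2"

# Levels in numeric order (assumes all labels convert cleanly to numbers).
u <- unique(g)
f_num <- factor(g, levels = u[order(as.numeric(u))])
names(split(seq_along(g), f_num))
# [1] "1"  "2"  "3"  "10"
```

Either variant fixes the order of the result independently of the locale, which is what makes it suitable for unit tests.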
Cheers,
H.
Best, Peter
On 20.10.2017 at 9:49 PM, "Iñaki Úcar" <i.uca...@gmail.com> wrote:
Hi Peter,
2017-10-20 21:33 GMT+02:00 Peter Meissner <retep.meiss...@gmail.com>:
Hey,
I found this - for me - quite surprising and puzzling behaviour of
split().
split(1:11, as.character(1:11))
split(1:11, 1:11)
When splitting by numerics everything works as expected (sorting of
input == sorting of output), but when using a character vector everything
gets re-sorted alphabetically.
Although there are some references in the help files to what happens
when using split(), I did not find any note on this, for me, rather
unexpected behaviour.
As the documentation states,
f: a ‘factor’ in the sense that ‘as.factor(f)’ defines the
grouping, or a list of such factors in which case their
interaction is used for the grouping.
And, in fact,
as.factor(1:11)
[1] 1 2 3 4 5 6 7 8 9 10 11
Levels: 1 2 3 4 5 6 7 8 9 10 11
as.factor(as.character(1:11))
[1] 1 2 3 4 5 6 7 8 9 10 11
Levels: 1 10 11 2 3 4 5 6 7 8 9
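A quick way to check that the order of the split() result simply follows these factor levels, as a minimal sketch (the names g, res_chr and res_int are only illustrative and not from the original report):

```r
g <- as.character(1:11)

# Split by the character vector and by the integer vector.
res_chr <- split(1:11, g)
res_int <- split(1:11, 1:11)

# In both cases the names of the result are exactly the levels
# that as.factor() assigns to the grouping vector.
identical(names(res_chr), levels(as.factor(g)))
# [1] TRUE
identical(names(res_int), levels(as.factor(1:11)))
# [1] TRUE
```

So the difference between the two calls comes entirely from how as.factor() orders the levels of integer versus character input.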
Regards,
Iñaki
I would like it best if the order of split() results stayed the same no
matter the input (sorting of input == sorting of output).
If that is not possible, a note of caution in the help pages and maybe an
example might be valuable.
Best, Peter
[[alternative HTML version deleted]]
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319