I was trying to set up an interactive R prompt that shows the current working
directory. I reviewed the R sources 'main.c' and 'options.c', and saw that a
20-char buffer is used in Browse debugging mode, but that no other validation
is done on the length of the prompt option.
The following hangs R, or takes an extremely long time to return:
# R --vanilla
big <- paste(sample(LETTERS, size = 1e7, replace = TRUE), collapse = "")
options(prompt = big)
Running R under gdb and interrupting to get backtraces shows that 'pushReadLine'
in 'unix/sys-std.c' leads to a chain of libreadline calls, including, in my
case at least, UTF-8 handling and a lot of __strlen_avx2 activity.
'R_PromptString' in 'main.c' should check that the prompt is a reasonable
length, and there should also be a check when the prompt is set in 'options.c'.
This may be a readline bug, too. While I watched the process in 'top', it did
nothing visible for a while: it didn't seem to accumulate much, if any, new
memory, but it did max out one CPU core.
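
A minimal sketch of the kind of length check suggested above, written as
standalone C rather than against the actual R sources. The names
bounded_strlen, copy_prompt_bounded and MAX_PROMPT_LEN are hypothetical, and
the 256-byte cap is an arbitrary assumption, not a value taken from R:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap on prompt size in bytes, including the terminator.
   Arbitrary value chosen for illustration only. */
#define MAX_PROMPT_LEN 256

/* Length of s, but never scans more than max_len bytes, so the cost
   is bounded even for a multi-megabyte string. */
static size_t bounded_strlen(const char *s, size_t max_len)
{
    size_t n = 0;
    while (n < max_len && s[n] != '\0')
        n++;
    return n;
}

/* Copies prompt into dst (capacity dst_size bytes), truncating if needed
   and always NUL-terminating. Returns 1 if the whole prompt fit, 0 if it
   was truncated. Truncation is byte-based, so a real implementation
   would also have to avoid cutting a multibyte UTF-8 character. */
static int copy_prompt_bounded(char *dst, size_t dst_size, const char *prompt)
{
    assert(dst != NULL && prompt != NULL && dst_size > 0);

    size_t n = bounded_strlen(prompt, dst_size); /* n <= dst_size */
    int fits = n < dst_size;
    if (!fits)
        n = dst_size - 1; /* leave room for the terminator */

    memcpy(dst, prompt, n);
    dst[n] = '\0';
    return fits;
}
```

A caller could declare a buffer such as char buf[MAX_PROMPT_LEN] and either
warn or raise an error when copy_prompt_bounded returns 0, so that an
oversized string is never handed on to readline.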
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.3
>
I've searched R-devel and found little discussion of security threats in R. Has
anybody fuzzed R with data or source files? As R grows in popularity, I hope
there is some proactive security work going on, which I understand may not
always best be done on a public mailing list.
Jack Wasey
______________________________________________
R-devel mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel