[Rd] Why is as.function() slower than eval(call("function"())?

2017-08-04 Thread Gregory Werbin
(Apologies if this is better suited for R-help.)

On my system (macOS Sierra, late 2014 MacBook Pro; R 3.4.1, Homebrew build), I 
found that it is faster to construct a function using eval(call("function", 
...)) than using as.function(list(...)). Example:

make_fn_1 <- function(a, b) eval(call("function", a, b), env = 
parent.frame())
make_fn_2 <- function(a, b) as.function(c(a, list(b)), env = parent.frame())

a <- as.pairlist(alist(x = , y = ))
b <- quote(x + y)

library("microbenchmark")
microbenchmark(make_fn_1(a, b), make_fn_2(a, b))

# Unit: microseconds
#             expr   min     lq    mean median     uq    max neval cld
#  make_fn_1(a, b) 1.671 1.8855 2.13297  2.039 2.1950  9.852   100  a
#  make_fn_2(a, b) 3.541 3.7230 4.13400  3.906 4.1055 23.153   100   b

At first I thought the gap was due to the overhead of calling c(a, list(b)). 
But that turns out to account for only part of it:

make_fn_weird <- function(a, b) as.function(c(a, b), env = parent.frame())
b_wrapped <- list(b)

make_fn_weirder <- function(a_b) as.function(a_b, env = parent.frame())
a_b <- c(a, b_wrapped)

microbenchmark(make_fn_1(a, b), make_fn_2(a, b),
   make_fn_weird(a, b_wrapped), make_fn_weirder(a_b))

# Unit: microseconds
#                         expr   min     lq    mean median     uq    max neval cld
#              make_fn_1(a, b) 1.718 1.8990 2.12119 1.9860 2.1605  8.057   100 a
#              make_fn_2(a, b) 3.393 3.5865 4.03029 3.6655 3.9615 27.499   100   c
#  make_fn_weird(a, b_wrapped) 3.354 3.5005 3.77190 3.6405 3.9425  6.839   100   c
#         make_fn_weirder(a_b) 2.488 2.6290 2.83352 2.7215 2.8800  7.007   100  b

One IRC user pointed out that as.function() takes its own path through the 
code, namely do_asfunction() (in src/main/coerce.c). What is it about this code 
path that's 50% slower than whatever happens during eval(call("function", a, 
b))?

Obviously this is a trivial micro-optimization and it doesn't matter to 99% of 
users. Mostly asking out of curiosity, but also wondering if there's a more 
general lesson to be learned here.
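
For reference, a minimal sketch checking that the two constructors in the example build equivalent functions, so the benchmark compares like with like. It reuses make_fn_1, make_fn_2, a and b from the example; envir = is spelled out here where the example relies on partial matching of env =, and f1 and f2 are names introduced only for illustration.

```r
# Both constructors from the example, with `envir` spelled out in full.
make_fn_1 <- function(a, b) eval(call("function", a, b), envir = parent.frame())
make_fn_2 <- function(a, b) as.function(c(a, list(b)), envir = parent.frame())

a <- as.pairlist(alist(x = , y = ))  # formals: x and y, no defaults
b <- quote(x + y)                    # body

f1 <- make_fn_1(a, b)  # illustrative names, not from the original post
f2 <- make_fn_2(a, b)

# Same argument names, same body, same result when called.
stopifnot(identical(names(formals(f1)), names(formals(f2))),
          identical(body(f1), body(f2)),
          f1(1, 2) == 3, f2(1, 2) == 3)
```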

Thanks!
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Why is as.function() slower than eval(call("function"())?

2017-08-04 Thread Joshua Ulrich
On Thu, Aug 3, 2017 at 11:32 PM, Gregory Werbin wrote:
> One IRC user pointed out that as.function() takes its own path through the 
> code, namely do_asfunction() (in src/main/coerce.c). What is it about this 
> code path that's 50% slower than whatever happens during 
> eval(call("function", a, b))?
>
> Obviously this is a trivial micro-optimization and it doesn't matter to 99% 
> of users. Mostly asking out of curiosity, but also wondering if there's a 
> more general lesson to be learned here.
>
Agreed that this is minor (~2us), but the majority of the difference
seems to be from S3 method dispatch.  as.function() is generic and has
to dispatch to as.function.default().  The times are very similar if
you call the method directly.

R> make_fn_3 <- function(a, b) as.function.default(c(a, list(b)), env = parent.frame())
R> microbenchmark(make_fn_1(a, b), make_fn_2(a, b), make_fn_3(a, b))
Unit: microseconds
            expr   min     lq     mean median    uq      max neval
 make_fn_1(a, b) 1.615 1.7595 12.78339 1.9115 2.145 1077.657   100
 make_fn_2(a, b) 3.077 3.3390 19.89423 3.5215 3.862 1589.505   100
 make_fn_3(a, b) 1.629 1.7975 15.40389 1.9505 2.227 1335.306   100

Now the difference is <100ns, which is much harder to investigate.
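
The dispatch step described above can be seen directly. A minimal sketch, assuming only base R; lst, f_generic and f_direct are names introduced here for illustration.

```r
# as.function() itself only dispatches: its body is a UseMethod() call.
print(body(as.function))

lst <- alist(x = , y = , x + y)   # formals followed by the body

f_generic <- as.function(lst)          # goes through S3 dispatch
f_direct  <- as.function.default(lst)  # calls the default method directly

# Both routes build a function that behaves the same way.
stopifnot(f_generic(1, 2) == 3, f_direct(1, 2) == 3)
```

Calling the method directly skips only the lookup; the function that comes back is the same either way.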




-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
R/Finance 2017 | www.rinfinance.com



Re: [Rd] Why is as.function() slower than eval(call("function"())?

2017-08-04 Thread Duncan Murdoch

On 04/08/2017 12:32 AM, Gregory Werbin wrote:

> One IRC user pointed out that as.function() takes its own path through the
> code, namely do_asfunction() (in src/main/coerce.c). What is it about this
> code path that's 50% slower than whatever happens during
> eval(call("function", a, b))?
>
> Obviously this is a trivial micro-optimization and it doesn't matter to 99%
> of users. Mostly asking out of curiosity, but also wondering if there's a
> more general lesson to be learned here.


The main difference is that `function` is a primitive, while 
as.function() is a generic.  You will get much closer timing if you skip 
the method dispatch by calling as.function.default() directly.


The next part of the difference is that as.function.default is a regular 
R closure:


as.function.default <- function (x, envir = parent.frame(), ...)
if (is.function(x)) x else .Internal(as.function.default(x, envir))

If I skip the is.function(x) test and call .Internal directly, I find it 
is about 10% faster than `function`.  But that is an extremely risky 
optimization; it wouldn't be accepted in a CRAN package.
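
The primitive/closure distinction described above can be checked from the R prompt. A minimal sketch using only base R introspection; it inspects the functions and does not call .Internal itself.

```r
# `function` is a primitive implemented in C.
is.primitive(`function`)         # TRUE
typeof(`function`)               # "special"

# as.function() and its default method are ordinary R closures.
typeof(as.function)              # "closure"
typeof(as.function.default)      # "closure"

# Printing the method shows the is.function() test around the .Internal call.
print(as.function.default)
```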


Duncan Murdoch



[Rd] `c` with lists with "bytes" names encoding

2017-08-04 Thread brodie gaslam via R-devel
I'm not entirely sure this even qualifies as a bug given how unusual a case it 
is:

> x <- list('a')
> name.x <- '\x81'
> Encoding(name.x) <- 'bytes'
> names(x) <- name.x
> x
$`\\x81`
[1] "a"

> c(x)
Error: translating strings with "bytes" encoding is not allowed
> unlist(x)
Error in unlist(x) : 
  translating strings with "bytes" encoding is not allowed
> letters[name.x]
[1] NA
> letters[[name.x]]
Error: translating strings with "bytes" encoding is not allowed

Presumably the error is coming from `translateCharUTF8` or similar.  I imagine 
the "fix" would be either to not translate bytes-encoded names or to disallow them.

This is not really a problem for me as I can easily work around it.  Just an 
FYI in case it is of interest to you.
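
One way to spot this condition before calling c() or unlist() is to check the encoding mark on the names. A minimal sketch; has_bytes_names is a hypothetical helper, not a base R function, and it is not necessarily the workaround referred to above.

```r
# Hypothetical helper (not part of base R): report whether any name of `x`
# is marked with the "bytes" encoding.
has_bytes_names <- function(x) {
  nms <- names(x)
  !is.null(nms) && any(Encoding(nms) == "bytes")
}

# Same setup as the report above.
x <- list("a")
name.x <- "\x81"
Encoding(name.x) <- "bytes"
names(x) <- name.x

has_bytes_names(x)            # TRUE
has_bytes_names(list(a = 1))  # FALSE
```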

Best regards,

Brodie.
P.S.: sessionInfo():
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] diffobj_0.1.6.9002  unitizer_1.4.2.9003

loaded via a namespace (and not attached):
[1] vetr_0.1.0.9000 compiler_3.4.0  tools_3.4.0     withr_1.0.2
[5] crayon_1.3.2    memoise_1.1.0   digest_0.6.12   devtools_1.12.0
