On Sun, Nov 23, 2014 at 4:07 PM, Henrik Bengtsson
<h...@biostat.ucsf.edu> wrote:
> On Sat, Nov 15, 2014 at 1:47 AM, Duncan Murdoch
> <murdoch.dun...@gmail.com> wrote:
>>
>> On 14/11/2014, 9:06 PM, Henrik Bengtsson wrote:
>> > I've meant to ask the following for several years now.  I understand why:
>> >
>> >> foo <- function(x, dim=dim) { dim }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>> >
>> > gives an error, but why wouldn't/couldn't the following work?
>> >
>> >> foo <- function(x, dim=dim(x)) { dim }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>>
>> You refer to "dim".  There's a dim defined in the argument list, so R
>> uses that definition of it.
>>
>> But you didn't supply any value, so it tries to evaluate the default
>> value.  Default expressions are always evaluated in the evaluation frame
>> of the function call, so it looks for a function named "dim" in the
>> local frame.
>>
>> It finds the argument in the local frame, so it tries to figure out if
>> it is a function or a value.  It needs to evaluate it to do that, and
>> you get the recursion.
>>
>> >
>> > As a workaround I also tried:
>> >
>> >> foo <- function(x, dim) { if (missing(dim)) dim <- dim(x); dim }
>> >> foo(1)
>> > Error in foo(1) : argument "dim" is missing, with no default
>> >
>> > which surprised me too.
>> >
>> >
>> > For the first case, is the rationale related to:
>> >
>> >> foo <- function(x, a=dim(x), dim) { a }
>> >> foo(1)
>> > Error in foo(1) : argument "dim" is missing, with no default
>> >
>> > and
>> >
>> >> foo <- function(x, a=dim(x), dim=a) { a }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>> >
>> > [since here argument 'dim' could take a function, e.g. foo(1,
>> > dim=length)], and that R treats
>> >
>> > foo <- function(x, dim=dim(x)) { dim }
>> >
>> > in a similar way?  That is, is R not "clever" enough to detect this as
>> > a special case, but instead goes ahead and tries to evaluate the
>> > default expression (=dim(x)) of argument 'dim' in order to get its
>> > default value?  If so, is there anything preventing R from supporting
>> > this "special case", e.g. by evaluating the default expression without
>> > argument/symbol 'dim' itself being in the picture to avoid "it finds
>> > itself"?  (Sorry if I'm using the incorrect words here).
>>
>> No, it shouldn't do that.  It should use consistent rules for evaluation
>> or there would be sure to be bugs.
>>
>> >
>> > Yes, I understand that I can do:
>> >
>> >> foo <- function(x, dim=base::dim(x)) { dim }
>>
>> This is what you should do.
>>
>> >> foo(1)
>> > NULL
>> >
>> >> foo <- function(x, dim=NULL) { if (is.null(dim)) dim <- dim(x); dim }
>>
>> This works, because when R is looking up the function dim(), it can
>> evaluate the local argument dim and see it is not a function, so it
>> proceeds to the parent frame.
>>
>> >> foo(1)
>> > NULL
>> >
>> > or
>> >
>> >> foo <- function(x, dim.=dim(x)) { dim. }
>> >> foo(1)
>> > NULL
>>
>> This is another solution that works, but it has the ugly argument name
>> now, so you'll get warnings during package checks from calls like
>>
>> foo(1, dim=2)
>>
>> >
>> > but I would prefer not to have to turn to those rather ad hoc
>> > solutions in my code.
>>
>> Nothing ad hoc about the first one.
>
> Thanks for the feedback.  I agree that base::dim(x) is clean and
> clear, but unfortunately there is a ~500 times overhead in using '::'.
> Since I went through the effort of doing the benchmarking and finding
> faster solutions, I'm sharing the following:
>
>> library("microbenchmark")
>
>> x <- matrix(1:(80*80), nrow=80)
>
>> # Not "legal", because it calls .Primitive().
>> dim_illegal <- base::dim
>
>> dim_R <- function(x) {
> +   ns <- getNamespace("base")
> +   dim <- get("dim", envir=ns, inherits=FALSE, mode="function")
> +   dim(x)
> + }
>
>> dim_R_memoized <- local({
> +   dim <- NULL
> +   function(x) {
> +     if (is.null(dim)) {
> +       dim <<- get("dim", envir=getNamespace("base"), inherits=FALSE,
> mode="function")
> +     }
> +     dim(x)
> +   }
> + })
>
>> stats <- microbenchmark(
> +   dim(x),
> +   base::dim(x),
> +   dim_R(x),
> +   dim_R_memoized(x),
> +   dim_illegal(x),
> +   sum(x),
> +   unit="ns",
> +   times=10e3
> + )
> Warning message:
> In microbenchmark(dim(x), base::dim(x), dim_R(x), dim_R_memoized(x),  :
>   Could not measure a positive execution time for 3859 evaluations.
>
>> print(stats)
> Unit: nanoseconds
>               expr  min   lq       mean median    uq     max neval cld
>             dim(x)    0    0    25.2226      1     1   10780 10000   a
>       base::dim(x) 6545 7700 10429.0165   8470 12897 2678155 10000   e
>           dim_R(x) 3080 3851  5163.8612   4236  6545   55435 10000   c
>  dim_R_memoized(x)  385  771  1238.8292   1156  1541   44656 10000   b
>     dim_illegal(x)    0    1    51.4421      1     1    5775 10000   a
>             sum(x) 8085 8470  9590.9570   8470 10395   49660 10000   d
>
> Yes, yes, the extra cost of using base::dim(x) is only ~10 us, but if
> you do, say, a million bootstrap samples calling this function, that's
> an extra unnecessary 10 seconds of processing time.  As a comparison,
> the overhead is roughly the same as summing 6400 integers.
>
> For workarounds, I considered:
>
> (a) dim_illegal
> (b) dim_R
> (c) dim_R_memoized
>
> where,
>
> (a) would be "good enough", but can immediately be discarded because
> if used in a package, it will create a copy of base::dim and thereby
> call .Primitive() immediately, which is unsafe.
>
> (b) is a poor man's version that tries to cut the corners of '::'.
> There is still a substantial overhead in each call, but it is a 25-50%
> speedup compared to '::'.
>
> (c) is a smarter version of (b) that does the lookup only once, and
> manages to reduce the overhead to 10% of '::'.  It's still 50 times
> the overhead of a direct dim(x) call.
>
> Since one can byte compile packages (ByteCompile: TRUE in
> DESCRIPTION), I've also played around with compiler::cmpfun() and that
> prunes off about 10% of the non-compiled ditto.  I was
> somewhat/naively hoping that the compiler would be able to compile
> base::dim into a "constant", but that doesn't seem to be the case.
>
>
> BTW, is the following, which is ~2 times as fast as dim_R_memoized(),
> valid in an R package?  Will it set the local 'dim' variable when the
> package is loaded, which I assume is safe/legal, or before?  I didn't
> include it above, because I wasn't sure it was safe/valid.
>
> dim_R_memoized_2 <- local({
>   dim_local <- base::dim
>   function(x) dim_local(x)
> })
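
As an aside on the closure-based variant quoted just above, here is a minimal sketch of when the body of local() runs in a plain R session. The names `state` and `dim_cached` are made up for illustration and are not from the thread. The sketch only shows that the lookup happens once, at the point where the assignment is evaluated; it does not settle whether that point is install time or load time for a package.

```r
# Hypothetical names: `state` and `dim_cached` are illustrative only.
state <- new.env()
state$runs <- 0L

dim_cached <- local({
  # This block is evaluated once, when the assignment to dim_cached runs.
  state$runs <- state$runs + 1L
  dim_local <- base::dim
  # The returned closure keeps `dim_local` in its enclosing environment,
  # so later calls do not repeat the `::` lookup.
  function(x) dim_local(x)
})

m <- matrix(1:6, nrow = 2)
d1 <- dim_cached(m)
d2 <- dim_cached(m)

print(d1)          # dimensions of m
print(state$runs)  # how many times the local() body was evaluated
```

Calling dim_cached() repeatedly leaves state$runs unchanged, which is the property the memoized variants rely on.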
The above question still stands, but otherwise, I overlooked the most
obvious solution:

dim_1 <- function(x) dim(x)

which is favorable when benchmarked (~10 times slower than a direct
dim(x) call but otherwise one of the fastest solutions):

Unit: nanoseconds
                expr   min    lq      mean median    uq     max neval cld
              dim(x)     0     1    72.941      1     1    2696  1000   a
        base::dim(x) 11549 13474 15105.950  14245 15399   60824  1000   c
            dim_1(x)     1   771  2801.544    771  1156 1806225  1000   a
            dim_R(x)  5390  6930  8077.753   7315  8085  249069  1000   b
   dim_R_memoized(x)  1156  1926  2520.119   2310  2695   73528  1000   a
 dim_R_memoized_2(x)   385   771  1089.243    771  1156   20019  1000   a
      dim_illegal(x)     0     1   161.480      1   386    2311  1000   a
              sum(x) 10395 15784 16459.454  15785 16169  114333  1000   c

So, my best shot on the original problem would now be to either use:

dim2 <- function(x) dim(x)
foo <- function(x, dim=dim2(x)) { dim }

or simply avoid the name clash via:

foo <- function(x, dimx=dim(x)) { dimx }

/Henrik

>
> Thanks,
>
> /Henrik
>
>>
>> Duncan Murdoch
>>
>> >
>> > Thanks,
>> >
>> > Henrik

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
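
For reference, the failing default and the two workarounds proposed in this message can be sketched side by side as follows. The names `foo_bad`, `foo_helper`, `foo_renamed`, `res` and `m` are made up for illustration; only `dim2` and the function bodies come from the thread. This is a sketch for a plain R session, not a statement about package checks.

```r
# Self-referencing default: forcing `dim` triggers a lookup of the
# function `dim`, which finds the argument's own unevaluated promise.
foo_bad <- function(x, dim = dim(x)) dim
res <- tryCatch(foo_bad(1), error = function(e) e)

# Workaround 1: a helper defined outside foo's frame, so the call
# inside the default no longer uses the name `dim`.
dim2 <- function(x) dim(x)
foo_helper <- function(x, dim = dim2(x)) dim

# Workaround 2: rename the argument to avoid the name clash.
foo_renamed <- function(x, dimx = dim(x)) dimx

m <- matrix(1:6, nrow = 2)
print(conditionMessage(res))           # message of the error from foo_bad
print(foo_helper(m))                   # default computed via dim2()
print(foo_renamed(m))                  # default computed via dim()
print(foo_helper(m, dim = c(3L, 2L)))  # an explicit value bypasses the default
```

The helper keeps the argument name `dim` in the public signature, while the renamed variant changes the interface but needs no extra function.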