Hi guys, I am working on some code for automatically translating recursive functions into looping functions to implemented tail-recursion optimisations. See https://github.com/mailund/tailr
As a toy-example, consider the factorial function factorial <- function(n, acc = 1) { if (n <= 1) acc else factorial(n - 1, acc * n) } I can automatically translate this into the loop-version factorial_tr_1 <- function (n, acc = 1) { repeat { if (n <= 1) return(acc) else { .tailr_n <- n - 1 .tailr_acc <- acc * acc n <- .tailr_n acc <- .tailr_acc next } } } which will run faster and not have problems with recursion depths. However, I’m not entirely happy with this version for two reasons: I am not happy with introducing the temporary variables and this rewrite will not work if I try to over-scope an evaluation context. I have two related questions, one related to parallel assignments — i.e. expressions to variables so the expression uses the old variable values and not the new values until the assignments are all done — and one related to restarting a loop from nested loops or from nested expressions in `with` expressions or similar. I can implement parallel assignment using something like rlang::env_bind: factorial_tr_2 <- function (n, acc = 1) { .tailr_env <- rlang::get_env() repeat { if (n <= 1) return(acc) else { rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n) next } } } This reduces the number of additional variables I need to one, but is a couple of orders of magnitude slower than the first version. > microbenchmark::microbenchmark(factorial(100), + factorial_tr_1(100), + factorial_tr_2(100)) Unit: microseconds expr min lq mean median uq max neval factorial(100) 53.978 60.543 77.76203 71.0635 85.947 180.251 100 factorial_tr_1(100) 9.022 9.903 11.52563 11.0430 11.984 28.464 100 factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463 8177.635 100 Is there another way to do parallel assignments that doesn’t cost this much in running time? My other problem is the use of `next`. I would like to combine tail-recursion optimisation with pattern matching as in https://github.com/mailund/pmatch where I can, for example, define a linked list like this: devtools::install_github("mailund/pmatch”) library(pmatch) llist := NIL | CONS(car, cdr : llist) and define a function for computing the length of a list like this: list_length <- function(lst, acc = 0) { force(acc) cases(lst, NIL -> acc, CONS(car, cdr) -> list_length(cdr, acc + 1)) } The `cases` function creates an environment that binds variables in a pattern-description that over-scopes the expression to the right of `->`, so the recursive call in this example have access to the variables `cdr` and `car`. I can transform a `cases` call to one that creates the environment containing the bound variables and then evaluate this using `eval` or `with`, but in either case, a call to `next` will not work in such a context. The expression will be evaluated inside `bind` or `with`, and not in the `list_lenght` function. A version that *will* work, is something like this factorial_tr_3 <- function (n, acc = 1) { .tailr_env <- rlang::get_env() .tailr_frame <- rlang::current_frame() repeat { if (n <= 1) rlang::return_from(.tailr_frame, acc) else { rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n) rlang::return_to(.tailr_frame) } } } Here, again, for the factorial function since this is easier to follow than the list-length function. This solution will also work if you return values from inside loops, where `next` wouldn’t work either. Using `rlang::return_from` and `rlang::return_to` implements the right semantics, but costs me another order of magnitude in running time. microbenchmark::microbenchmark(factorial(100), factorial_tr_1(100), factorial_tr_2(100), factorial_tr_3(100)) Unit: microseconds expr min lq mean median uq max neval factorial(100) 52.479 60.2640 93.43069 67.5130 83.925 2062.481 100 factorial_tr_1(100) 8.875 9.6525 49.19595 10.6945 11.217 3818.823 100 factorial_tr_2(100) 5296.350 5525.0745 5973.77664 5737.8730 6260.128 8471.301 100 factorial_tr_3(100) 77554.457 80757.0905 87307.28737 84004.0725 89859.169 171039.228 100 I can live with the “introducing extra variables” solution to parallel assignment, and I could hack my way out of using `with` or `bind` in rewriting `cases`, but restarting a `repeat` loop would really make for a nicer solution. I know that `goto` is considered harmful, but really, in this case, it is what I want. A `callCC` version also solves the problem factorial_tr_4 <- function(n, acc = 1) { function_body <- function(continuation) { if (n <= 1) { continuation(acc) } else { continuation(list("continue", n = n - 1, acc = acc * n)) } } repeat { result <- callCC(function_body) if (is.list(result) && result[[1]] == "continue") { n <- result$n acc <- result$acc next } else { return(result) } } } But this requires that I know how to distinguish between a valid return value and a tag for “next” and is still a lot slower than the `next` solution microbenchmark::microbenchmark(factorial(100), factorial_tr_1(100), factorial_tr_2(100), factorial_tr_3(100), factorial_tr_4(100)) Unit: microseconds expr min lq mean median uq max neval factorial(100) 54.109 61.8095 81.33167 81.8785 89.748 243.554 100 factorial_tr_1(100) 9.025 9.9035 11.38607 11.1990 12.008 22.375 100 factorial_tr_2(100) 5272.524 5798.3965 6302.40467 6077.7180 6492.959 9967.237 100 factorial_tr_3(100) 66186.080 72336.2810 76480.75172 73632.9665 75405.054 203785.673 100 factorial_tr_4(100) 270.978 302.7890 337.48763 313.9930 334.096 1425.702 100 I don’t necessarily need the tail-recursion optimisation to be faster than the recursive version; just getting out of the problem of too deep recursions is a benefit, but I would rather not pay with an order of magnitude for it. I could, of course, try to handle cases that works with `next` in one way, and other cases using `callCC`, but I feel it should be possible with a version that handles all cases the same way. Is there any way to achieve this? Cheers Thomas ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.