Re: [Rd] environment question
On 10-12-26 4:30 PM, Paul Johnson wrote:

> Hello, everybody.
>
> I'm putting together some lecture notes and course exercises on R
> programming. My plan is to pick some R packages, ask students to read
> through code and see why things work, maybe make some changes. As I
> look for examples, I'm running up against the problem that packages
> use coding idioms that are unfamiliar to me.
>
> A difficult thing for me is explaining the scope of variables in R
> functions. When should we pass an object to a function, and when should
> we let the R system search about for an object? I've been puzzling
> through ?environment for quite a while.

Take a look at the Language Definition, not just the ?environment page.

> Here's an example from one of the packages that I like, called "ltm".
> In the function "ltm.fit" the work of calculating estimates is sent to
> different functions like "EM", "loglikltm" and "scoreltm". Before
> that, this is used:
>
> environment(EM) <- environment(loglikltm) <- environment(scoreltm) <-
>     environment()
>
> ## and then EM is called
> res.EM <- EM(betas, constraint, control$iter.em, control$verbose)
>
> I want to make sure I understand this. The environment line gets the
> current environment and then assigns it to those 3 functions, right?
> All variables and functions that can be accessed from the current
> position in the code become available to the functions EM, loglikltm,
> and scoreltm.

That's one way to think of it, but it is slightly more accurate to say
that three new functions are created, whose associated environments are
set to the current environment.

> So, which options should be explicitly inserted into a function call,
> and which should be left in the environment for R to find when it
> needs them?

That's a matter of style. I would say that it is usually better style
not to mess around with a function's environment.

> 1. I *think* that when EM is called, the variables "betas",
> "constraint", and "control" are already in the environment.

That need not be true, as long as they are in the environment by the
time EM, loglikltm, and scoreltm are called.

> The EM function is declared like this, using the same words "betas"
> and "constraint":
>
> EM <- function (betas, constraint, iter, verbose = FALSE) {
>
> It seems to me that if I wrote the function call like this (leaving
> out "betas" and "constraint")
>
> res.EM <- EM(control$iter.em, control$verbose)
>
> R will run EM and go find "betas" and "constraint" in the environment;
> there was no need to name them as arguments.

Including them as arguments means that new local copies will be created
in the evaluation frame.

> 2. Is a function like EM allowed to alter objects that it finds
> through the environment, ones that are not passed as arguments? I
> understand that a function cannot alter an object that is passed
> explicitly, but what about the ones it grabs from the environment?

Yes, it's allowed, but the usual rules of assignment won't do it. Read
about the <<- operator for modifying things that are not local. In
summary: beta <- 1 creates or modifies a local variable, while
beta <<- 1 goes looking for beta in enclosing environments and modifies
the first one it finds. If it fails to find one, it creates one in the
global environment.

Duncan Murdoch

> If you have ideas about packages that might be handy teaching
> examples, please let me know.
>
> pj

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
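[Editor's note: a minimal, self-contained sketch of the two mechanisms discussed in this thread, namely reassigning a function's environment so it can find another function's local variables, and using <<- to modify a non-local binding. The names helper, run_fit, and counter are illustrative only; they are not from the ltm package.]

```r
## A free variable: 'betas' is neither an argument nor local to helper().
helper <- function() sum(betas)

run_fit <- function() {
  betas <- c(0.2, 0.8)
  ## The ltm.fit idiom: point helper's environment at the current frame,
  ## so its free variable 'betas' is found here. (This creates a modified
  ## local copy of helper; the global helper is untouched.)
  environment(helper) <- environment()
  helper()
}
run_fit()          # 1

## '<<-' assigns to a binding in an enclosing environment instead of
## creating a new local one.
counter <- function() {
  n <- 0
  function() {
    n <<- n + 1    # updates the 'n' in counter()'s frame
    n
  }
}
tick <- counter()
tick()             # 1
tick()             # 2
```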
Re: [Rd] minor problem in strsplit help file
>>>>> "PatB" == Patrick Burns
>>>>>     on Fri, 24 Dec 2010 10:27:23 writes:

    PatB> The 'extended' argument to 'strsplit' has been
    PatB> removed, but it is still mentioned in the argument
    PatB> items in the help file for 'fixed' and 'perl'.

Indeed; thank you, Pat! I've committed a fix.

Martin

    PatB> --
    PatB> Patrick Burns
    PatB> pbu...@pburns.seanet.com
    PatB> twitter: @portfolioprobe
    PatB> http://www.portfolioprobe.com/blog
    PatB> http://www.burns-stat.com
    PatB> (home of 'Some hints for the R beginner' and 'The R Inferno')
[Rd] aperm() should retain class of input object
aperm() was designed for multidimensional arrays, but is also useful for table objects, particularly with the lattice, vcd and vcdExtra packages. But aperm() was designed and implemented before other related object classes were conceived, and I propose a small tune-up to make it more generally useful.

The problem is that aperm() always returns an object of class 'array', which causes problems for methods designed for table objects. It also requires some package writers to implement both .array and .table methods for the same functionality, usually one in terms of the other. Some examples of unexpected, and initially perplexing, results (when methods for only one class are implemented) are shown below.

> library(vcd)
> pairs(UCBAdmissions, shade=TRUE)
> UCB <- aperm(UCBAdmissions, c(2, 1, 3))
>
> # UCB is now an array, not a table
> pairs(UCB, shade=TRUE)
There were 50 or more warnings (use warnings() to see the first 50)
>
> # fix it, to get pairs.table
> class(UCB) <- "table"
> pairs(UCB, shade=TRUE)

Of course, I can define a new function, tperm(), that does what I think should be the expected behavior:

# aperm, for table objects
tperm <- function(a, perm, resize = TRUE) {
    result <- aperm(a, perm, resize)
    class(result) <- class(a)
    result
}

But I think it is more natural to include this functionality in aperm() itself. Thus, I propose the following revision of base::aperm(), at the R level:

aperm <- function (a, perm, resize = TRUE, keep.class = TRUE) {
    if (missing(perm)) perm <- integer(0L)
    result <- .Internal(aperm(a, perm, resize))
    if (keep.class) class(result) <- class(a)
    result
}

I don't think this would break any existing code, except where someone depended on the coercion to an array. A drop-in replacement for aperm() would set keep.class=FALSE by default, but I think TRUE is more natural.

FWIW, here are the methods for table and array objects from my current (non-representative) session.
> methods(class="table")
 [1] as.data.frame.table barchart.table*     cloud.table*        contourplot.table*  dotplot.table*
 [6] head.table*         levelplot.table*    pairs.table*        plot.table*         print.table
[11] summary.table       tail.table*

   Non-visible functions are asterisked

> methods(class="array")
[1] anyDuplicated.array as.data.frame.array as.raster.array*    barchart.array*     contourplot.array*  dotplot.array*
[7] duplicated.array    levelplot.array*    unique.array

--
Michael Friendly            Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University             Voice: 416 736-5115 x66249  Fax: 416 736-5814
4700 Keele Street           Web:   http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
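[Editor's note: a runnable sketch of the problem and the tperm() workaround from this post, using a small table built here rather than the vcd examples, which need the vcd package. In later versions of R this proposal was essentially adopted: aperm() gained a table method with a keep.class argument.]

```r
## A small 2 x 2 contingency table.
tab <- table(gender = c("M", "F", "M", "F"),
             admit  = c("yes", "yes", "no", "no"))

## The wrapper proposed in the post: permute, then restore the class
## that base aperm() (at the time) dropped in favor of "array".
tperm <- function(a, perm, resize = TRUE) {
  result <- aperm(a, perm, resize)
  class(result) <- class(a)
  result
}

tab2 <- tperm(tab, c(2, 1))   # transpose the two dimensions
inherits(tab2, "table")       # TRUE: table methods still dispatch
```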
Re: [Rd] RFC: sapply() limitation from vector to matrix, but not further
Finally finding time to come back to this. Remember that I started the thread by proposing a version of sapply() which does not just "stop" with making a matrix() from the lapply() result, but instead --- only when the new argument ARRAY = TRUE is set --- may return an array() of any (appropriate) order, in those cases where the lapply() result elements all return an array of the same dim().

On Wed, Dec 1, 2010 at 19:51, Hadley Wickham wrote:
>> A downside of that approach is that lapply(X, ...) can
>> cause a lot of unneeded memory to be allocated (length(X)
>> SEXPs). Those SEXPs would be tossed out by simplify() but
>> the peak memory usage would remain high. sapply() can
>> be written to avoid the intermediate list structure.
>
> But the upside is reusable code that can be used in multiple places -
> what about the simplification code used by mapply and tapply? Why are
> there three different implementations of simplification?
>
> Hadley

I have now looked into using a version of what Hadley had proposed. Note (to Bill's point) that the current implementation of sapply() does go via lapply(), and that we have vapply() as a faster version of sapply() with less copying (hopefully).

Very unfortunately, vapply(), which was only created 13 months ago, has inherited the ``illogical'' behavior of sapply(), in that it does not make up higher-rank arrays if each single element is already a matrix (say). ... Consequently, we also need a patch to vapply(), and I do wonder if we should not make "ARRAY = TRUE" the default there, since with vapply() you specify a result value, and if you specify a matrix, the total result should stack these matrices into an array of rank 3, etc.

Looking at it, the patch is not so much work... notably if we don't use a new argument but really let FUN.VALUE determine what the result should look like.

More comments are still welcome...

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
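[Editor's note: a sketch of the vapply() behavior Martin argues for, namely letting FUN.VALUE determine the shape of the result. In current versions of R this is exactly what vapply() does: a matrix-valued FUN.VALUE makes the results stack into a rank-3 array rather than flatten into columns.]

```r
## Each call returns a 2 x 2 matrix; FUN.VALUE declares that shape.
res <- vapply(1:4,
              function(i) diag(2) * i,
              FUN.VALUE = matrix(0, 2, 2))

dim(res)      # 2 2 4 : one 2x2 slice per element, stacked into an array
res[, , 3]    # the third slice, diag(2) * 3
```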
Re: [Rd] RFC: sapply() limitation from vector to matrix, but not further
On Wed, Dec 1, 2010 at 3:39 AM, Martin Maechler wrote:
> My proposal -- implemented and "make check" tested --
> is to add an optional argument 'ARRAY'
> which allows
>
>> sapply(v, myF, y = 2*(1:5), ARRAY=TRUE)

It would reduce the proliferation of arguments if the simplify= argument were extended to allow this, e.g. simplify = "array", or perhaps simplify = n would allow a maximum of n dimensions.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
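[Editor's note: the interface suggested here is the one R ultimately adopted; in current versions of R, sapply() accepts simplify = "array", stacking matrix-valued results into a higher-rank array instead of flattening them into columns. A small demo:]

```r
f <- function(i) matrix(i, 2, 3)   # each call returns a 2 x 3 matrix

s1 <- sapply(1:4, f)                       # default: flattened, 6 x 4 matrix
s2 <- sapply(1:4, f, simplify = "array")   # stacked: 2 x 3 x 4 array

dim(s1)   # 6 4
dim(s2)   # 2 3 4
```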
[Rd] rJava question
After some trial and error I figured out how to pass matrices from R to Java and back using rJava, but this method is not documented and I wonder if there is a better way? Anyway, here is what I found works:

(m <- matrix(as.double(1:12), 3, 4))
[shows m as you would expect]

jtest <- .jnew("JTest")
(v <- .jcall(jtest, '[[D', 'myfunc', .jarray(m), evalArray=FALSE))
[shows v = m + 10]

Here the JTest class has a method named myfunc that accepts a double[][] and returns a double[][]. It simply adds 10 to every element.

The parameter 'evalArray' is confusing, because when evalArray=TRUE the result is NOT evaluated (a list is returned that you then have to apply .jevalArray to in order to get the answer).

There seems to be an option to have a Java reference returned instead of the actual matrix. Can the R side manipulate the matrix (on the Java side) through this reference?

Thanks,
Dominick

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel