Re: [Rd] Suggestion: Dimension-sensitive attributes
Hi,

I agree with Henrik that his suggestion to have "dimension vector attributes" working like dimnames (see below) would be an extremely useful infrastructure addition to R.

If this is not considered for R-core, I am happy to try to implement this in a package, as a new class. And possibly do the same thing for data frames. Should you have any comments, ideas or suggestions about it, please share!

Best,

Enrique

-----
Subject:
From: Henrik Bengtsson

  > x <- array(1:30, dim=c(2,3,5))
  > dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL);
  > dimattr(x, "misc") <- list(1:2, list(x=1:5, y=letters[1:8], z=NA),
  +                            letters[1:5]);
  > y <- x[,1:2,2:3]
  > str(dimnames(y))
  List of 3
   $ : chr [1:2] "a" "b"
   $ : chr [1:2] "a1" "a2"
   $ : NULL
  > str(dimattr(y, "misc"))
  List of 3
   $ : int [1:2] 1 2
   $ :List of 2
    ..$ x: int [1:5] 1 2 3 4 5
    ..$ y: chr [1:8] "a" "b" "c" "d" ...
   $ : chr [1:2] "b" "c"

I can imagine this needs to be added in several places, and functions such as is.vector() need to be updated, etc. It is not a quick migration, but is it something worth considering for the future?

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
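
A minimal user-level sketch of the behaviour in Henrik's example, assuming nothing changes in base R. 'dimattr' and 'dimattr<-' are the names proposed in the thread, not existing functions, and 'subset_dims' is a hypothetical helper that stands in for a modified "[" (indices are passed as a list, with TRUE meaning "the whole dimension"):

```r
# Hypothetical accessors: 'dimattr' is only a proposed name, not base R.
`dimattr<-` <- function(x, name, value) {
  # one list element per array dimension, as with dimnames
  stopifnot(is.list(value), length(value) == length(dim(x)))
  attr(x, name) <- value
  x
}
dimattr <- function(x, name) attr(x, name)

# Hypothetical helper: subset the array, then apply the same per-dimension
# indices to each named dimension attribute (plain "[" would drop them).
subset_dims <- function(x, idx, attrs = character()) {
  stopifnot(is.list(idx), length(idx) == length(dim(x)))
  y <- do.call(`[`, c(list(x), idx, list(drop = FALSE)))
  for (nm in attrs) {
    attr(y, nm) <- Map(function(a, i) a[i], attr(x, nm), idx)
  }
  y
}

x <- array(1:30, dim = c(2, 3, 5))
dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL)
dimattr(x, "misc") <- list(1:2, list(x = 1:5, y = letters[1:8], z = NA),
                           letters[1:5])

y <- subset_dims(x, list(TRUE, 1:2, 2:3), attrs = "misc")
str(dimattr(y, "misc"))
```

The helper keeps drop = FALSE so that the number of dimensions, and hence the length of the attribute list, stays fixed.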
Re: [Rd] Suggestion: Dimension-sensitive attributes
I've also had several use cases where I needed "cell-like" attributes, that is, attributes that have the same dimensions as the original array and are subsetted in the same way, along all its dimensions.

So we're talking about a way to add metadata to matrices/arrays at 3 possible levels:

  1) at the "whole object" level: attributes that are not dropped on subsetting
  2) at the "dimension" level: attributes that behave like "dimnames", i.e. subsetted along each dimension
  3) at the "cell" level: attributes that are subsetted in the same way as the original array

My proposal would be simpler than Tony's suggestion: like "dimnames", just have reserved attribute names for each case, say "objdata", "dimdata", and "celldata" (or "objattr", "dimattr" and "cellattr").

On the other hand, Tony's pattern would allow as many attributes of each type as necessary (some multiplicity is already possible with the simpler design, as dimdata or celldata could be lists of lists), at the cost of a more complex scheme of attributes that needs to be "parsed" each time.

On Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like" (and possibly "attr.cell.like") could be kept on a single list with 3 elements, something like:

  > attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)

Would something like this make sense for R-core (either for standard arrays or as a new class), or would it be better implemented in a package?

Enrique

-----Original Message-----
From: Tony Plate [mailto:tpl...@acm.org]
Sent: Wednesday, 08 July 2009 18:01
To: r-devel@r-project.org
Cc: Bengoechea Bartolomé Enrique (SIES 73); Henrik Bengtsson
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

There have been times when I've thought this could be useful too. One way to go about it could be to introduce a special attribute that controls how attributes are dealt with in subsetting, e.g., "attr.dimname.like".
The contents of this would be character data; on subsetting, any attribute that had a name appearing in this vector would be treated as a dimension.

At the same time, it might be nice to also introduce "attr.keep.on.subset", which would specify which attributes should be kept on the result of a subsetting operation (could be useful for attributes that specify units).

This of course could be a way of implementing Henrik's suggestion: dimattr(x, "misc") <- value would add "misc" to the "attr.dimname.like" attribute and also set the attribute "misc".

The tricky part would be modifying the "[" methods. However, the most useful would probably be the one for ordinary matrices and arrays, and others could be modified when and if their maintainers see the need.

-- Tony Plate

Bengoechea Bartolomé Enrique (SIES 73) wrote:
> Hi,
>
> I agree with Henrik that his suggestion to have "dimension vector attributes"
> working like dimnames (see below) would be an extremely useful infrastructure
> addition to R.
>
> If this is not considered for R-core, I am happy to try to implement this in
> a package, as a new class. And possibly do the same thing for data frames.
> Should you have any comments, ideas or suggestions about it, please share!
>
> Best,
>
> Enrique
>
> -----
> Subject:
> From: Henrik Bengtsson
> Date: Sun, 07 Jun 2009 14:42:08 -0700
>
> Hi,
>
> maybe this has been suggested before, but would it be possible, without
> breaking too much existing code, to add other "dimension vector attributes"
> in addition to 'dimnames'? These attributes would then be subsetted just like
> dimnames.
>
> Something like this:
>
>> x <- array(1:30, dim=c(2,3,5))
>> dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL);
>> dimattr(x, "misc") <- list(1:2, list(x=1:5, y=letters[1:8], z=NA),
>>                            letters[1:5]);
>> y <- x[,1:2,2:3]
>> str(dimnames(y))
> List of 3
>  $ : chr [1:2] "a" "b"
>  $ : chr [1:2] "a1" "a2"
>  $ : NULL
>> str(dimattr(y, "misc"))
> List of 3
>  $ : int [1:2] 1 2
>  $ :List of 2
>   ..$ x: int [1:5] 1 2 3 4 5
>   ..$ y: chr [1:8] "a" "b" "c" "d" ...
>  $ : chr [1:2] "b" "c"
>
> I can imagine this needs to be added in several places, and functions such as
> is.vector() need to be updated, etc. It is not a quick migration, but is it
> something worth considering for the future?
>
> /Henrik
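
A minimal sketch of the object-level half of Tony's idea, done with an explicit helper instead of a modified "[". "attr.keep.on.subset" is the attribute name floated in this thread and has no special meaning in base R; 'subset_keep' is a hypothetical name:

```r
# Hypothetical helper: subset with plain "[", then restore every attribute
# whose name is listed in "attr.keep.on.subset" (base R ignores that name).
subset_keep <- function(x, ...) {
  y <- x[...]
  keep <- attr(x, "attr.keep.on.subset")
  for (nm in keep) {
    attr(y, nm) <- attr(x, nm)
  }
  # carry the bookkeeping attribute along so that later subsets still work
  attr(y, "attr.keep.on.subset") <- keep
  y
}

m <- matrix(1:6, nrow = 2)
attr(m, "units") <- "cm"
attr(m, "attr.keep.on.subset") <- "units"

plain <- m[1, 2:3]               # plain "[" drops "units"
kept  <- subset_keep(m, 1, 2:3)  # the helper restores it
attr(plain, "units")
attr(kept, "units")
```

The dimension-level half ("attr.dimname.like") would additionally need the per-dimension indices, which is why Tony describes modifying "[" as the tricky part.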
Re: [Rd] Suggestion: Dimension-sensitive attributes
> If "objattr", "dimattr" and "cellattr" are lists, they would offer safe
> places for all attributes that should be kept on subsetting.

My proposed design would be that:

  * "objattr" would be a list of attributes (just preserved on subsetting)
  * "dimattr" would be a list with as many elements as array dimensions. Each element can be any object whose length matches the corresponding array dimension's length and that can itself be subsetted with "[": so it could be a vector, a list, a data frame...
  * "cellattr" would be any object whose dimensions match the array dimensions: another array, a data frame...

> In my view this would be very useful, because that way a general solution for
> data description, like variable names, variable labels, units, ... could be
> reached.

Indeed, that's the objective: attaching user-defined metadata that is automatically synchronized with subsetting operations to the actual data. I've had dozens of use cases in my own R programs that needed this type of pattern, and seen it implemented in different ways in several classes (xts, timeSeries, AnnotatedDataFrame, etc.). As you point out, this could offer a unified design for a common need.

Enrique

-----Original Message-----
From: Heinz Tuechler [mailto:tuech...@gmx.at]
Sent: Thursday, 09 July 2009 10:56
To: Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; r-devel@r-project.org
Cc: Henrik Bengtsson
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

At 10:01 09.07.2009, SIES 73 wrote:
>I've also had several use cases where I needed "cell-like" attributes,
>that is, attributes that have the same dimensions as the original array
>and are subsetted in the same way, along all its dimensions.
>
>So we're talking about a way to add metadata to matrices/arrays at 3
>possible levels:
>
>  1) at the "whole object" level: attributes that are not dropped on subsetting
>  2) at the "dimension" level: attributes that behave like "dimnames", i.e. subsetted along each dimension
>  3) at the "cell" level: attributes that are subsetted in the same way as the original array
>
>My proposal would be simpler than Tony's suggestion: like "dimnames",
>just have reserved attribute names for each case, say "objdata",
>"dimdata", and "celldata" (or "objattr", "dimattr" and "cellattr").

If "objattr", "dimattr" and "cellattr" are lists, they would offer safe places for all attributes that should be kept on subsetting. In my view this would be very useful, because that way a general solution for data description, like variable names, variable labels, units, ... could be reached.

>On the other hand, Tony's pattern would allow as many attributes of
>each type as necessary (some multiplicity is already possible with the
>simpler design, as dimdata or celldata could be lists of lists), at the
>cost of a more complex scheme of attributes that needs to be "parsed"
>each time.
>
>On Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like"
>(and possibly "attr.cell.like") could be kept on a single list with 3
>elements, something like:
>
>  > attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)
>
>Would something like this make sense for R-core (either for standard
>arrays or as a new class), or would it be better implemented in a
>package?
>
>Enrique
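
The three-level design described in this message could be checked with something like the sketch below. It assumes the reserved names "objattr", "dimattr" and "cellattr" from the thread, stored as ordinary attributes. 'check_metadata' is a hypothetical name, and allowing NULL for a dimension without metadata is an assumption modelled on dimnames:

```r
# Hypothetical validator: returns TRUE or a message describing the problem.
check_metadata <- function(x) {
  d  <- dim(x)
  oa <- attr(x, "objattr")
  da <- attr(x, "dimattr")
  ca <- attr(x, "cellattr")

  if (!is.null(oa) && !is.list(oa))
    return("objattr must be a list")

  if (!is.null(da)) {
    if (!is.list(da) || length(da) != length(d))
      return("dimattr needs one element per dimension")
    # each element must be as long as its dimension; NULL means "no metadata"
    n  <- vapply(da, length, integer(1))
    ok <- vapply(da, is.null, logical(1)) | n == d
    if (!all(ok))
      return("dimattr element lengths must match dim(x)")
  }

  if (!is.null(ca) && !identical(dim(ca), d))
    return("cellattr must have the same dimensions as x")

  TRUE
}

x <- matrix(1:6, nrow = 2)
attr(x, "objattr")  <- list(units = "EUR")
attr(x, "dimattr")  <- list(c("r1", "r2"), list(1, "b", NA))
attr(x, "cellattr") <- matrix(FALSE, nrow = 2, ncol = 3)
check_metadata(x)
```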
Re: [Rd] Suggestion: Dimension-sensitive attributes
Very good points. They closely match the current prototype I have written...

> Starting by working on an interface for such object(s) is probably the first
> step toward a unified solution

Agreed. Getting a good API is always the most important step.

> Dimension-level is what seems to be the most needed...

True, and that was Henrik's original suggestion. But I find all three are closely related to the same topic (metadata) and as such deserve to be worked out together; if most people agree otherwise, though, the direction is clear.

> - Object-level, if not linked to any dimension-attribute is just saying that
> one wants to attach anything to any object. That's what attr() is already
> doing.

Except that plain attributes are dropped when subsetting. I've found myself dozens of times creating classes just to create a `[` method for them that preserves some attributes. This looks like such a common situation that having a mechanism to avoid the user programming the same stuff again and again would be handy.

> - Cell-level, is maybe out-of-scope for one first trial (but maybe I missed
> the use-cases for it)

Although I agree that cell-level is far less common, here are a couple of use cases I've hit recently:

  1) the array represents time series in columns. The original data comes in a different frequency for each column, with some data missing. After aligning to a common frequency and interpolating missing values, I needed a factor array of the same dimension as the data array identifying whether each observation corresponded to the actual original series or had been interpolated, and whether interpolation was due to missing data or to frequency alignment. Of course, I needed the factor array to be subsetted together with the array.

  2) the array is a table representing data to be formatted by a reporting system (Sweave, R2HTML, etc.), similar to the 'xtable' class. So I needed to associate formatting information with each individual "cell" (font, color, borders...), as well as with each dimension and with the whole table.

Anyway, it's far easier to add "cell-level" metadata on top of the other features with a new class: for `[` subscripting just call NextMethod() and then apply the same indexes to the object storing the cell-level metadata. But I still think it's useful to work out a data object's metadata at all possible levels with a unified interface.

About the subscripting `[` methods, I don't see the need to modify `[<-` for arrays, as out-of-bound indexes generate errors with arrays (unlike vectors or data frames), so `[<-` would only replace data and leave metadata untouched. Am I missing something?

> maybe a function called "dimmeta()" (for consistency with "dimnames()")?

I'm using 'dimdata' in my current prototype, and Henrik suggested 'dimattr', but I really like your proposal more. Wrappers to the first two elements of 'dimmeta' for 2-dim arrays could be added in the same vein as 'rownames' and 'colnames': 'rowmeta' and 'colmeta'.

> The signature could be dimmeta(x, i), with x the object,

For consistency with 'dimnames', the 'i' argument could be dropped, using dimmeta(x)[[i]] instead...

Other standard generics to be affected would be:

  * rbind & cbind for 2-dim arrays/matrices: they should combine the metadata, and for dimension-sensitive metadata this can be modelled upon what is done with dimnames: use the rowmeta (colmeta) of the first object that has them in cbind (rbind), and combine the colmeta (rowmeta) of all objects that have them, filling with NAs/NULLs/... for objects without metadata being combined. An issue of coercing dimmeta of different classes may arise.

  * `dim<-`, but this may raise the same problem of coercing dimmeta of different classes.

...and I agree with the rest of your comments.
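
The NextMethod() approach for cell-level metadata mentioned above could look roughly like this. It is only a sketch under stated assumptions: the class name 'cellarray' and its constructor are hypothetical, the metadata is stored in an attribute called "cellattr" (a name from this thread), and drop defaults to FALSE so data and metadata keep the same shape:

```r
# Hypothetical class: a matrix carrying a same-shaped "cellattr" attribute.
cellarray <- function(data, cellattr) {
  stopifnot(is.matrix(data), identical(dim(data), dim(cellattr)))
  structure(data, cellattr = cellattr, class = "cellarray")
}

`[.cellarray` <- function(x, ..., drop = FALSE) {
  # subset the data with the default method; this drops class and cellattr
  y  <- NextMethod(drop = drop)
  # apply exactly the same indexes to the cell-level metadata
  ca <- attr(x, "cellattr")[..., drop = drop]
  structure(y, cellattr = ca, class = "cellarray")
}

# Use case 1 from the message: flag which observations were interpolated.
vals  <- matrix(c(1.0, 1.5, 2.0, 10, 11, 12), nrow = 3)
flags <- matrix(c("original", "interpolated", "original",
                  "original", "original", "aligned"), nrow = 3)
x <- cellarray(vals, flags)

y <- x[2:3, ]
attr(y, "cellattr")
```

Passing drop explicitly to both calls is a design choice of the sketch: otherwise the default "[" could drop dimensions on the data while the metadata kept them.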
Best,

Enrique

-----Original Message-----
From: Laurent Gautier [mailto:lgaut...@gmail.com]
Sent: Thursday, 09 July 2009 14:15
Cc: Heinz Tuechler; Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; Henrik Bengtsson; r-devel@r-project.org
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

Starting by working on an interface for such object(s) is probably the first step toward a unified solution, and this before thinking about if and how R attributes are used. It would also help to ensure a smooth transition from the existing classes implementing a similar solution (first the interface is added to those classes, then after a grace period the classes are eventually refactored).

Dimension-level is what seems to be the most needed... but I am not convinced of the practicality of the object-level and cell-level schemes proposed:

- Object-level, if not linked to any dimension-attribute is just saying that one wants to
Re: [Rd] Suggestion: Dimension-sensitive attributes
Forgot to answer this one:

> It would seem natural that metadata associated with one dimension
> would be a table-like object

Right. A data frame has the problem that for most use cases one would want each dimension length to match the *rows* of the data frame instead of the columns, but it is the columns that we would get "for free" when allowing "dimmeta" elements to be lists...

Enrique

-----Original Message-----
From: Laurent Gautier [mailto:lgaut...@gmail.com]
Sent: Thursday, 09 July 2009 14:15
Cc: Heinz Tuechler; Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; Henrik Bengtsson; r-devel@r-project.org
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

Starting by working on an interface for such object(s) is probably the first step toward a unified solution, and this before thinking about if and how R attributes are used. It would also help to ensure a smooth transition from the existing classes implementing a similar solution (first the interface is added to those classes, then after a grace period the classes are eventually refactored).

Dimension-level is what seems to be the most needed... but I am not convinced of the practicality of the object-level and cell-level schemes proposed:

- Object-level, if not linked to any dimension-attribute is just saying that one wants to attach anything to any object. That's what attr() is already doing.
- Cell-level, is maybe out-of-scope for one first trial (but maybe I missed the use-cases for it)

If starting with behaviour, it seems to boil down to having "["/"[<-" and "dimmeta()"/"dimmeta<-()":

- extract "[" / replace "[<-":
    * keeps working the way it already does
    * extracts a subset of the object as well as a subset of the dimension-associated metadata.
    * departing too much from the way "[" is working and adding behind-the-curtain name matching will only compromise the chances of adoption.
    * forget about the bit about which metadata is kept and which one isn't when using "[". Make a function "unmeta()" (similar behavior to "unname()") to drop them all, or work it out with something like
        > dimmeta(x, 1) <- NULL # drop the metadata associated with dimension 1

- access the dimension-associated metadata:
    * maybe a function called "dimmeta()" (for consistency with "dimnames()")? The signature could be dimmeta(x, i), with x the object, and i the dimension requested. A replace function "dimmeta<-"(x, i, value) would be provided.

In the abstract the "names" associated with a given dimension is just one of possible metadata, but I'd keep away from meddling with it for a start.

It would seem natural that metadata associated with one dimension would be a table-like object (data.frame seems natural in R, and unfortunately there is no data.frame-like structure in R).

L.
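
The interface Laurent outlines could be sketched as below. None of these functions exist in base R: 'dimmeta', 'dimmeta<-' and 'unmeta' are the names proposed in the thread, and storing the metadata in a "dimmeta" attribute with one slot per dimension is an assumption of the sketch. NROW() is used so that a data frame is matched by its rows, the case Enrique raises above:

```r
# Hypothetical accessor: metadata for dimension i, or NULL if none is set.
dimmeta <- function(x, i) {
  meta <- attr(x, "dimmeta")
  if (is.null(meta)) NULL else meta[[i]]
}

# Hypothetical replacement function, used as: dimmeta(x, i) <- value
`dimmeta<-` <- function(x, i, value) {
  meta <- attr(x, "dimmeta")
  if (is.null(meta)) meta <- vector("list", length(dim(x)))
  # rows for table-like objects, length for plain vectors
  if (!is.null(value)) stopifnot(NROW(value) == dim(x)[i])
  meta[i] <- list(value)  # list(NULL) clears the slot but keeps its position
  attr(x, "dimmeta") <- meta
  x
}

# Hypothetical analogue of unname(): drop all dimension metadata.
unmeta <- function(x) {
  attr(x, "dimmeta") <- NULL
  x
}

x <- matrix(1:6, nrow = 2)
dimmeta(x, 2) <- data.frame(label = c("low", "mid", "high"), unit = "mg")
dimmeta(x, 2)$label

dimmeta(x, 2) <- NULL  # drop the metadata associated with dimension 2
plain <- unmeta(x)     # drop everything
```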
Re: [Rd] Suggestion: Dimension-sensitive attributes
> In the case the metadata are stored in a list, that interface enforces the
> building of a list.
> (I said to ignore implementation for now, but paradoxically this made me
> consider possible implementations).

Creating the list on the fly if it's not stored internally as a list should be cheap. For example, this is done with data frames, which store "dimnames" in two separate attributes, "names" and "row.names".
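
The data frame case can be checked directly in base R; the small example below only inspects an ordinary data frame and introduces nothing new:

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

# The stored attributes include "names" and "row.names", but no "dimnames"
stored <- names(attributes(df))
stored

# dimnames() assembles the two-element list on request
dn <- dimnames(df)
dn
```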
Re: [Rd] extract function "[" and empty index
Beware: x[TRUE] returns the same as x[] only if x is NOT a zero-length vector (at least until R 2.5.1):

  > numeric(0)[]
  numeric(0)
  > numeric(0)[TRUE]
  [1] NA

Enrique

> -----Original Message-----
> Message: 7
> Date: Mon, 10 Mar 2008 02:29:32 +0800
> From: "Laurent Gautier" <[EMAIL PROTECTED]>
> Subject: Re: [Rd] extract function "[" and empty index
> To: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> Cc: r-devel@r-project.org
>
> Thanks, I was forgetting the recycling rule.
>
> L.
>
> 2008/3/9, Gabor Grothendieck <[EMAIL PROTECTED]>:
> > Use TRUE.
> >
> > On Sun, Mar 9, 2008 at 5:05 AM, Laurent Gautier <[EMAIL PROTECTED]> wrote:
> > > Dear list,
> > >
> > > I am having a question regarding the extract function "[".
> > >
> > > The man page says that one usage with k-dimensional arrays is to
> > > specify k indices to "[", with an empty index indicating that all
> > > entries in that dimension are selected.
> > >
> > > The question is the following: is there an R object qualifying as an
> > > "empty index"? I understand that the lazy evaluation of parameters
> > > allows one to have genuinely missing parameters, but I would like to
> > > have an object instead. I understand that one can always have an
> > > if/else workaround, but I thought I should ask, just in case.
> > >
> > > I tried with NULL but with little success, as it appears to give the
> > > same results as an empty vector.
> > >
> > > > m = matrix(1, 2, 2)
> > > > m[1, NULL]
> > > numeric(0)
> > > > m[1, integer(0)]
> > > numeric(0)
> > >
> > > Since I was at it, I noted that the result obtained with "numeric(0)"
> > > definitely makes sense but could as well be seen as challenging the
> > > concept of an empty index presented in the man page (one could somehow
> > > expect the presence of an object meaning "everything" rather than
> > > "missingness" meaning it).
> > >
> > > Thanks,
> > >
> > > Laurent
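
For the non-empty case, Gabor's suggestion can be tried out as follows. 'pick_row' is a hypothetical helper, written only to show TRUE being passed around as an "everything" index object; the zero-length caveat above is deliberately left out of the example:

```r
m <- matrix(1:4, 2, 2)

# TRUE is recycled along the dimension, so it selects every column
same_as_empty <- identical(m[1, TRUE], m[1, ])

# NULL behaves like a zero-length index and selects nothing
n_selected <- length(m[1, NULL])

# Hypothetical helper: 'cols' defaults to TRUE, i.e. "all columns",
# so no if/else on a missing argument is needed
pick_row <- function(m, row, cols = TRUE) m[row, cols]

pick_row(m, 2)
pick_row(m, 2, cols = 1)
```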
Re: [Rd] Editing the "..." argument
The 'modifyList' function in package 'utils' allows you to do that in a very compact way:

  do.call("optim", modifyList(list(), list(...)))

Regards,

Enrique

Mathieu Ribatet wrote:
> Dear all,
>
> I'd like to tweak the ... arguments that a user can pass to my
> function for fitting a model. More precisely, my objective function is
> (really) problematic to optimize using the "optim" function.
> Consequently, I'd like to add to the "control" argument of the latter
> function a "ndeps = rep(something, #par)" and/or "parscale = something"
> if the user has not specified it already.
>
> Do you know a way to deal with this point?
> In advance, thanks.
>
> Mathieu
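
A slightly fuller sketch of the same idea, applied to the "control" list Mathieu asks about. The wrapper name 'fit_model', the test function 'sq' and the default values are all illustrative assumptions; only utils::modifyList() and stats::optim() are real functions. Defaults go first in modifyList(), so anything the user supplies overrides them:

```r
# Hypothetical wrapper: fill in 'parscale' and 'ndeps' in optim's control
# list unless the user already supplied them through '...'.
fit_model <- function(par, fn, ...) {
  args <- list(...)
  defaults <- list(parscale = rep(1, length(par)),     # illustrative values
                   ndeps    = rep(1e-4, length(par)))
  user_control <- if (is.null(args[["control"]])) list() else args[["control"]]
  # entries in user_control replace the defaults with the same name
  args[["control"]] <- utils::modifyList(defaults, user_control)
  do.call(stats::optim, c(list(par = par, fn = fn), args))
}

# Simple quadratic with its minimum at c(1, 2)
sq <- function(p) sum((p - c(1, 2))^2)

fit <- fit_model(c(0, 0), sq, method = "BFGS",
                 control = list(ndeps = c(1e-3, 1e-3)))
fit$par
```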
Re: [Rd] Embeding R
Hi,

You may find the biocep-R project very useful: it embeds R into Java using JRI. And it goes much further than that, providing an impressive framework from which you can start with a lot of work already done. It's open source, so you can also just have a look at the code for inspiration:

http://biocep-distrib.r-forge.r-project.org/

It hasn't been released yet (no documentation) but it's really worth exploring.

Best,

Enrique

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of r-devel-requ...@r-project.org
Sent: Wednesday, 21 January 2009 12:00
To: r-devel@r-project.org
Subject: R-devel Digest, Vol 71, Issue 19

Message: 8
Date: Tue, 20 Jan 2009 16:37:11 +0100
From: "Sylvain Loiseau"
Subject: [Rd] Embeding R
To: r-devel@r-project.org

Hi,

I'm planning to embed R into an application, with the following context:

- This application is written in Java (and managed with Maven). I plan to access R using JRI.
- This application must be installable on several platforms (Linux, Mac OS, Windows).
- The R engine must embed libraries, some of them having native code in C or Fortran.

Does this sound reasonable? I would be very grateful to everyone providing links, references, feedback or advice on this question.

Best regards,
Sylvain

--
Sylvain Loiseau
slois...@ens-lsh.fr