Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-08 Thread SIES 73
Hi,

I agree with Henrik that his suggestion to have "dimension vector attributes" 
working like dimnames (see below) would be an extremely useful infrastructure 
addition to R.

If this is not considered for R-core, I am happy to try to implement this in a 
package, as a new class. And possibly do the same thing for data frames. Should 
you have any comments, ideas or suggestions about it, please share!

Best,

Enrique

-----
From: Henrik Bengtsson

> x <- array(1:30, dim=c(2,3,5)) 
> dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL); 
> dimattr(x, "misc") <- list(1:2, list(x=1:5, y=letters[1:8], z=NA), 
> letters[1:5]); 


> y <- x[,1:2,2:3] 
> str(dimnames(y)) 

List of 3 

 $ : chr [1:2] "a" "b"
 $ : chr [1:2] "a1" "a2"
 $ : NULL


> str(dimattr(y, "misc")) 

List of 3 
 $ : int [1:2] 1 2 
 $ :List of 2 
  ..$ x: int [1:5] 1 2 3 4 5 
  ..$ y: chr [1:8] "a" "b" "c" "d" ... 
 $ : chr [1:2] "b" "c" 

I can imagine this needs to be added in several places, and functions such as 
is.vector() need to be updated etc. It is not a quick migration, but is it 
something worth considering for the future? 
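
As a rough illustration of the behaviour proposed above, here is a minimal sketch in plain R. The names `dimattr`, `dimattr<-` and `subset_with_dimattr` are hypothetical (nothing like them exists in base R), and an explicit helper stands in for a modified "[" so that the example stays short. TRUE is used as the "keep everything" index.

```r
# Hypothetical helpers: none of these names exist in base R.
dimattr <- function(x, name) attr(x, "dimattr", exact = TRUE)[[name]]

`dimattr<-` <- function(x, name, value) {
  # one element per array dimension, like dimnames
  stopifnot(is.list(value), length(value) == length(dim(x)))
  meta <- attr(x, "dimattr", exact = TRUE)
  if (is.null(meta)) meta <- list()
  meta[[name]] <- value
  attr(x, "dimattr") <- meta
  x
}

# Subset the data and every dimension attribute with the same indices.
# 'idx' holds one index per dimension; TRUE means "keep everything".
subset_with_dimattr <- function(x, idx) {
  stopifnot(length(idx) == length(dim(x)))
  out <- do.call(`[`, c(list(x), idx, list(drop = FALSE)))
  attr(out, "dimattr") <- lapply(
    attr(x, "dimattr", exact = TRUE),
    function(per_dim) {
      Map(function(v, i) if (is.null(v)) NULL else v[i], per_dim, idx)
    }
  )
  out
}

x <- array(1:30, dim = c(2, 3, 5))
dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL)
dimattr(x, "misc") <- list(1:2, list(x = 1:5, y = letters[1:8], z = NA),
                           letters[1:5])

y <- subset_with_dimattr(x, list(TRUE, 1:2, 2:3))
str(dimattr(y, "misc"))
```

A real implementation would hook into "[" itself and would have to deal with drop = TRUE, negative indices and character indices, all of which this sketch ignores.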

/Henrik 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
I've also had several use cases where I needed "cell-like" attributes, that is, 
attributes that have the same dimensions as the original array and are 
subsetted in the same way --along all its dimensions.

So we're talking about a way to add metadata to matrices/arrays at 3 possible 
levels:

1) at the "whole object" level: attributes that are not dropped on 
subsetting 
2) at the "dimension" level: attributes that behave like "dimnames", 
i.e. subsetted along each dimension
3) at the "cell" level: attributes that are subsetted in the same way 
as the original array

My proposal would be simpler than Tony's suggestion: like "dimnames", just have 
reserved attribute names for each case, say "objdata", "dimdata", and 
"celldata" (or "objattr", "dimattr" and "cellattr").

On the other hand, Tony's pattern would allow as many attributes of each type 
as necessary (some multiplicity is already possible with the simpler design as 
dimdata or celldata could be lists of lists), at the cost of a more complex 
scheme of attributes that needs to be "parsed" each time.

With Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like" (and 
possibly "attr.cell.like") could be kept in a single list with 3 elements, 
something like:

> attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)

Would something like this make sense for R-core --either for standard arrays or 
as a new class-- or would it be better implemented in a package?

Enrique

-----Original Message-----
From: Tony Plate [mailto:tpl...@acm.org] 
Sent: Wednesday, 8 July 2009 18:01
To: r-devel@r-project.org
Cc: Bengoechea Bartolomé Enrique (SIES 73); Henrik Bengtsson
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

There have been times when I've thought this could be useful too.

One way to go about it could be to introduce a special attribute that controls 
how attributes are dealt with in subsetting, e.g., "attr.dimname.like".  The 
contents of this would be character data; on subsetting, any attribute that had 
a name appearing in this vector would be treated as a dimension.  At the same 
time, it might be nice to also introduce "attr.keep.on.subset", which would 
specify which attributes should be kept on the result of a subsetting operation 
(could be useful for attributes that specify units).  This of course could be a 
way of implementing Henrik's suggestion: dimattr(x, "misc") <- value would add 
"misc" to the "attr.dimname.like" attribute and also set the attribute 
"misc".  The tricky part would be modifying the "[" methods.   However, 
the most useful would probably be the one for ordinary matrices and arrays, and 
others could be modified when and if their maintainers see the need.
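
A minimal sketch of how such control attributes might drive subsetting, for plain matrices only. The helper name `subset_keeping_attrs` is hypothetical, and an explicit function is used here instead of a modified "[" method, so this only illustrates the bookkeeping, not the actual proposal for base R.

```r
# Hypothetical helper; "attr.dimname.like" and "attr.keep.on.subset" are the
# control attributes suggested above, not something base R understands.
subset_keeping_attrs <- function(x, i = TRUE, j = TRUE) {
  dim_like <- attr(x, "attr.dimname.like", exact = TRUE)
  keep     <- attr(x, "attr.keep.on.subset", exact = TRUE)

  out <- x[i, j, drop = FALSE]            # base "[" drops custom attributes

  # attributes that are simply carried over
  for (nm in keep) attr(out, nm) <- attr(x, nm, exact = TRUE)

  # attributes that are subsetted like dimnames: list(row part, column part)
  for (nm in dim_like) {
    a <- attr(x, nm, exact = TRUE)
    attr(out, nm) <- list(a[[1]][i], a[[2]][j])
  }

  attr(out, "attr.dimname.like")   <- dim_like
  attr(out, "attr.keep.on.subset") <- keep
  out
}

m <- matrix(1:6, nrow = 2)
attr(m, "units") <- "kg"
attr(m, "misc")  <- list(c("r1", "r2"), c("c1", "c2", "c3"))
attr(m, "attr.keep.on.subset") <- "units"
attr(m, "attr.dimname.like")   <- "misc"

s <- subset_keeping_attrs(m, j = 2:3)
attr(s, "units")
attr(s, "misc")
```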

-- Tony Plate

Bengoechea Bartolomé Enrique (SIES 73) wrote:
> Hi,
>
> I agree with Henrik that his suggestion to have "dimension vector attributes" 
> working like dimnames (see below) would be an extremely useful infrastructure 
> addition to R.
>
> If this is not considered for R-core, I am happy to try to implement this in 
> a package, as a new class. And possibly do the same thing for data frames. 
> Should you have any comments, ideas or suggestions about it, please share!
>
> Best,
>
> Enrique
>
> -----
> From: Henrik Bengtsson
> Date: Sun, 07 Jun 2009 14:42:08 -0700
>
> Hi,
>
> maybe this has been suggested before, but would it be possible, without 
> breaking too much existing code, to add other "dimension vector attributes" 
> in addition to 'dimnames'? These attributes would then be subsetted just like 
> dimnames. 
>
> Something like this: 
>
> > x <- array(1:30, dim=c(2,3,5))
> > dimnames(x) <- list(c("a", "b"), c("a1", "a2", "a3"), NULL); 
> > dimattr(x, "misc") <- list(1:2, list(x=1:5, y=letters[1:8], z=NA), 
> > letters[1:5]);
>
> > y <- x[,1:2,2:3]
> > str(dimnames(y))
>
> List of 3 
>  $ : chr [1:2] "a" "b"
>  $ : chr [1:2] "a1" "a2"
>  $ : NULL
>
> > str(dimattr(y, "misc")) 
>
> List of 3 
>  $ : int [1:2] 1 2 
>  $ :List of 2 
>   ..$ x: int [1:5] 1 2 3 4 5 
>   ..$ y: chr [1:8] "a" "b" "c" "d" ... 
>  $ : chr [1:2] "b" "c" 
>
> I can imagine this needs to be added in several places, and functions such as 
> is.vector() need to be updated etc. It is not a quick migration, but is it 
> something worth considering for the future? 
>
> /Henrik 
>



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
> If "objattr", "dimattr" and "cellattr" are lists, they would offer safe 
> places for all attributes that should be kept on subsetting. 

My proposed design would be that:

* "objattr" would be a list of attributes (just preserved on subsetting)
* "dimattr" would be a list with as many elements as array dimensions. 
Each element can be any object whose length matches the corresponding array 
dimension's length and that can be itself subsetted with "[": so it could be a 
vector, a list, a data frame...
* "cellattr" would be any object whose dimensions match the array 
dimensions: another array, a data frame...
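
To make the three constraints above concrete, here is a small validity check written as a sketch. `check_meta` is a hypothetical name, and the attribute names follow the proposal rather than anything base R defines.

```r
# Hypothetical validator for the three reserved attributes described above.
check_meta <- function(x) {
  d        <- dim(x)
  objattr  <- attr(x, "objattr",  exact = TRUE)
  dimattr  <- attr(x, "dimattr",  exact = TRUE)
  cellattr <- attr(x, "cellattr", exact = TRUE)

  # "objattr": any list of attributes
  if (!is.null(objattr)) stopifnot(is.list(objattr))

  # "dimattr": one element per dimension, each matching that dimension's length
  if (!is.null(dimattr)) {
    stopifnot(is.list(dimattr), length(dimattr) == length(d))
    for (k in seq_along(d)) {
      el <- dimattr[[k]]
      if (!is.null(el)) stopifnot(length(el) == d[k])
    }
  }

  # "cellattr": same dimensions as the data
  if (!is.null(cellattr)) stopifnot(identical(dim(cellattr), d))

  TRUE
}

x <- matrix(1:6, nrow = 2, ncol = 3)
attr(x, "objattr")  <- list(units = "kg")
attr(x, "dimattr")  <- list(c("lo", "hi"), c(0.1, 0.2, 0.3))
attr(x, "cellattr") <- matrix(FALSE, nrow = 2, ncol = 3)
check_meta(x)

bad <- x
attr(bad, "dimattr") <- list(1:2, 1:2)   # second element too short
bad_result <- try(check_meta(bad), silent = TRUE)
```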

> In my view this would be very useful, because that way a general solution for 
> data description, like variable names, variable labels, units, ... could be 
> reached.

Indeed, that's the objective: attaching user-defined metadata that is 
automatically synchronized with subsetting operations to the actual data.

I've had dozens of use cases in my own R programs that needed this type of 
pattern, and seen it implemented in different ways in several classes (xts, 
timeSeries, AnnotatedDataFrame, etc.). As you point out, this could offer a 
unified design for a common need.

Enrique

-----Original Message-----
From: Heinz Tuechler [mailto:tuech...@gmx.at] 
Sent: Thursday, 9 July 2009 10:56
To: Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; r-devel@r-project.org
Cc: Henrik Bengtsson
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

At 10:01 09.07.2009, SIES 73 wrote:
>I've also had several use cases where I needed "cell-like" attributes, 
>that is, attributes that have the same dimensions as the original array 
>and are subsetted in the same way --along all its dimensions.
>
>So we're talking about a way to add metadata to matrices/arrays at 3 
>possible levels:
>
> 1) at the "whole object" level: 
> attributes that are not dropped on subsetting
> 2) at the "dimension" level: attributes that behave like 
> "dimnames", i.e. subsetted along each dimension
> 3) at the "cell" level: attributes that are subsetted in the 
> same way as the original array
>
>My proposal would be simpler than Tony's
>suggestion: like "dimnames", just have reserved attribute names for 
>each case, say "objdata", "dimdata", and "celldata" (or "objattr", 
>"dimattr" and "cellattr").

If "objattr", "dimattr" and "cellattr" are lists, they would offer safe places 
for all attributes that should be kept on subsetting. In my view this would be 
very useful, because that way a general solution for data description, like 
variable names, variable labels, units, ... could be reached.


>On the other hand, Tony's pattern would allow as many attributes of 
>each type as necessary (some multiplicity is already possible with the 
>simpler design as dimdata or celldata could be lists of lists), at the 
>cost of a more complex scheme of attributes that needs to be "parsed" 
>each time.
>
>On Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like" 
>(and possibly
>"attr.cell.like") could be kept on a single list with 3 elements, 
>something like:
>
> > attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)
>
>Would something like this make sense for R-core --either for standard 
>arrays or as a new class-- or would it be better implemented in a 
>package?
>
>Enrique
>



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
Very good points. They closely match the current prototype I have written...

> Starting by working on an interface for such object(s) is probably the first 
> step toward a unified solution

Agree. Getting a good API is always the most important step.

> Dimension-level is what seems to be the most needed...

True, and that was Henrik's original suggestion. But I find all three are 
closely related to the same topic (metadata) and as such deserve to be worked 
out together, but if most people agree otherwise, the direction is clear.

> - Object-level, if not linked to any dimension-attribute, is just saying that 
> one wants to attach anything to any object. That's what attr() is already 
> doing.

Except that plain attributes are dropped when subsetting. I've found myself 
dozens of times creating classes just to write a `[` method for them that 
preserves some attributes. This looks like such a common situation that having 
a mechanism that saves the user from programming the same stuff again and 
again would be handy.
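
The kind of boilerplate meant here can be sketched as follows. The class name "keeper" and the "units" attribute are purely illustrative; the method follows the same pattern as base R's own `[.difftime`.

```r
m <- matrix(1:6, nrow = 2)
attr(m, "units") <- "kg"

# Plain subsetting keeps dim (and dimnames) but drops the custom attribute.
sub <- m[, 1:2]
is.null(attr(sub, "units"))

# A class written only to carry "units" through "[".
# "keeper" is a hypothetical class name used for this sketch.
`[.keeper` <- function(x, ..., drop = TRUE) {
  cl <- oldClass(x)
  class(x) <- NULL
  out <- NextMethod("[")                 # default subsetting of the data
  attr(out, "units") <- attr(x, "units") # re-attach what "[" dropped
  class(out) <- cl
  out
}

k <- structure(m, class = "keeper")
kept <- k[, 1:2]
attr(kept, "units")
```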

> - Cell-level is maybe out of scope for a first trial (but maybe I missed 
> the use-cases for it)

Although I agree that cell-level is far less common, here are a couple of use 
cases I've hit recently:

1) the array represents time series in columns. The original data comes in a 
different frequency for each column, with some data missing. After aligning to 
a common frequency and interpolating missing values, I needed a factor array of 
the same dimension as the data array identifying whether each observation 
corresponded to the actual original series or had been interpolated, and 
whether interpolation was due to missing data or to frequency alignment. Of 
course, I needed the factor array to be subsetted together with the array.

2) the array is a table representing data to be formatted by a reporting system 
(Sweave, R2HTML, etc), similar to the 'xtable' class. So I needed to associate 
formatting information with each individual "cell" (font, color, borders...), 
as well as with each dimension and with the whole table.

Anyway, it's far easier to add "cell-level" metadata on top of the other 
features with a new class: for `[` subscripting just call NextMethod() and then 
apply the same indexes to the object storing the cell-level metadata. But I 
still think it's useful to work out data object's metadata at all possible 
levels with a unified interface.
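
A minimal sketch of that approach for 2-dim arrays, using the interpolation flags from the first use case. The class name "cellmatrix" and the attribute name "cellattr" are assumptions made for the example, and a character matrix stands in for the factor array.

```r
# Hypothetical class: data matrix plus a same-shaped "cellattr" matrix.
`[.cellmatrix` <- function(x, i, j, ..., drop = TRUE) {
  cells <- attr(x, "cellattr", exact = TRUE)
  out <- NextMethod("[")                 # subset the data as usual
  if (missing(i)) i <- TRUE              # TRUE selects the whole dimension
  if (missing(j)) j <- TRUE
  attr(out, "cellattr") <- cells[i, j, drop = drop]  # same indexes
  class(out) <- oldClass(x)
  out
}

x <- matrix(c(1.0, 2.5, 2.5, 4.0, 5.0, 5.0), nrow = 3)
flag <- matrix(c("orig", "interp", "interp", "orig", "orig", "interp"),
               nrow = 3)
attr(x, "cellattr") <- flag
class(x) <- "cellmatrix"

y <- x[2:3, ]
attr(y, "cellattr")
```

Only two-index subsetting is handled; single-index forms such as x[5] or x[logical_matrix] would need extra code.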


About the subscripting `[` methods, I don't see the need to modify `[<-` for 
arrays, as out-of-bound indexes generate errors with arrays (unlike vectors or 
data frames), so `[<-` would only replace data and leave metadata untouched. Am 
I missing something? 
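
For reference, the behaviour this argument relies on can be checked in a few lines; the "units" attribute is just a stand-in for any metadata. Only matrix-style indexing is shown here, and vector-style assignment into a matrix is left out of the sketch.

```r
m <- matrix(1:4, nrow = 2)
attr(m, "units") <- "kg"

# Replacement inside the bounds keeps all attributes.
m[1, 1] <- 10L
attr(m, "units")

# Matrix-style replacement outside the bounds is an error for arrays...
res <- try(m[3, 1] <- 0L, silent = TRUE)
inherits(res, "try-error")

# ...whereas a plain vector is silently extended.
v <- 1:3
v[5] <- 9L
length(v)
```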

> maybe a function called "dimmeta()" (for consistency with "dimnames()") ? 

I'm using 'dimdata' in my current prototype, and Henrik suggested 'dimattr', 
but I really like your proposal more. 

Wrappers for the first two elements of 'dimmeta' for 2-dim arrays could be added 
in the same vein as 'rownames' and 'colnames': 'rowmeta' and 'colmeta'.

> The signature could be dimmeta(x, i), with x the object, 

For consistency with 'dimnames', the 'i' argument could be dropped, using 
dimmeta(x)[[i]] instead...
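
A sketch of what such an interface could look like, assuming the metadata is stored in a single list attribute called "dimmeta". All function names here are the ones floated in this thread, not existing R functions.

```r
# dimnames-style accessor: returns the whole list, one element per dimension.
dimmeta <- function(x) attr(x, "dimmeta", exact = TRUE)

`dimmeta<-` <- function(x, value) {
  if (!is.null(value)) {
    stopifnot(is.list(value), length(value) == length(dim(x)))
  }
  attr(x, "dimmeta") <- value
  x
}

# Convenience wrappers for 2-dim arrays, in the vein of rownames/colnames.
rowmeta <- function(x) dimmeta(x)[[1L]]
colmeta <- function(x) dimmeta(x)[[2L]]

`rowmeta<-` <- function(x, value) {
  dm <- dimmeta(x)
  if (is.null(dm)) dm <- vector("list", length(dim(x)))
  dm[1L] <- list(value)                  # list() so that NULL is storable
  dimmeta(x) <- dm
  x
}

m <- matrix(1:6, nrow = 2)
dimmeta(m) <- list(c("ctrl", "trt"), c(10, 20, 30))
rowmeta(m)
dimmeta(m)[[2]]                          # same as colmeta(m)
rowmeta(m) <- c("A", "B")
```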


Other standard generics to be affected would be:

 * rbind & cbind for 2-dim arrays/matrices: they should combine the metadata, 
and for dimension-sensitive metadata can be modelled upon what is done with 
dimnames: use rowmeta (colmeta) of the first object with them in cbind (rbind), 
and combine colmeta (rowmeta) of all objects with them, filling with 
NAs/NULLs/.. for non metadata-sensitive objects being combined. An issue of 
coercing dimmeta of different classes may arise.

 * `dim<-`, but this may raise the same problem of coercing dimmeta of 
different classes.


...and I agree with the rest of your comments.

Best,

Enrique

-----Original Message-----
From: Laurent Gautier [mailto:lgaut...@gmail.com] 
Sent: Thursday, 9 July 2009 14:15
Cc: Heinz Tuechler; Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; Henrik 
Bengtsson; r-devel@r-project.org
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

Starting by working on an interface for such object(s) is probably the first 
step toward a unified solution, and this before thinking about if and how R 
attributes are used.

It would also help to ensure a smooth transition from the existing classes 
implementing a similar solution (first the interface is added to those classes, 
then after a grace period the classes are eventually refactored).

Dimension-level is what seems to be the most needed... but I am not convinced 
of the practicality of the object-level and cell-level schemes proposed:

- Object-level, if not linked to any dimension-attribute, is just saying that 
one wants to

Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-09 Thread SIES 73
Forgot to answer this one:

> It would seem natural that metadata associated with one dimension
> would be a table-like object  

Right. A data frame has the problem that for most use cases one would want 
each dimension's length to match the number of *rows* of the data frame rather 
than the number of columns, but it is the columns that we would get "for free" 
when allowing "dimmeta" elements to be lists...

Enrique
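
The row-versus-column mismatch mentioned above is easy to see in a short sketch. `subset_dim_meta` is a hypothetical helper showing one way a "[" implementation could special-case data frames.

```r
df <- data.frame(label = c("a", "b", "c"), weight = c(1.5, 2, 2.5))

# A data frame is a list of columns, so its length is the number of columns.
length(df)
nrow(df)

# Hypothetical helper: subset per-dimension metadata by rows when it is a
# data frame, and by elements otherwise.
subset_dim_meta <- function(meta, i) {
  if (is.data.frame(meta)) meta[i, , drop = FALSE] else meta[i]
}

subset_dim_meta(df, 2:3)
subset_dim_meta(letters[1:3], 2:3)
```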

-----Original Message-----
From: Laurent Gautier [mailto:lgaut...@gmail.com] 
Sent: Thursday, 9 July 2009 14:15
Cc: Heinz Tuechler; Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; Henrik 
Bengtsson; r-devel@r-project.org
Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes

Starting by working on an interface for such object(s) is probably the first 
step toward a unified solution, and this before thinking about if and how R 
attributes are used.

It would also help to ensure a smooth transition from the existing classes 
implementing a similar solution (first the interface is added to those classes, 
then after a grace period the classes are eventually refactored).

Dimension-level is what seems to be the most needed... but I am not convinced 
of the practicality of the object-level and cell-level schemes proposed:

- Object-level, if not linked to any dimension-attribute, is just saying that 
one wants to attach anything to any object. That's what attr() is already doing.

- Cell-level is maybe out of scope for a first trial (but maybe I missed 
the use-cases for it)



If starting with behaviour, it seems to boil down to having "["/"[<-" and 
"dimmeta()"/"dimmeta<-()":

- extract "[" / replace "[<-" :

   * keeps working the way it already does

   * extracts a subset of the object as well as a subset of the 
dimension-associated metadata.

   * departing too much from the way "[" works and adding 
behind-the-curtain name matching will only compromise the chances of 
adoption.

   * forget about the bit about which metadata is kept and which one 
isn't when using "[". Make a function "unmeta()" (similar behavior to 
"unname()") to drop them all, or work it out with something like
 > dimmeta(x, 1) <- NULL # drop the metadata associated with dimension 1

- access the dimension-associated metadata:

   * maybe a function called "dimmeta()" (for consistency with 
"dimnames()")? The signature could be dimmeta(x, i), with x the object 
and i the dimension requested. A replacement function "dimmeta<-"(x, i, 
value) would be provided.
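
A sketch of this signature, assuming the metadata lives in a list attribute named "dimmeta". The functions are hypothetical, and "unmeta()" is modelled loosely on what "unname()" does for names.

```r
# Accessor with an explicit dimension argument, as suggested above.
dimmeta <- function(x, i) attr(x, "dimmeta", exact = TRUE)[[i]]

`dimmeta<-` <- function(x, i, value) {
  dm <- attr(x, "dimmeta", exact = TRUE)
  if (is.null(dm)) dm <- vector("list", length(dim(x)))
  dm[i] <- list(value)                   # value = NULL clears dimension i
  attr(x, "dimmeta") <- dm
  x
}

# Drop all dimension-associated metadata at once.
unmeta <- function(x) {
  attr(x, "dimmeta") <- NULL
  x
}

x <- matrix(1:6, nrow = 2)
dimmeta(x, 1) <- c("ctrl", "trt")
dimmeta(x, 2) <- c(10, 20, 30)
dimmeta(x, 1) <- NULL                    # drop the metadata of dimension 1
z <- unmeta(x)
```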


In the abstract, the "names" associated with a given dimension are just 
one possible kind of metadata, but I'd keep away from meddling with them for a 
start.


It would seem natural that metadata associated with one dimension
would be a table-like object (data.frame seems natural in R, and 
unfortunately there is no data.frame-like structure in R).



L.



Re: [Rd] Suggestion: Dimension-sensitive attributes

2009-07-10 Thread SIES 73
> In the case the metadata are stored in a list, that interface enforces the 
> building of a list.
> (I said to ignore implementation for now, but paradoxically this made me 
> consider possible implementations).

Creating the list on the fly if it's not stored internally as a list should be 
cheap. For example, this is done with data frames, that store "dimnames" in two 
separate attributes, "names" and "row.names".
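
This can be checked directly: a data frame carries no "dimnames" attribute, yet dimnames() returns a list assembled from "row.names" and "names".

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

attr_names <- names(attributes(df))
attr_names                 # includes "names" and "row.names", not "dimnames"

dn <- dimnames(df)         # built on the fly by the data frame method
dn
```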



Re: [Rd] extract function "[" and empty index

2008-03-10 Thread SIES 73
Beware x[TRUE] returns the same as x[] only if x is NOT a zero-length vector 
(at least until R 2.5.1):

> numeric(0)[]
numeric(0)

> numeric(0)[TRUE]
[1] NA
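
For dimensions of non-zero extent, TRUE can stand in for the empty index because it is recycled over the whole dimension. A small sketch follows; `take` is a hypothetical helper, and the zero-length caveat above still applies.

```r
m <- matrix(1:6, nrow = 2)

# TRUE is recycled along the dimension, so it selects every column.
same <- identical(m[1, TRUE], m[1, ])

# An "everything" default as an ordinary object, without missing arguments.
take <- function(m, rows = TRUE, cols = TRUE) m[rows, cols, drop = FALSE]

take(m, cols = 2:3)
take(m, rows = 1)
```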


Enrique

> -----Original Message-----
> 
> Message: 7
> Date: Mon, 10 Mar 2008 02:29:32 +0800
> From: "Laurent Gautier" <[EMAIL PROTECTED]>
> Subject: Re: [Rd] extract function "[" and empty index
> To: "Gabor Grothendieck" <[EMAIL PROTECTED]>
> Cc: r-devel@r-project.org
> Message-ID:
>   <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> Thanks, I was forgetting the recycling rule.
> 
> 
> L.
> 
> 
> 2008/3/9, Gabor Grothendieck <[EMAIL PROTECTED]>:
> > Use TRUE.
> >
> >
> >  On Sun, Mar 9, 2008 at 5:05 AM, Laurent Gautier <[EMAIL PROTECTED]> wrote:
> >  > Dear list,
> >  >
> >  > I have a question regarding the extract function "[".
> >  >
> >  > The man page says that one usage with k-dimensional arrays is to
> >  > specify k indices to "[", with an empty index indicating that all
> >  > entries in that dimension are selected.
> >  >
> >  > The question is the following: is there an R object qualifying as an
> >  > "empty index"? I understand that the lazy evaluation of parameters
> >  > allows one to have genuinely missing parameters, but I would like to
> >  > have an object instead. I understand that one can always have an
> >  > if/else workaround, but I thought I should ask, just in case.
> >  >
> >  > I tried with NULL but with little success, as it appears to give the
> >  > same results as an empty vector.
> >  > > m = matrix(1, 2, 2)
> >  > > m[1, NULL]
> >  > numeric(0)
> >  > > m[1, integer(0)]
> >  > numeric(0)
> >  >
> >  > Since I was at it, I noted that the result obtained with "numeric(0)"
> >  > definitely makes sense but could as well be seen as challenging the
> >  > concept of an empty index presented in the man page (one could somehow
> >  > expect the presence of an object meaning "everything" rather than
> >  > "missingness" meaning it).
> >  >
> >  >
> >  > Thanks,
> >  >
> >  >
> >  > Laurent



Re: [Rd] Editing the "..." argument

2008-07-07 Thread SIES 73
The 'modifyList' function in the 'utils' package lets you do that in a very 
compact way:

do.call("optim", modifyList(list(), list(...)))
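
To actually supply defaults, the first argument of modifyList() would hold them. A minimal sketch follows; the helper names `optim_args` and `fit` and the default values are made up for illustration. modifyList() merges nested lists, so a user-supplied "control" is combined with the default one rather than replacing it.

```r
# Hypothetical helper: merge user arguments over package defaults.
optim_args <- function(par, ...) {
  defaults <- list(
    method  = "BFGS",
    control = list(ndeps = rep(1e-4, length(par)))
  )
  modifyList(defaults, list(...))        # user values win, nested lists merge
}

# Hypothetical fitting wrapper that forwards the merged arguments to optim().
fit <- function(fn, par, ...) {
  do.call("optim", c(list(par = par, fn = fn), optim_args(par, ...)))
}

merged <- optim_args(c(0, 0), control = list(maxit = 200))
str(merged)

res <- fit(function(p) sum((p - c(1, 2))^2), par = c(0, 0),
           control = list(maxit = 200))
res$par
```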

Regards,

Enrique

Mathieu Ribatet wrote:
> Dear all,
> 
> I'd like to tweak the ... arguments that a user can pass to my 
> function for fitting a model. More precisely, my objective function is
> (really) problematic to optimize using the "optim" function. 
> Consequently, I'd like to add in the "control" argument of the latter 
> function a "ndeps = rep(something, #par)" and/or "parscale = something"
> if the user has not specified it already.
> 
> Do you know a way to deal with this point?
> In advance, thanks.
> 
> Mathieu



Re: [Rd] Embeding R

2009-01-21 Thread SIES 73
Hi,

You may find the biocep-R project very useful: it embeds R into Java using JRI. 
It also goes much further than that, providing an impressive framework from 
which you can start with a lot of work already done. It's open source, so you 
can also just have a look at the code for inspiration:

http://biocep-distrib.r-forge.r-project.org/

It hasn't been released yet (no documentation) but it's really worth exploring.

Best,

Enrique

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On 
Behalf Of r-devel-requ...@r-project.org
Sent: Wednesday, 21 January 2009 12:00
To: r-devel@r-project.org
Subject: R-devel Digest, Vol 71, Issue 19


Message: 8
Date: Tue, 20 Jan 2009 16:37:11 +0100
From: "Sylvain Loiseau" 
Subject: [Rd] Embeding R
To: r-devel@r-project.org
Message-ID: 
Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8

Hi,

I'm planning to embed R into an application, with the following context:

- This application is written in Java (and managed with maven). I plan on  
accessing R using JRI.
- This application must be installable on several platforms (linux, mac  
os, windows).
- The R engine must embed libraries, some of them having native code in C or  
Fortran.

Does this sound reasonable? I would be very grateful to everyone providing  
links, references, feedback or advice on this question.

Best regards,
Sylvain

-- 
Sylvain Loiseau
slois...@ens-lsh.fr
