[Rd] Package inclusion in R core implementation

2019-03-01 Thread Morgan Morgan

Hi,

It sometimes happens that a package gets included in R, for example
the parallel package.

I was wondering if there is a process to decide whether or not to include a
package in the core implementation of R?

For example, why not include the Rcpp package, which has become for a lot of
users the main tool to extend R?

What is your view on the (not so well known) dotCall64 package, which is an
interesting alternative for extending R?

Thank you
Best regards,
Morgan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R C API resize matrix

2019-06-17 Thread Morgan Morgan

Hi,

Is there a way to resize a matrix defined as follows:

SEXP a = PROTECT(allocMatrix(INTSXP, 10, 2));
int *pa = INTEGER(a);

to 5 rows and 1 column, or do I have to allocate a second matrix "b" with
pointer pb and use a "for" loop to copy the values from a to b?
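
A note on the layout involved: R stores matrices in column-major order, so keeping the top-left block of a matrix means copying the first new_nrow entries of each kept column into the new allocation. The sketch below works on plain int arrays rather than SEXPs, and the helper name copy_top_left is hypothetical; in package code the destination would presumably come from a second allocMatrix call and the pointers from INTEGER().

```c
#include <assert.h>
#include <string.h>

/* Copy the top-left new_nrow x new_ncol block of a column-major
   old_nrow x old_ncol matrix `src` into `dst` (also column-major).
   Hypothetical helper; plain arrays stand in for INTEGER(a) / INTEGER(b). */
void copy_top_left(const int *src, int old_nrow, int old_ncol,
                   int *dst, int new_nrow, int new_ncol)
{
    assert(new_nrow <= old_nrow && new_ncol <= old_ncol);
    for (int j = 0; j < new_ncol; j++) {
        /* column j starts at offset j * nrow in each buffer */
        memcpy(dst + (size_t)j * new_nrow,
               src + (size_t)j * old_nrow,
               (size_t)new_nrow * sizeof(int));
    }
}
```

For the 10 x 2 to 5 x 1 case in the question this reduces to one memcpy of the first five integers, because the first column is a contiguous prefix of the buffer.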

Thank you
Best regards
Morgan


[Rd] Convert STRSXP or INTSXP to factor

2019-07-15 Thread Morgan Morgan

Hi,

Using the R C API, is there a way to convert a STRSXP or INTSXP to a
factor?

The idea would be to do in C something similar to the "factor" function
(example below):

> letters[1:5]
# [1] "a" "b" "c" "d" "e"

> factor(letters[1:5])
# [1] a b c d e
# Levels: a b c d e

There is the function setAttrib to set the levels of a SEXP; however, when
returned to R the object is of type character, not factor. Ideally, what I
would like to return from the C function is the same output as above when the
input is of type character.
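
For context: an R factor is an integer vector of 1-based codes that carries a "levels" attribute and a "class" attribute equal to "factor", so setting the levels alone is not enough, which may explain the behaviour described above. The sketch below is plain C, not R API code, and only shows the code/levels encoding step. The name encode_factor is hypothetical, and the levels are assumed to be already unique and in the desired order.

```c
#include <assert.h>
#include <string.h>

/* For each input string, store the 1-based position of the matching level
   in codes[i], or 0 when no level matches (R would use NA_integer_ there).
   Hypothetical helper working on plain C strings, not on a STRSXP. */
void encode_factor(const char *const *x, int n,
                   const char *const *levels, int nlevels, int *codes)
{
    assert(n >= 0 && nlevels >= 0);
    for (int i = 0; i < n; i++) {
        codes[i] = 0;
        for (int k = 0; k < nlevels; k++) {
            if (strcmp(x[i], levels[k]) == 0) {
                codes[i] = k + 1;   /* factor codes are 1-based */
                break;
            }
        }
    }
}
```

In an R extension the resulting codes would go into an INTSXP, with both the levels and the class attribute set on that vector before it is returned.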

Please let me know if you need more information.
Thank you
Best regards
Morgan


[Rd] Evaluate part of an expression at C level

2019-09-27 Thread Morgan Morgan

Hi,

I am wondering if the below is possible?
Let's assume I have the following expression:

1:10 < 5

Is there a way at the R C API level to evaluate only the 5th element (i.e. 5
< 5) instead of evaluating the whole expression and then selecting the 5th
element of the logical vector?

Thank you
Best regards
Morgan


[Rd] New matrix function

2019-10-11 Thread Morgan Morgan

Hi All,

I was looking for a function to find a small matrix inside a larger matrix
in R similar to the one described in the following link:

https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix

I couldn't find anything.

The above function can be seen as a "generalisation" of the "which"
function as well as the function described in the following post:

https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/

Would it be possible to add such a function to base R?

I am happy to work with someone from the R core team (if you wish) and
suggest an implementation in C.

Thank you
Best regards,
Morgan


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan

On Fri, 11 Oct 2019 10:45 Duncan Murdoch,  wrote:

> On 11/10/2019 6:44 a.m., Morgan Morgan wrote:
> > Hi All,
> >
> > I was looking for a function to find a small matrix inside a larger
> matrix
> > in R similar to the one described in the following link:
> >
> >
> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> >
> > I couldn't find anything.
> >
> > The above function can be seen as a "generalisation" of the "which"
> > function as well as the function described in the following post:
> >
> >
> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
> >
> > Would be possible to add such a function to base R?
> >
> > I am happy to work with someone from the R core team (if you wish) and
> > suggest an implementation in C.
>
> That seems like it would sometimes be a useful function, and maybe
> someone will point out a package that already contains it.  But if not,
> why would it belong in base R?
>

If someone already implemented it, that would be great indeed. I think it is a
very general and basic function, hence base R could be a good place for it?

But this is probably not a good reason; maybe someone from the R core team
can shed some light on how they decide whether or not to include a function
in base R?


> Duncan Murdoch
>


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan

How do you prove usefulness of a feature?
Do you have an example of a feature that has been added after proving to be
useful in the package space first?

Thank you,
Morgan

On Fri, 11 Oct 2019 13:53 Michael Lawrence, 
wrote:

> Thanks for this interesting suggestion, Morgan. While there is no strict
> criteria for base R inclusion, one criterion relevant in this case is that
> the usefulness of a feature be proven in the package space first.
>
> Michael


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan

I think you are confusing package and function here. Plus some of the R
Core packages, that you mention, contain functions that should probably be
replaced by functions with better implementation from packages on CRAN.

Best regards
Morgan

On Fri, 11 Oct 2019 15:22 Joris Meys,  wrote:

>
>
> On Fri, Oct 11, 2019 at 3:55 PM Morgan Morgan 
> wrote:
>
>> How do you prove usefulness of a feature?
>> Do you have an example of a feature that has been added after proving to
>> be
>> useful in the package space first?
>>
>> Thank you,
>> Morgan
>>
>
> The parallel package (a base package like utils, stats, ...) was added as
> a drop-in replacement of the packages snow and multicore for parallel
> computing. That's one example, but sure there's more.
>
> Kind regards
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g>
>
> ---
> Biowiskundedagen 2018-2019
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan

Your answer makes much more sense to me.
I will probably end up adding the function to a package.
Some processes and decisions on how R is developed seem obscure to
me.

Thank you
Morgan

On Fri, 11 Oct 2019 15:30 Avraham Adler,  wrote:

> It’s rather difficult. For example, the base R Kendall tau is written with
> the naive O(n^2). The much faster O(n log n) implementation was programmed
> and is in the pcaPP package. When I say much faster, I mean that my
> implementation in Excel VBA was faster than R for 10,000 or so pairs.
> R-Core decided not to implement that code, and instead made a note about
> the faster implementation living in pcaPP in the help for “cor”. See [1]
> for the 2012 discussion. My point is it’s really really difficult to get
> something in Base R. Develop it well, put it in a package, and you have
> basically the same result.
>
> Avi
>
> [1] https://stat.ethz.ch/pipermail/r-devel/2012-June/064351.html


Re: [Rd] New matrix function

2019-10-11 Thread Morgan Morgan

Basically the problem is to find the position of a submatrix inside a
larger matrix. Here are some links describing the problem:

https://stackoverflow.com/questions/10529278/fastest-way-to-find-a-m-x-n-submatrix-in-m-x-n-matrix

https://stackoverflow.com/questions/16750739/find-a-matrix-in-a-big-matrix
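
To make the problem statement concrete, here is a minimal plain C sketch of the naive search: try every top-left position in the large matrix and compare element by element. Both matrices are assumed to be column-major integer buffers, as R stores them. The name find_submatrix and its interface are illustrative only and do not come from any existing package.

```c
#include <assert.h>

/* Find the first occurrence (scanning column by column) of the small
   matrix s (sr x sc) inside the big matrix b (br x bc); both column-major.
   On success store the 0-based top-left row/column in *row, *col and
   return 1; return 0 when there is no match. Naive O(br*bc*sr*sc) scan. */
int find_submatrix(const int *b, int br, int bc,
                   const int *s, int sr, int sc, int *row, int *col)
{
    assert(br > 0 && bc > 0 && sr > 0 && sc > 0);
    for (int j = 0; j + sc <= bc; j++) {
        for (int i = 0; i + sr <= br; i++) {
            int match = 1;
            for (int q = 0; q < sc && match; q++)
                for (int p = 0; p < sr && match; p++)
                    if (b[(i + p) + (j + q) * br] != s[p + q * sr])
                        match = 0;
            if (match) { *row = i; *col = j; return 1; }
        }
    }
    return 0;
}
```

Returning only the first match keeps the sketch short; a version closer to "which" would collect every matching position instead.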

Best
Morgan

On Fri, 11 Oct 2019 23:36 Gabor Grothendieck, 
wrote:

> The link you posted used the same inputs as in my example. If that is
> not what you meant maybe
> a different example is needed.
> Regards.
>
> On Fri, Oct 11, 2019 at 2:39 PM Pages, Herve  wrote:
> >
> > Has someone looked into the image processing area for this? That sounds
> > a little bit too high-level for base R to me (and I would be surprised
> > if any mainstream programming language had this kind of functionality
> > built-in).
> >
> > H.
> >
> > On 10/11/19 03:44, Morgan Morgan wrote:
> > > Hi All,
> > >
> > > I was looking for a function to find a small matrix inside a larger
> matrix
> > > in R similar to the one described in the following link:
> > >
> > >
> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> > >
> > > I couldn't find anything.
> > >
> > > The above function can be seen as a "generalisation" of the "which"
> > > function as well as the function described in the following post:
> > >
> > >
> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
> > >
> > > Would be possible to add such a function to base R?
> > >
> > > I am happy to work with someone from the R core team (if you wish) and
> > > suggest an implementation in C.
> > >
> > > Thank you
> > > Best regards,
> > > Morgan
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > >
> https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpa...@fredhutch.org
> > Phone:  (206) 667-5791
> > Fax:(206) 667-1319
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>


[Rd] Questions on the R C API

2019-11-04 Thread Morgan Morgan

Hi All,

I have some questions regarding the R C API.

Let's assume I have a function which is defined as follows:

R file:

myfunc <- function(a, b, ...) .External(Cfun, a, b, ...)

C file:

SEXP Cfun(SEXP args) {
  args = CDR(args);
  SEXP a = CAR(args); args = CDR(args);
  SEXP b = CAR(args); args = CDR(args);
  /* continue to do something with remaining arguments in "..." using the
same logic as above*/

  return R_NilValue;
}

1/ Let's suppose that in my C function I change the value of a, but later want
to reset it to what it was when I did SEXP a = CAR(args);. How can I do that?

2/ Is there a way to move "args" to a specific position so that I can access a
specific value of my choice? If yes, do you have a simple example?

3/ Let's suppose now that I call the function in R. Is there a way to prevent
the function from evaluating its arguments before going to the C call? Do I
have to do it at the R level or can it be done at the C level?
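
On points 1 and 2, it may help to picture the argument pairlist as a singly linked chain of cells: args = CDR(args) only moves a local cursor, so a second variable that still points at the first cell is enough to start over, and reaching position k means following k links. The sketch below models this in plain C with a hypothetical "cell" struct (it is not R's SEXP layout); nth_value is an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pairlist cell: a value (CAR) and a link (CDR). */
typedef struct cell {
    int value;
    struct cell *next;
} cell;

/* Return the value stored k links after `head` (k = 0 is the first cell).
   `head` is passed by value, so the caller's pointer is left untouched. */
int nth_value(const cell *head, int k)
{
    assert(head != NULL && k >= 0);
    const cell *cur = head;
    while (k-- > 0) {
        cur = cur->next;
        assert(cur != NULL);   /* k must be within the list */
    }
    return cur->value;
}
```

In the Cfun example the same idea would amount to saving the incoming args in a second variable before stepping through it with CDR.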

Thank you very much in advance.
Best regards
Morgan


Re: [Rd] Questions on the R C API

2019-11-04 Thread Morgan Morgan

Thank you for your reply Jiefei.
I think in theory your solution should work. I'll have to give them a try.


On Mon, 4 Nov 2019 23:41 Wang Jiefei,  wrote:

> Hi Morgan,
>
> My solutions might not be the best one(I believe it's not), but it should
> work for your question.
>
> 1. Have you considered Rf_duplicate function? If you want to change the
> value of `a` and reset it later, you have to have a duplication somewhere
> for resetting it. Instead of changing the value of `a` directly, why not
> changing the value of a duplicated `a`? So you do not have to reset it.
>
> 2. I think a pairlist behaves like a linked list(I might be wrong here and
> please correct me if so). Therefore, there is no simple way to locate an
> element in a pairlist. As far as I know, R defines a set of
> convenient functions for you to access a limited number of elements. See
> below
>
> ```
> #define CAR(e) ((e)->u.listsxp.carval)
> #define CDR(e) ((e)->u.listsxp.cdrval)
> #define CAAR(e) CAR(CAR(e))
> #define CDAR(e) CDR(CAR(e))
> #define CADR(e) CAR(CDR(e))
> #define CDDR(e) CDR(CDR(e))
> #define CDDDR(e) CDR(CDR(CDR(e)))
> #define CADDR(e) CAR(CDR(CDR(e)))
> #define CADDDR(e) CAR(CDR(CDR(CDR(e))))
> #define CAD4R(e) CAR(CDR(CDR(CDR(CDR(e)))))
> ```
>
> You can use them to get first a few arguments from a pairlist. Another
> solution would be converting the pairlist into a list so that you can use
> the methods defined for a list to access any element. I do not know which C
> function can achieve that but `as.list` at R level should be able to do
> this job, you can evaluate an R function at C level and get the list
> result( By calling `Rf_eval`). I think this operation is relatively low
> cost because the list should only contain a set of pointers pointing to
> each element. There is no object duplication(Again I might be wrong here).
>


So there is no way to reset a pairlist to its first element?


> 3. You can get unevaluated expression at the R level before you call the C
> function and pass it to your C function( by calling `substitute` function).
> However, from my vague memory, the expression would be eventually evaluated
> at the C level even you pass the expression to it. Therefore, I think you
> can create a list of unevaluated arguments before you enter the C function,
> so your C function can expect a list rather than a pairlist as its
> argument. This can solve both your second and third questions.
>

Correct me if I am wrong but does it mean that I will have to change "..."
to "list(...)" and use .Call instead of .External?

Also, does it mean that to avoid expressions being evaluated at the R level,
I have to use "list" or "substitute"? The function "switch" in R does not
use them but manages to achieve that.

switch(1, "a", stop("a"))
#[1] "a"

It is a primitive, but I don't understand how it manages to do that.

Best,
Morgan



> Best,
> Jiefei


[Rd] Aggregate function FR

2019-11-20 Thread Morgan Morgan

Hi,

I was wondering if it would be possible to add an argument to the aggregate
function to retain NA as a category? (The default could be not to, in order to
avoid breaking code.) Please see the example below:

df = iris
df$Species[5] = NA
aggregate(`Petal.Width` ~ Species, df, sum) # does not include NA
aggregate(`Petal.Width` ~ addNA(Species), df, sum) # include NA

data.table and dplyr include NA by default.
Python pandas has an aggregate function inspired by base R aggregate. An
option has been added to include NA.

Thank you
Best regards
Morgan


[Rd] Long vector support in data.frame

2020-01-23 Thread Morgan Morgan

Hi All,

Happy New Year!

I was wondering if there is a plan at some point to support long vectors in
data.frames?
I understand that it would need some internal changes to lift the current
limit.
If there is a plan what is currently preventing it from happening? Is it
time, resources? If so is there a way for people willing to help to
contribute or help the R-dev team? How?
I noticed that an increasing number of functions support long vectors
in base R. Are there more functions that need to support long vectors before
data.frames can support them?

Thank you
Best regards
Morgan


[Rd] Hash functions at C level

2020-05-03 Thread Morgan Morgan

Dear R-dev,

Hope you are all well.
I would like to know if there is a hash function available for the R C API?
I noticed that there are hash structures and functions defined in the file
"unique.c". These would definitely suit my needs; however, is there a way to
access them at C level?
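
As a point of comparison, a small open-addressing hash set for integers is short to write by hand. The sketch below is plain C and is not the code from unique.c; the names count_unique and hash_int, the multiplicative constant, and the table sizing are all choices made for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiplicative hash of a 32-bit key into a table of size 2^bits
   (bits must be between 1 and 31). */
static uint32_t hash_int(int key, unsigned bits)
{
    return (uint32_t)((uint32_t)key * 2654435761u) >> (32 - bits);
}

/* Count distinct values in x[0..n-1] using linear probing.
   Returns -1 if the table cannot be allocated. */
int count_unique(const int *x, int n)
{
    assert(n >= 0 && n <= (1 << 29));
    unsigned bits = 1;
    while (((size_t)1 << bits) < 2 * (size_t)n) bits++;  /* load factor <= 0.5 */
    size_t size = (size_t)1 << bits;

    int *slot = malloc(size * sizeof *slot);       /* stored keys */
    unsigned char *used = calloc(size, 1);         /* occupancy flags */
    if (slot == NULL || used == NULL) { free(slot); free(used); return -1; }

    int count = 0;
    for (int i = 0; i < n; i++) {
        size_t h = hash_int(x[i], bits);
        while (used[h] && slot[h] != x[i])
            h = (h + 1) & (size - 1);              /* step to the next slot */
        if (!used[h]) { used[h] = 1; slot[h] = x[i]; count++; }
    }
    free(slot);
    free(used);
    return count;
}
```

Keeping the table at most half full bounds the probe lengths in practice; storing the index i instead of the key would turn the same loop into a "duplicated"-style lookup.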
Thank you for your time.
Best regards
Morgan


[Rd] psum/pprod

2020-05-16 Thread Morgan Morgan

Good morning All,

Just wanted to do a quick follow-up on this thread:
https://r.789695.n4.nabble.com/There-is-pmin-and-pmax-each-taking-na-rm-how-about-psum-td4647841.html

For those (including the R-core team) of you who are interested in a C
implementation of psum and pprod there is one in the "kit" package (I am
the author) on CRAN.
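
For readers unfamiliar with the idea, a "parallel" sum adds several vectors element by element, in the way pmin and pmax work across their arguments. The sketch below is a plain C illustration and not the kit implementation; the name psum_sketch, the use of NaN as a stand-in for NA, and the na_rm flag are assumptions made for the example.

```c
#include <assert.h>

/* Element-wise ("parallel") sum of k vectors of length n.
   A NaN stands in for a missing value: with na_rm != 0 it is skipped,
   otherwise it propagates into the result through the addition. */
void psum_sketch(const double *const *vecs, int k, int n, int na_rm,
                 double *out)
{
    assert(k > 0 && n >= 0);
    for (int i = 0; i < n; i++) {
        double s = 0.0;
        for (int j = 0; j < k; j++) {
            double v = vecs[j][i];
            if (na_rm && v != v) continue;   /* v != v is true only for NaN */
            s += v;
        }
        out[i] = s;
    }
}
```

Note that when every input at a position is missing and na_rm is set, this sketch yields 0 for that position, mirroring sum(numeric(0)).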

I will continue working on the package in my spare time if I see that users
are missing basic functionalities not implemented in base R.

Have a great weekend.
Kind regards
Morgan Jacob


[Rd] Precision of function mean,bug?

2020-05-20 Thread Morgan Morgan

Hello R-dev,

Yesterday, while I was testing the newly implemented function pmean in
package kit, I noticed a mismatch in the output of the below R expressions.

set.seed(123)
n=1e3L
idx=5
x=rnorm(n)
y=rnorm(n)
z=rnorm(n)
a=(x[idx]+y[idx]+z[idx])/3
b=mean(c(x[idx],y[idx],z[idx]))
a==b
# [1] FALSE

For idx = 1, 2, 3, 4 the last line returns TRUE. For 5, 6 and many
others it returns FALSE; the difference is small but it is there.
Is that expected or is it a bug?
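
The usual way to check results like these is to compare within a tolerance rather than with ==, because floating-point addition is not associative and a mean computed with a correction pass can differ in the last bit from a plain sum divided by n. A minimal C sketch follows. mean_refined shows a two-pass scheme only in outline and is not R's source, and nearly_equal with its tolerance is an illustrative choice.

```c
#include <assert.h>

/* One-pass mean: sum then divide. */
double mean_plain(const double *x, int n)
{
    assert(n > 0);
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i];
    return s / n;
}

/* Two-pass mean: start from the plain mean, then add the mean of the
   residuals to absorb some of the rounding error of the first pass. */
double mean_refined(const double *x, int n)
{
    double m = mean_plain(x, n);
    double r = 0.0;
    for (int i = 0; i < n; i++) r += x[i] - m;
    return m + r / n;
}

static double absd(double v) { return v < 0 ? -v : v; }

/* Relative comparison with an absolute floor of 1.0 on the scale;
   tol is an illustrative choice, not a universal one. */
int nearly_equal(double a, double b, double tol)
{
    double scale = absd(a) > absd(b) ? absd(a) : absd(b);
    if (scale < 1.0) scale = 1.0;
    return absd(a - b) <= tol * scale;
}
```

With a check such as nearly_equal(a, b, 1e-12), the two ways of computing the mean agree for every idx; at the R level, all.equal() plays a comparable role.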

Thank you
Best Regards
Morgan Jacob


Re: [Rd] Precision of function mean,bug?

2020-05-21 Thread Morgan Morgan

Sorry, posting back to the list.
Thank you all.
Morgan

On Thu, 21 May 2020, 16:33 Henrik Bengtsson, 
wrote:

> Hi.
>
> Good point and a good example. Feel free to post to the list. The purpose
> of my reply wasn't to take away Peter's point but to emphasize that
> base::mean() does a two-pass scan over the elements to lower the impact of
> addition of values with widely different values (classical problem in
> numerical analysis). But I can see how it may look like that.
>
> Cheers,
>
> Henrik
>
>
> On Thu, May 21, 2020, 03:21 Morgan Morgan 
> wrote:
>
>> Thank you Henrik for the feedback.
>> Note that for idx=4 and refine = TRUE,  your equality b==c is FALSE. I
>> think that as Peter said == can't be trusted with FP.
>> His example is good. Here is an even more shocking one.
>> a=0.786546798
>> b=a+ 1e6 -1e6
>> a==b
>> # [1] FALSE
>>
>> Best regards
>> Morgan Jacob
>>
>> On Wed, 20 May 2020, 20:18 Henrik Bengtsson, 
>> wrote:
>>
>>> On Wed, May 20, 2020 at 11:10 AM brodie gaslam via R-devel
>>>  wrote:
>>> >
>>> >  > On Wednesday, May 20, 2020, 7:00:09 AM EDT, peter dalgaard <
>>> pda...@gmail.com> wrote:
>>> > >
>>> > > Expected, see FAQ 7.31.
>>> > >
>>> > > You just can't trust == on FP operations. Notice also
>>> >
>>> > Additionally, since you're implementing a "mean" function you are
>>> testing
>>> > against R's mean, you might want to consider that R uses a two-pass
>>> > calculation[1] to reduce floating point precision error.
>>>
>>> This one is important.
>>>
>>> FWIW, matrixStats::mean2() provides argument refine=TRUE/FALSE to
>>> calculate mean with and without this two-pass calculation;
>>>
>>> > a <- c(x[idx],y[idx],z[idx]) / 3
>>> > b <- mean(c(x[idx],y[idx],z[idx]))
>>> > b == a
>>> [1] FALSE
>>> > b - a
>>> [1] 2.220446e-16
>>>
>>> > c <- matrixStats::mean2(c(x[idx],y[idx],z[idx]))  ## default to
>>> refine=TRUE
>>> > b == c
>>> [1] TRUE
>>> > b - c
>>> [1] 0
>>>
>>> > d <- matrixStats::mean2(c(x[idx],y[idx],z[idx]), refine=FALSE)
>>> > a == d
>>> [1] TRUE
>>> > a - d
>>> [1] 0
>>> > c == d
>>> [1] FALSE
>>> > c - d
>>> [1] 2.220446e-16
>>>
>>> Not surprisingly, the two-pass higher-precision version (refine=TRUE)
>>> takes roughly twice as long as the one-pass quick version
>>> (refine=FALSE).
>>>
>>> /Henrik
>>>
>>> >
>>> > Best,
>>> >
>>> > Brodie.
>>> >
>>> > [1]
>>> https://github.com/wch/r-source/blob/tags/R-4-0-0/src/main/summary.c#L482
>>> >
>>> > > > a2=(z[idx]+x[idx]+y[idx])/3
>>> > > > a2==a
>>> > > [1] FALSE
>>> > > > a2==b
>>> > > [1] TRUE
>>> > >
>>> > > -pd
>>> > >
>>> > > > On 20 May 2020, at 12:40 , Morgan Morgan <
>>> morgan.email...@gmail.com> wrote:
>>> > > >
>>> > > > Hello R-dev,
>>> > > >
>>> > > > Yesterday, while I was testing the newly implemented function
>>> pmean in
>>> > > > package kit, I noticed a mismatch in the output of the below R
>>> expressions.
>>> > > >
>>> > > > set.seed(123)
>>> > > > n=1e3L
>>> > > > idx=5
>>> > > > x=rnorm(n)
>>> > > > y=rnorm(n)
>>> > > > z=rnorm(n)
>>> > > > a=(x[idx]+y[idx]+z[idx])/3
>>> > > > b=mean(c(x[idx],y[idx],z[idx]))
>>> > > > a==b
>>> > > > # [1] FALSE
>>> > > >
>>> > > > For idx= 1, 2, 3, 4 the last line is equal to TRUE. For 5, 6 and
>>> many
>>> > > > others the difference is small but still.
>>> > > > Is that expected or is it a bug?
>>> >
>>> > __
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>


[Rd] subset data.frame at C level

2020-06-17 Thread Morgan Morgan

Hi,

Hope you are well.

I was wondering if there is a function at C level that is equivalent to
mtcars$carb or .subset2(mtcars, "carb").

If I have the index of the column then the answer would be VECTOR_ELT(df,
asInteger(idx)) but I was wondering if there is a way to do it directly
from the name of the column without having to loop over columns names to
find the index?
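
For reference, the loop in question is short. The sketch below is plain C on ordinary strings, and find_name is a hypothetical helper; in package code the strings would come from the names attribute (getAttrib(df, R_NamesSymbol) read through STRING_ELT and CHAR), and the returned position would be passed to VECTOR_ELT.

```c
#include <assert.h>
#include <string.h>

/* Return the 0-based position of `name` in names[0..n-1], or -1 if absent.
   Linear scan with strcmp; hypothetical helper on plain C strings. */
int find_name(const char *const *names, int n, const char *name)
{
    assert(names != NULL && name != NULL);
    for (int i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    return -1;
}
```

A data.frame rarely has enough columns for the linear scan to matter; a hash lookup would only pay off when many names are resolved against a very wide table.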

Thank you
Best regards
Morgan


Re: [Rd] subset data.frame at C level

2020-06-24 Thread Morgan Morgan

Thank you Jim for the feedback.

I actually implemented it the way I describe it in my first email and it
seems fast enough for me.

Just to give a bit of context, I will need it at some point in package kit.
I also implemented subsetting by row, which I actually need more as I am working
on a faster version of the unique and duplicated functions. The function
unique is particularly slow for data.frames. So far I got a 100x speedup.

Best regards
Morgan


On Tue, 23 Jun 2020, 21:11 Jim Hester,  wrote:

> It looks to me like internally .subset2 uses `get1index()`, but this
> function is declared in Defn.h, which AFAIK is not part of the exported R
> API.
>
>  Looking at the code for `get1index()` it looks like it just loops over
> the (translated) names, so I guess I just do that [0].
>
> [0]:
> https://github.com/r-devel/r-svn/blob/1ff1d4197495a6ee1e1d88348a03ff841fd27608/src/main/subscript.c#L226-L235


[Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan

Hi All,

I was reading section 5.11 (Evaluating R expressions from C) of the Writing R
Extensions manual and I tried to build a simple call to the sum function.
Please see below.

call_to_sum <- inline::cfunction(
  language = "C",
  sig = c(x = "SEXP"), body = "

SEXP e = PROTECT(lang2(install(\"sum\"), x));
SEXP ans = PROTECT(eval(e, R_GlobalEnv));
UNPROTECT(2);
return ans;

")

call_to_sum(1:3)

The above works. My question is how do I add the argument "na.rm=TRUE" at C
level to the above call? I have tried various things based on what is in
section 5.11 but I did not manage to get it to work.

Thank you
Best regards


Re: [Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan

Thanks Jan and Tomas for the feedback.
Answer from Jan is what I am looking for.
Maybe I am not looking in the right place, but it is not easy to understand
how these LCONS, CONS, SETCDR, etc. work.

Thank you
Best regards
Morgan



On Tue, 30 Jun 2020, 12:36 Tomas Kalibera,  wrote:

> On 6/30/20 1:06 PM, Jan Gorecki wrote:
> > It is quite known that R documentation on R C api could be improved...
>
> Please see "5.11 Evaluating R expressions from C" from "Writing R
> Extensions"
>
> Best
> Tomas
>
> > Still R-package-devel mailing list should be preferred for this kind
> > of questions.
> > Not sure if that is the best way, but works.
> >
> > call_to_sum <- inline::cfunction(
> >   language = "C",
> >   sig = c(x = "SEXP"), body = "
> >
> > SEXP e = PROTECT(lang2(install(\"sum\"), x));
> > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue));
> > SETCDR(CDR(e), r_true);
> > SET_TAG(CDDR(e), install(\"na.rm\"));
> > Rf_PrintValue(e);
> > SEXP ans = PROTECT(eval(e, R_GlobalEnv));
> > UNPROTECT(3);
> > return ans;
> >
> > ")
> >
> > call_to_sum(c(1L,NA,3L))
> >
> > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan
> >  wrote:
> >> Hi All,
> >>
> >> I was reading the R extension manual section 5.11 ( Evaluating R
> expression
> >> from C) and I tried to build a simple call to the sum function. Please
> see
> >> below.
> >>
> >> call_to_sum <- inline::cfunction(
> >>   language = "C",
> >>   sig = c(x = "SEXP"), body = "
> >>
> >> SEXP e = PROTECT(lang2(install(\"sum\"), x));
> >> SEXP ans = PROTECT(eval(e, R_GlobalEnv));
> >> UNPROTECT(2);
> >> return ans;
> >>
> >> ")
> >>
> >> call_to_sum(1:3)
> >>
> >> The above works. My question is how do I add the argument "na.rm=TRUE"
> at C
> >> level to the above call? I have tried various things based on what is in
> >> section 5.11 but I did not manage to get it to work.
> >>
> >> Thank you
> >> Best regards
> >>
> >>  [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>

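
For readers who, like the poster, find CONS, SETCDR and SET_TAG hard to
picture: a call such as sum(x, na.rm = TRUE) is a linked list of cells, each
with a value, an optional name (tag) and a pointer to the rest of the list.
The sketch below models that in plain C and repeats the four steps of the
quoted answer. It is only a model under that assumption: Cell, cons and
build_sum_call are hypothetical names, values are shown as text, and
PROTECT and garbage collection are left out entirely.

```c
#include <stdlib.h>

/* One list cell: a plain-C stand-in for a pairlist node. */
typedef struct Cell {
    const char *tag;   /* argument name, NULL when the argument is unnamed */
    const char *car;   /* the value held by this cell, shown here as text */
    struct Cell *cdr;  /* the rest of the list, NULL at the end */
} Cell;

/* Allocate a cell holding `car` in front of the list `cdr`. */
Cell *cons(const char *car, Cell *cdr) {
    Cell *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->tag = NULL;
    c->car = car;
    c->cdr = cdr;
    return c;
}

/* Release every cell of a list. */
void free_list(Cell *c) {
    while (c != NULL) {
        Cell *next = c->cdr;
        free(c);
        c = next;
    }
}

/* Count the cells of a list. */
size_t list_length(const Cell *c) {
    size_t n = 0;
    for (; c != NULL; c = c->cdr) ++n;
    return n;
}

/* Build the list for sum(x, na.rm = TRUE); returns NULL if allocation fails. */
Cell *build_sum_call(void) {
    Cell *arg = cons("x", NULL);
    Cell *e = cons("sum", arg);         /* like lang2(install("sum"), x) */
    Cell *r_true = cons("TRUE", NULL);  /* like CONS(ScalarLogical(1), R_NilValue) */
    if (arg == NULL || e == NULL || r_true == NULL) {
        if (e != NULL) free_list(e); else free_list(arg);
        free_list(r_true);
        return NULL;
    }
    e->cdr->cdr = r_true;               /* like SETCDR(CDR(e), r_true) */
    e->cdr->cdr->tag = "na.rm";         /* like SET_TAG(CDDR(e), install("na.rm")) */
    return e;
}
```

After build_sum_call() the list has three cells: the function name, the
unnamed argument x, and a third cell tagged na.rm. That tag on the third cell
is what makes the argument a named one.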

Re: [Rd] Build a R call at C level

2020-06-30 Thread Morgan Morgan

Sorry Dirk, I don't remember discussing this topic or alternatives with you
at all.
Have a nice day.

On Tue, 30 Jun 2020, 14:42 Morgan Morgan,  wrote:

> Thanks Jan and Tomas for the feedback.
> Answer from Jan is what I am looking for.
> Maybe I am not looking in the right place, but it is not easy to understand
> how these LCONS, CONS, SETCDR, etc. work.
>
> Thank you
> Best regards
> Morgan
>
>
>
> On Tue, 30 Jun 2020, 12:36 Tomas Kalibera, 
> wrote:
>
>> On 6/30/20 1:06 PM, Jan Gorecki wrote:
>> > It is quite known that R documentation on R C api could be improved...
>>
>> Please see "5.11 Evaluating R expressions from C" from "Writing R
>> Extensions"
>>
>> Best
>> Tomas
>>
>> > Still R-package-devel mailing list should be preferred for this kind
>> > of questions.
>> > Not sure if that is the best way, but works.
>> >
>> > call_to_sum <- inline::cfunction(
>> >   language = "C",
>> >   sig = c(x = "SEXP"), body = "
>> >
>> > SEXP e = PROTECT(lang2(install(\"sum\"), x));
>> > SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue));
>> > SETCDR(CDR(e), r_true);
>> > SET_TAG(CDDR(e), install(\"na.rm\"));
>> > Rf_PrintValue(e);
>> > SEXP ans = PROTECT(eval(e, R_GlobalEnv));
>> > UNPROTECT(3);
>> > return ans;
>> >
>> > ")
>> >
>> > call_to_sum(c(1L,NA,3L))
>> >
>> > On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan
>> >  wrote:
>> >> Hi All,
>> >>
>> >> I was reading the R extension manual section 5.11 ( Evaluating R
>> expression
>> >> from C) and I tried to build a simple call to the sum function. Please
>> see
>> >> below.
>> >>
>> >> call_to_sum <- inline::cfunction(
>> >>   language = "C",
>> >>   sig = c(x = "SEXP"), body = "
>> >>
>> >> SEXP e = PROTECT(lang2(install(\"sum\"), x));
>> >> SEXP ans = PROTECT(eval(e, R_GlobalEnv));
>> >> UNPROTECT(2);
>> >> return ans;
>> >>
>> >> ")
>> >>
>> >> call_to_sum(1:3)
>> >>
>> >> The above works. My question is how do I add the argument "na.rm=TRUE"
>> at C
>> >> level to the above call? I have tried various things based on what is
>> in
>> >> section 5.11 but I did not manage to get it to work.
>> >>
>> >> Thank you
>> >> Best regards
>> >>
>> >>  [[alternative HTML version deleted]]
>> >>
>> >> __
>> >> R-devel@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>


[Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan

Hi,
I am not sure if this is the right mailing list, so apologies in advance if
it is not.

I found the following link/presentation:
https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf

The implementation of fsort is interesting but incomplete (I am not sure why)
and could be made faster (by at least 25%, I believe). I might be wrong, but
there may be a couple of bugs as well.

My questions are:

1/ Is the R Core team interested in a faster sorting algorithm (multithreaded
or even single-threaded)?

2/ I see an issue with the license, which is MPL-2.0 and hence not
compatible with base R, Python and Julia. Is there interest in changing
the license of fsort so that all 3 languages (and all the people using these
languages) can benefit from it (as suggested on the first page)?

Please let me know if there is interest in addressing the above points. I
would be happy to look into it (free of charge, of course!).

Thank you
Best regards
Morgan


Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan

The default method for sort is not radix (especially for character vectors). You
might want to read the documentation of sort.
For your second question, I invite you to look at the code of fsort. It is
implemented only for positive finite doubles, and falls back to
data.table:::forder when the input is of any other type.
Please read the PDF I linked; everything is explained in it.
Thank you
Morgan

On Mon, 15 Mar 2021, 16:52 Avraham Adler,  wrote:

> Isn’t the default method now “radix” which is the data.table sort, and
> isn’t that already parallel using openmp where available?
>
> Avi
>
> On Mon, Mar 15, 2021 at 12:26 PM Morgan Morgan 
> wrote:
>
>> Hi,
>> I am not sure if this is the right mailing list, so apologies in advance
>> if
>> it is not.
>>
>> I found the following link/presentation:
>> https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf
>>
>> The implementation of fsort is interesting but incomplete (not sure why?)
>> and can be improved or made faster (at least 25%  I believe). I might be
>> wrong but there are maybe a couple of bugs as well.
>>
>> My questions are:
>>
>> 1/ Is the R Core team interested in a faster sorting algo? (Multithread or
>> even single threaded)
>>
>> 2/ I see an issue with the license, which is MPL-2.0, and hence not
>> compatible with base R, Python and Julia. Is there an interest to change
>> the license of fsort so all 3 languages (and all the people using these
>> languages) can benefit from it? (Like suggested on the first page)
>>
>> Please let me know if there is an interest to address the above points, I
>> would be happy to look into it (free of charge of course!).
>>
>> Thank you
>> Best regards
>> Morgan
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> --
> Sent from Gmail Mobile
>

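
On the remark in this reply that fsort handles only positive finite doubles,
here is a sketch of why that restriction is convenient for a radix sort: the
IEEE 754 bit pattern of a positive finite double, read as an unsigned 64-bit
integer, already sorts in numeric order, whereas negative values need their
bits flipped first. The code below is a generic byte-wise (LSD) radix sort
written for illustration, not fsort's algorithm. The names double_to_key,
key_to_double and radix_sort_doubles are hypothetical, a 64-bit IEEE 754
double is assumed, and NaN/NA handling is left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Map a double to a key whose unsigned order matches numeric order. */
static uint64_t double_to_key(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    /* Negative: flip every bit. Non-negative: set the sign bit. */
    return (bits >> 63) ? ~bits : (bits | UINT64_C(0x8000000000000000));
}

/* Inverse of double_to_key. */
static double key_to_double(uint64_t key) {
    uint64_t bits = (key >> 63) ? (key & UINT64_C(0x7FFFFFFFFFFFFFFF)) : ~key;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Sort x[0..n-1] in increasing order. Returns 0, or -1 if allocation fails. */
int radix_sort_doubles(double *x, size_t n) {
    if (n < 2) return 0;
    uint64_t *keys = malloc(n * sizeof *keys);
    uint64_t *tmp = malloc(n * sizeof *tmp);
    if (keys == NULL || tmp == NULL) {
        free(keys);
        free(tmp);
        return -1;
    }
    for (size_t i = 0; i < n; ++i) keys[i] = double_to_key(x[i]);

    /* Eight stable counting passes, one per byte, least significant first. */
    for (int shift = 0; shift < 64; shift += 8) {
        size_t count[257] = {0};
        for (size_t i = 0; i < n; ++i) count[((keys[i] >> shift) & 0xFF) + 1]++;
        for (int b = 0; b < 256; ++b) count[b + 1] += count[b];
        for (size_t i = 0; i < n; ++i) tmp[count[(keys[i] >> shift) & 0xFF]++] = keys[i];
        uint64_t *swap = keys;
        keys = tmp;
        tmp = swap;
    }

    for (size_t i = 0; i < n; ++i) x[i] = key_to_double(keys[i]);
    free(keys);
    free(tmp);
    return 0;
}
```

If the input is known to be positive and finite, double_to_key reduces to
copying the bits, which is the simplification the restriction buys.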

Re: [Rd] Faster sorting algorithm...

2021-03-17 Thread Morgan Morgan

Thank you Neal. This is interesting. I will have a look at pqR.
Indeed radix only does C collation, I believe that is why it is not the
default choice for character ordering and sorting.
Not sure but I believe it can help address the following bugzilla item:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400

On the same topic of collation, there is an experimental sorting function
"psort" in package kit that might help address this issue.

> library(kit)
Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
> x <- c("b","A","B","a","\xe4")
> Encoding(x) <- "latin1"
> identical(psort(x, c.locale=FALSE), sort(x))
[1] TRUE
> identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
[1] TRUE

Coming back to the topic of fsort, I have just finished the implementation
for double, integer, factor and logical.
The implementation takes NA, Inf, etc. into account. Values can be
sorted in decreasing or increasing order.
Benchmarked against the current implementation in data.table, it is
currently over 30% faster.
There might be bugs, but I am sure performance can be improved further as I did
not really try hard.
If there is interest in both the implementation and cross-community
sharing, please let me know.

Best regards,
Morgan

On Wed, 17 Mar 2021, 00:37 Radford Neal,  wrote:

> Those interested in faster sorting may want to look at the merge sort
> implemented in pqR (see pqR-project.org).  It's often used as the
> default, because it is stable, and does different collations, while
> being faster than shell sort (except for small vectors).
>
> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> compiled identically:
>
> -
> pqR-2020-07-23 in C locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   1.332   0.000   1.334
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.092   0.004   0.096
> > print(system.time (om <- order(x,method="merge")))
>user  system elapsed
>   0.363   0.000   0.363
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 1 2
>
> -
> R-4.0.2 in C locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   2.381   0.004   2.387
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.138   0.000   0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
> 
> pqR-2020-07-23 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   2.960   0.000   2.962
> > print(system.time (or <- order(x,method="radix")))
> utilisateur système  écoulé
>   0.083   0.008   0.092
> > print(system.time (om <- order(x,method="merge")))
> utilisateur système  écoulé
>   1.143   0.000   1.142
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 2 1
>
> 
> R-4.0.2 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 100
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   4.222   0.016   4.239
> > print(system.time (or <- order(x,method="radix"

Re: [Rd] Faster sorting algorithm...

2021-03-21 Thread Morgan Morgan

My apologies to Professor Neal.
Thank you for correcting me.
Best regards
Morgan


On Mon, 22 Mar 2021, 05:05 ,  wrote:

> I think it is "Professor Neal" :)
>
> I also appreciate the pqR comparisons.
>
> On Wed, Mar 17, 2021 at 09:23:15AM +, Morgan Morgan wrote:
> >Thank you Neal. This is interesting. I will have a look at pqR.
> >Indeed radix only does C collation, I believe that is why it is not the
> >default choice for character ordering and sorting.
> >Not sure but I believe it can help address the following bugzilla item:
> >https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400
> >
> >On the same topic of collation, there is an experimental sorting function
> >"psort" in package kit that might help address this issue.
> >
> >> library(kit)
> >Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
> >> x <- c("b","A","B","a","\xe4")
> >> Encoding(x) <- "latin1"
> >> identical(psort(x, c.locale=FALSE), sort(x))
> >[1] TRUE
> >> identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
> >[1] TRUE
> >
> >Coming back to the topic of fsort, I have just finished the implementation
> >for double, integer, factor and logical.
> >The implementation takes NA, Inf, etc. into account. Values can be
> >sorted in decreasing or increasing order.
> >Benchmarked against the current implementation in data.table, it is
> >currently over 30% faster.
> >There might be bugs, but I am sure performance can be improved further as I
> >did not really try hard.
> >If there is interest in both the implementation and cross-community
> >sharing, please let me know.
> >
> >Best regards,
> >Morgan
> >
> >On Wed, 17 Mar 2021, 00:37 Radford Neal,  wrote:
> >
> >> Those interested in faster sorting may want to look at the merge sort
> >> implemented in pqR (see pqR-project.org).  It's often used as the
> >> default, because it is stable, and does different collations, while
> >> being faster than shell sort (except for small vectors).
> >>
> >> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> >> compiled identically:
> >>
> >> -
> >> pqR-2020-07-23 in C locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.time (os <- order(x,method="shell")))
> >>user  system elapsed
> >>   1.332   0.000   1.334
> >> > print(system.time (or <- order(x,method="radix")))
> >>user  system elapsed
> >>   0.092   0.004   0.096
> >> > print(system.time (om <- order(x,method="merge")))
> >>user  system elapsed
> >>   0.363   0.000   0.363
> >> > print(identical(os,or))
> >> [1] TRUE
> >> > print(identical(os,om))
> >> [1] TRUE
> >> >
> >> > x <- c("a","~")
> >> > print(order(x,method="shell"))
> >> [1] 1 2
> >> > print(order(x,method="radix"))
> >> [1] 1 2
> >> > print(order(x,method="merge"))
> >> [1] 1 2
> >>
> >> -
> >> R-4.0.2 in C locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.time (os <- order(x,method="shell")))
> >>user  system elapsed
> >>   2.381   0.004   2.387
> >> > print(system.time (or <- order(x,method="radix")))
> >>user  system elapsed
> >>   0.138   0.000   0.137
> >> > #print(system.time (om <- order(x,method="merge")))
> >> > print(identical(os,or))
> >> [1] TRUE
> >> > #print(identical(os,om))
> >> >
> >> > x <- c("a","~")
> >> > print(order(x,method="shell"))
> >> [1] 1 2
> >> > print(order(x,method="radix"))
> >> [1] 1 2
> >> > #print(order(x,method="merge"))
> >>
> >> 
> >> pqR-2020-07-23 in fr_CA.utf8 locale:
> >>
> >> > set.seed(1)
> >> > N <- 100
> >> > x <- as.character (sample(N,N,replace=TRUE))
> >> > print(system.t

[Rd] R Console Bug?

2021-04-16 Thread Morgan Morgan

Hi,

I am getting a really weird behaviour with the R console.
Here is the code to reproduce it.

1/ C code: ---

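#include <string.h>     /* memset */
#include <R.h>          /* REprintf, R_FlushConsole */
#include <Rinternals.h> /* SEXP, INTEGER, R_NilValue */

SEXP printtest(SEXP x) {
  const int PBWIDTH = 30, loop = INTEGER(x)[0];
  int val, lpad;
  double perc;
  /* The two buffers are not NUL-terminated; they are only printed via %.*s. */
  char PBSTR[PBWIDTH], PBOUT[PBWIDTH];
  memset(PBSTR, '=', sizeof(PBSTR));
  memset(PBOUT, '-', sizeof(PBOUT));
  for (int k = 0; k < 3; ++k) {
    REprintf("\n   Processing data chunk %d of 3\n", k + 1);
    for (int i = 0; i < loop; ++i) {
      /* Fraction completed; guarded so that loop == 1 does not divide by zero. */
      perc = loop > 1 ? (double) i / (loop - 1) : 1.0;
      val  = (int) (perc * 100);
      lpad = (int) (perc * PBWIDTH);
      REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT, val);
      R_FlushConsole();
    }
    REprintf("\n");
  }
  return R_NilValue;
}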

2/ Build so/dll: ---

R CMD SHLIB

3/ Run code :  ---

dyn.load("test.so")
.Call("printtest",1e4L)
dyn.unload("test.so")

4/ Issue:  ---
If you run the above code in RStudio, it works well both on Mac and Windows.
If you run it in Windows cmd, it is slow.
If you run it in Windows RGui, it is slow but also all texts are flushed.
If you run it in Mac terminal, it runs perfectly.
If you run it in Mac R Console, it prints something like :
> .Call("printtest",1e4L)
 [==] 100%NULL]   0%

I am using R 4.0.4 (Mac) / 4.0.5 (Windows)

Is that a bug or am I doing something wrong?

Thank you
Best regards,
Morgan


Re: [Rd] R Console Bug?

2021-04-17 Thread Morgan Morgan

Hi Simon,
Thank you for the feedback.
It is really strange that you have a different output.
I have attached a picture of my R console.
I am just trying to port some pure C code that prints progress bars to R
but it does not seem to be printing properly.
It seems I am doing something wrong with REprintf and R_FlushConsole.
Best regards,
Morgan

On Sat, Apr 17, 2021 at 12:36 AM Simon Urbanek 
wrote:

> Sorry, unable to reproduce on macOS, in R console:
>
> > dyn.load("test.so")
> > .Call("printtest",1e4L)
>
>Processing data chunk 1 of 3
>  [==] 100%
>
>Processing data chunk 2 of 3
>  [==] 100%
>
>Processing data chunk 3 of 3
>  [==] 100%
> NULL
>
> But honestly I'm not sure I understand the report. R_FlushConsole is
> a no-op for terminal console and your code just prints on stderr anyway
> (which is not buffered). All this does is just a lot of \r output (which is
> highly inefficient anywhere but in Terminal by definition). Can you clarify
> what the code tries to trigger?
>
> Cheers,
> Simon
>
>
> > On Apr 16, 2021, at 23:11, Morgan Morgan 
> wrote:
> >
> > Hi,
> >
> > I am getting a really weird behaviour with the R console.
> > Here is the code to reproduce it.
> >
> > 1/ C code: ---
> >
> > SEXP printtest(SEXP x) {
> >  const int PBWIDTH = 30, loop = INTEGER(x)[0];
> >  int val, lpad;
> >  double perc;
> >  char PBSTR[PBWIDTH], PBOUT[PBWIDTH];
> >  memset(PBSTR,'=', sizeof(PBSTR));
> >  memset(PBOUT,'-', sizeof(PBOUT));
> >  for (int k = 0; k < 3; ++k) {
> >REprintf("\n   Processing data chunk %d of 3\n",k+1);
> >for (int i = 0; i < loop; ++i) {
> >  perc = (double) i/(loop-1);
> >  val  = (int) (perc * 100);
> >  lpad = (int) (perc * PBWIDTH);
> >  REprintf("\r [%.*s%.*s] %3d%%", lpad, PBSTR, PBWIDTH - lpad, PBOUT,
> > val);
> >  R_FlushConsole();
> >}
> >REprintf("\n");
> >  }
> >  return R_NilValue;
> > }
> >
> > 2/ Build so/dll: ---
> >
> > R CMD SHLIB
> >
> > 3/ Run code :  ---
> >
> > dyn.load("test.so")
> > .Call("printtest",1e4L)
> > dyn.unload("test.so")
> >
> > 4/ Issue:  ---
> > If you run the above code in RStudio, it works well both on Mac and
> Windows.
> > If you run it in Windows cmd, it is slow.
> > If you run it in Windows RGui, it is slow but also all texts are flushed.
> > If you run it in Mac terminal, it runs perfectly.
> > If you run it in Mac R Console, it prints something like :
> >> .Call("printtest",1e4L)
> > [==] 100%NULL]
>  0%
> >
> > I am using R 4.0.4 (Mac) / 4.0.5 (Windows)
> >
> > Is that a bug or am I doing something wrong?
> >
> > Thank you
> > Best regards,
> > Morgan
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
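
One way to reduce the volume of \r output mentioned in the reply is to redraw
the bar only when the displayed percentage changes. The following is a sketch
in plain C that writes to a FILE *, with hypothetical helper names
(bar_percent, bar_render, bar_run). In an R extension the fprintf/fflush pair
would be replaced by REprintf and R_FlushConsole, as in the original code.
Whether this changes what the Mac R Console displays is not something the
sketch establishes.

```c
#include <stdio.h>
#include <string.h>

#define BAR_WIDTH 30

/* Percentage completed at 0-based step i out of `loop` steps. */
int bar_percent(int i, int loop) {
    if (loop <= 1) return 100;  /* avoids dividing by zero */
    return (int) (100.0 * i / (loop - 1));
}

/* Format "[====----] NNN%" into `out` and return `out`. */
char *bar_render(char *out, size_t outsz, int percent) {
    char fill[BAR_WIDTH + 1], rest[BAR_WIDTH + 1];
    int lpad = percent * BAR_WIDTH / 100;
    memset(fill, '=', BAR_WIDTH);
    fill[BAR_WIDTH] = '\0';
    memset(rest, '-', BAR_WIDTH);
    rest[BAR_WIDTH] = '\0';
    snprintf(out, outsz, "[%.*s%.*s] %3d%%",
             lpad, fill, BAR_WIDTH - lpad, rest, percent);
    return out;
}

/* Run `loop` steps, redrawing only when the percentage changes.
 * Returns the number of redraws. */
int bar_run(int loop, FILE *stream) {
    char line[BAR_WIDTH + 16];
    int last = -1, redraws = 0;
    for (int i = 0; i < loop; ++i) {
        int pct = bar_percent(i, loop);
        if (pct == last) continue;  /* nothing new to show */
        last = pct;
        fprintf(stream, "\r %s", bar_render(line, sizeof line, pct));
        fflush(stream);
        ++redraws;
    }
    fputc('\n', stream);
    return redraws;
}
```

With loop = 10000, as in the report, bar_run(10000, stderr) redraws 101 times
(once per percentage value from 0 to 100) instead of 10000 times.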

Re: [Rd] How to get utf8 string using R externals

2021-06-02 Thread Morgan Morgan

On Wed, 2 Jun 2021, 22:31 Duncan Murdoch,  wrote:

> On 02/06/2021 4:33 p.m., xiaoyan yu wrote:
> > I have a R Script Predict.R:
> >  set.seed(42)
> >  C <- seq(1:1000)
> >  A <- rep(seq(1:200),5)
> >  E <- (seq(1:1000) * (0.8 + (0.4*runif(50, 0, 1
> >  L <- ifelse(runif(1000)>.5,1,0)
> >  df <- data.frame(cbind(C, A, E, L))
> > load("C:/Temp/tree.RData")#  load the model for scoring
> >
> >P <- as.character(predict(tree_model_1,df,type='class'))
> >
> > Then in a C++ program
> > I call eval to evaluate the script and then findVar the P variable.
> > After get each class label from P using string_elt and then
> > Rf_translateChar, the characters are unicodes () instead
> of
> > utf8 encoding of the korean characters 부실.
> > Can I know how to get UTF8 by using R externals?
> >
> > I also found the same script giving utf8 characters in RGui but unicode
> in
> > Rterm.
> > I tried to attach a screenshot but got message "The message's content
> type
> > was not explicitly allowed"
> > In RGui, I saw the output 부실, while in Rterm, .
>
> Sounds like you're using Windows.  Stop doing that.
>
> Duncan Murdoch
>

Could as well say: "Sounds like you are using R. Stop doing that." Start
using Julia. ;-)



> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


[Rd] Question about R developpment

2017-06-10 Thread Morgan

Hi,

I have a question whose answer is not obvious to me.

I was wondering why there is no partnership between Microsoft, the R core
team and possibly other developers to improve R in one unified version,
instead of having different teams developing their own versions of R.

Is it because they don't want to team up? Is it because you don't want to? Any
particular reasons? Different philosophies?

Thank you
Kind regards
Morgan


Re: [Rd] Question about R developpment

2017-06-12 Thread Morgan

Thank you all for these explanations.
Kind regards,
Morgan

On 11 Jun 2017 02:47, "Duncan Murdoch"  wrote:

> On 10/06/2017 6:09 PM, Duncan Murdoch wrote:
>
>> On 10/06/2017 2:38 PM, Morgan wrote:
>>
>>> Hi,
>>>
>>> I have a question whose answer is not obvious to me.
>>>
>>> I was wondering why there is no partnership between Microsoft, the R core
>>> team and possibly other developers to improve R in one unified version,
>>> instead of having different teams developing their own versions of R.
>>>
>>
>> As far as I know, there's only one version of R currently being
>> developed.  Microsoft doesn't offer anything different; they just offer
>> a build of a slightly older version of base R, and a few packages that
>> are not in the base version.
>>
>
> Actually, I think my first sentence above is wrong.  Besides the base R
> that the core R team works on, there are a few other implementations of the
> language:  pqR, for instance.  But as others have said, the Microsoft
> product is simply a repackaging of the core R, so my second sentence is
> right.
>
> Duncan Murdoch
>
>


[Rd] winMenuAdd

2005-11-22 Thread Martin Morgan

The following

winMenuAdd("X")
for (i in 1:20) winMenuAdd(paste("X",i, sep="/"))

generates an (incorrect) error after adding 12 menu items:

Error in winMenuAdd(menuname, NULL, NULL) : 
unable to add menu (base menu does not exist)

More elaborate examples (e.g., adding menu items to each menu) create
other errors (e.g., "Only 16 menus are allowed"), and the original
example (at
https://stat.ethz.ch/pipermail/bioconductor/2005-November/011010.html)
crashes with SIGSEGV in rui.c:1389. I think the basic problem is that
there is a hard-coded limit of 16 menus. The limit is reached in
Bioconductor, as packages add vignettes.

R version 2.2.0, 2005-11-21, i386-pc-mingw32 

attached base packages:
[1] "methods"   "stats" "graphics"  "grDevices" "utils" "datasets" 
[7] "base" 


-- 
Martin Morgan


[Rd] Windows R CMD build leftovers

2005-11-24 Thread Martin Morgan

A command

R CMD build  

that fails, e.g., because of C code compilation errors, leaves a
directory %TMPDIR%/Rinst.xxx containing the file R.css. Although R
CMD INSTALL --build cleans up after itself, build does not. A fix is
below. Also, build.in references Rcmd.exe, which I thought was no
longer necessary?

Index: build.in
===
--- build.in   (revision 36450)
+++ build.in   (working copy)
@@ -434,6 +434,8 @@
if($doit && R_system($cmd)) {
$log->error();
$log->print("Installation failed.\n");
+   $log->print("Removing '$libdir'\n");
+   rmtree($libdir);
exit(1);
}
my $R_LIBS = $ENV{'R_LIBS'};


Re: [Rd] (not just!) Windows R CMD build leftovers

2005-12-01 Thread Martin Morgan

Perhaps this earlier post slipped through the cracks? My apologies if
it's still 'in process', or I missed a response, or if the
contribution isn't helpful.

At any rate, I realized that the problem is not Windows-specific.

Also, generating $libdir by calling (a slightly modified) R_tempfile
might give installation more of a fighting chance in a cluttered TMPDIR.

Index: src/scripts/build.in
===
--- src/scripts/build.in   (revision 36565)
+++ src/scripts/build.in   (working copy)
@@ -76,7 +76,7 @@
 my $R_platform = R_getenv("R_PLATFORM", "unknown-binary");
 my $gzip = R_getenv("R_GZIPCMD", "gzip");
 my $tar = R_getenv("TAR", "tar");
-my $libdir = &file_path(${R::Vars::TMPDIR}, "Rinst.$$");
+my $libdir = R_tempfile("Rinst.");
 
 my $INSTALL_opts = "";
 $INSTALL_opts .= " --use-zip" if $opt_use_zip;
@@ -434,6 +434,8 @@
if($doit && R_system($cmd)) {
$log->error();
$log->print("Installation failed.\n");
+   $log->print("Removing '$libdir'\n");
+   rmtree($libdir);
exit(1);
}
my $R_LIBS = $ENV{'R_LIBS'};
Index: share/perl/R/Utils.pm
===
--- share/perl/R/Utils.pm   (revision 36565)
+++ share/perl/R/Utils.pm   (working copy)
@@ -75,7 +75,7 @@
   $pat . $$ . sprintf("%05d", rand(10**5)));
 
 my $n=0;
-while(-f $retval){
+while(-e $retval){
$retval = file_path($R::Vars::TMPDIR,
$pat . $$ . sprintf("%05d", rand(10**5)));
croak "Cannot find unused name for temporary file"


Martin Morgan <[EMAIL PROTECTED]> writes:

> A command
>
> R CMD build  
>
> that fails, e.g., because of C code compilation errors, leaves a
> directory %TMPDIR%/Rinst.xxx containing the file R.css. Although R
> CMD INSTALL --build cleans up after itself, build does not. A fix is
> below. Also, build.in references Rcmd.exe, which I thought was no
> longer necessary?
>
> Index: build.in
> ===
> --- build.in  (revision 36450)
> +++ build.in  (working copy)
> @@ -434,6 +434,8 @@
>   if($doit && R_system($cmd)) {
>   $log->error();
>   $log->print("Installation failed.\n");
> + $log->print("Removing '$libdir'\n");
> + rmtree($libdir);
>   exit(1);
>   }
>   my $R_LIBS = $ENV{'R_LIBS'};
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] can someone help me understand LAM/MPI and Rmpi for use on a cluster

2006-01-03 Thread Martin Morgan

Here's my slow response; if there were other off-list replies it would
be great to have a summary.

Not exactly sure what you're looking for. You might adapt a parallel
program so that the 'master' node does something like

myprog.c:

  MPI_Init(...)

  /* parallel computations, e.g., of pi */

  if (myid == 0) {
MPI_Comm_get_parent(&parent);
MPI_Send( &pi, 1, MPI_DOUBLE, 0, 0, parent );
  }

  MPI_Finalize()

then in R

  library(Rmpi)
  mpi.comm.spawn("myprog", ...)
  mpi.recv(...)

This launches myprog as a child of the R process, and retrieves the
result via the send/receive exchange between the spawned program and
R. An extension of this would move the mpi.comm.spawn call into a C
function you'd invoke from R with .Call(...).

This could also be developed into a kind of 'shell' package
initializing MPI and then providing parallelized functions and a
light-weight mechanism for their dispatch.

Hope that's helpful and not too misleading.

Martin

"Izmirlian, Grant (NIH/NCI) [E]" <[EMAIL PROTECTED]> writes:

> I'm fairly astute at C and R but new to parallelization. Would someone
> be willing to provide help in the form of a simple example that parallelizes
> an R function from the inside of a C routine?
>
> If so, write me back at [EMAIL PROTECTED]
>
> Thanks!
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] prod(numeric(0)) surprise

2006-01-09 Thread Martin Morgan

I'm a little confused. I understand that numeric(0) means an empty
numeric vector, not the number 0 expressed as numeric. As it is now,
prod(numeric(0)) generates something -- a vector of length 1
containing the number 1 -- from nothing. I would have expected

prod(numeric(0)) ==> numeric(0)

this is consistent with

numeric(0) ==> numeric(0)
numeric(0) * 1 ==> numeric(0)
cumprod(numeric(0)) ==> numeric(0)

and, because concatenation occurs before function evaluation,

prod(c(numeric(0),1)) ==> prod( c(1) ) ==> 1

I would expect sum() to behave the same way, e.g., sum(numeric(0)) ==>
numeric(0). From below,

> >>>> consider exp(sum(log(numeric(0 ... ?)
> >> 
> >> That's a fairly standard mathematical convention, which
> >> is presumably why sum and prod work that way.
> >> 
> >> Duncan Murdoch

I would have expected numeric(0) as the result (numeric(0) is the
result from log(numeric(0)), etc).

Martin (Morgan)


Martin Maechler <[EMAIL PROTECTED]> writes:

>>>>>> "Ben" == Ben Bolker <[EMAIL PROTECTED]>
>>>>>> on Sun, 08 Jan 2006 21:40:05 -0500 writes:
>
> Ben> Duncan Murdoch wrote:
> >> On 1/8/2006 9:24 PM, Ben Bolker wrote:
> >> 
> >>> It surprised me that prod(numeric(0)) is 1.  I guess if
> >>> you say (operation(nothing) == identity element) this
> >>> makes sense, but ??
> >> 
> >> 
> >> What value were you expecting, or were you expecting an
> >> error?  I can't think how any other value could be
> >> justified, and throwing an error would make a lot of
> >> formulas more complicated.
> >> 
> >>>
> >> 
> >>>> consider exp(sum(log(numeric(0 ... ?)
> >> 
> >> That's a fairly standard mathematical convention, which
> >> is presumably why sum and prod work that way.
> >> 
> >> Duncan Murdoch
>
> Ben>OK.  I guess I was expecting NaN/NA (as opposed to
> Ben> an error), but I take the "this makes everything else
> Ben> more complicated" point.  Should this be documented or
> Ben> is it just too obvious ... ?  (Funny -- I'm willing to
> Ben> take gamma(1)==1 without any argument or suggestion
> Ben> that it should be documented ...)
>
> see?  so it looks to me as if you have finally convinced
> yourself that '1' is the most reasonable result.. ;-)
>
> Anyway, I've added a sentence to help(prod)  {which matches
> the sentence in help(sum), BTW}.
>
> Martin
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

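
The convention discussed in this thread can be checked with a small fold
written in plain C (vec_sum and vec_prod are hypothetical helpers, not R's
implementation). Starting the accumulator at the identity element means an
empty input gives 0 for a sum and 1 for a product, and it keeps
prod(c(a, b)) equal to prod(a) * prod(b) even when a is empty.

```c
#include <stddef.h>

/* Sum of x[0..n-1]; an empty input yields the additive identity 0. */
double vec_sum(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) acc += x[i];
    return acc;
}

/* Product of x[0..n-1]; an empty input yields the multiplicative identity 1. */
double vec_prod(const double *x, size_t n) {
    double acc = 1.0;
    for (size_t i = 0; i < n; ++i) acc *= x[i];
    return acc;
}
```

For x = {2, 3, 4}, splitting the vector at any position k from 0 to 3 and
multiplying the two partial products gives 24 every time, including the
splits where one side is empty. Returning an empty result for the empty side
would break that rule.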

Re: [Rd] prod(numeric(0)) surprise

2006-01-09 Thread Martin Morgan

I guess I have to say yes, I'd expect

x <- 1:10
sum(x[x>10]) ==> numeric(0)

this would be reinforced by recognizing that numeric(0) is not zero,
but nothing. I guess the summation over an empty set is an empty set,
rather than a set containing the number 0. Certainly these

exp(x[x>10]) ==> numeric(0)
numeric(0) + 1 ==> numeric(0)

would give me pause.


Gabor Grothendieck <[EMAIL PROTECTED]> writes:

> The way to think about it is:
>
>prod(rep(x,n)) == x^n
>
> and that works for n=0 too.

Hmm, Not sure what to put in for x and n? do you mean x == numeric(0),
n == 0 (0 copies of an empty set), x == ANY n == numeric(0) (an empty
set of ANYthing), x == numeric(0), n == numeric(0) ? For all of these,
x^n evaluates to numeric(0).

Martin (Morgan)
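[Editorial sketch, not part of the original exchange.] One way to read the
identity-element convention discussed in this thread is that splitting a
vector into two pieces, one of which may be empty, should never change the
total. The snippet below checks that reading, and Gabor's identity with a
scalar x, using only base R; the variable names are illustrative.

```r
# Empty sum and empty product return the identity element of the operation.
empty <- numeric(0)
empty_sum  <- sum(empty)    # additive identity
empty_prod <- prod(empty)   # multiplicative identity

# Splitting a vector into two pieces must not change the product,
# even when one of the pieces is empty.
a <- c(2, 3, 4)
split_ok <- prod(c(a, empty)) == prod(a) * prod(empty)

# Gabor's identity, reading x as a scalar and n as a count (n = 0 included).
x <- 3
identity_ok <- vapply(0:4, function(n) prod(rep(x, n)) == x^n, logical(1))
```

If prod(numeric(0)) returned numeric(0) instead, prod(a) * prod(empty) would
itself be numeric(0), and the split check above could not hold.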

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> On 1/9/2006 12:40 PM, Martin Morgan wrote:
>> I'm a little confused. I understand that numeric(0) means an empty
>> numeric vector, not the number 0 expressed as numeric. As it is now,
>> prod(numeric(0)) generates something -- a vector of length 1
>> containing the number 1 -- from nothing. I would have expected
>> prod(numeric(0)) ==> numeric(0)
>> this is consistent with
>> numeric(0) ==> numeric(0)
>> numeric(0) * 1 ==> numeric(0)
>> cumprod(numeric(0)) ==> numeric(0)
>> and, because concatenation occurs before function evaluation,
>> prod(c(numeric(0),1)) ==> prod( c(1) ) ==> 1
>> I would expect sum() to behave the same way, e.g., sum(numeric(0))
>> ==>
>> numeric(0). From below,
>>
>
> I think the code below works as I'd expect.  Would you really like the
> last answer to be numeric(0)?
>
>  > x <- 1:10
>  > sum(x)
> [1] 55
>  > sum(x[x>5])
> [1] 40
>  > sum(x[x>10])
> [1] 0
>
> Duncan Murdoch
>
>>> >>>> consider exp(sum(log(numeric(0 ... ?)
>>> >>     >> That's a fairly standard mathematical convention,
>>> which
>>> >> is presumably why sum and prod work that way.
>>> >> >> Duncan Murdoch
>> I would have expected numeric(0) as the result (numeric(0) is the
>> result from log(numeric(0)), etc).
>> Martin (Morgan)
>> Martin Maechler <[EMAIL PROTECTED]> writes:
>>
>>>>>>>> "Ben" == Ben Bolker <[EMAIL PROTECTED]>
>>>>>>>> on Sun, 08 Jan 2006 21:40:05 -0500 writes:
>>>
>>> Ben> Duncan Murdoch wrote:
>>> >> On 1/8/2006 9:24 PM, Ben Bolker wrote:
>>> >> >>> It surprised me that prod(numeric(0)) is 1.  I guess
>>> if
>>> >>> you say (operation(nothing) == identity element) this
>>> >>> makes sense, but ??
>>> >> >> >> What value were you expecting, or were you
>>> expecting an
>>> >> error?  I can't think how any other value could be
>>> >> justified, and throwing an error would make a lot of
>>> >> formulas more complicated.
>>> >> >>>
>>> >> >>>> consider exp(sum(log(numeric(0 ... ?)
>>> >> >> That's a fairly standard mathematical convention,
>>> which
>>> >> is presumably why sum and prod work that way.
>>> >> >> Duncan Murdoch
>>>
>>> Ben>OK.  I guess I was expecting NaN/NA (as opposed to
>>> Ben> an error), but I take the "this makes everything else
>>> Ben> more complicated" point.  Should this be documented or
>>> Ben> is it just too obvious ... ?  (Funny -- I'm willing to
>>> Ben> take gamma(1)==1 without any argument or suggestion
>>> Ben> that it should be documented ...)
>>>
>>> see?  so it looks to me as if you have finally convinced
>>> yourself that '1' is the most reasonable result.. ;-)
>>>
>>> Anyway, I've added a sentence to help(prod)  {which matches
>>> the sentence in help(sum), BTW}.
>>>
>>> Martin
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] prod(numeric(0)) surprise

2006-01-09 Thread Martin Morgan

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> On 1/9/2006 1:27 PM, Liaw, Andy wrote:
>> If you haven't seen this in your math courses, perhaps this would help:
>> http://en.wikipedia.org/wiki/Empty_set
>>
>
> This is what is so great about Wikipedia:  it gives certainty where
> I'd only call it a fairly standard convention.  ;-)
>
> Duncan Murdoch

Yes, thanks for the refresher and sorry for the noise. Martin

>> which says, in part:
>> Operations on the empty set
>> Operations performed on the empty set (as a set of things to be
>> operated
>> upon) can also be confusing. (Such operations are nullary operations.) For
>> example, the sum of the elements of the empty set is zero, but the product
>> of the elements of the empty set is one (see empty product). This may seem
>> odd, since there are no elements of the empty set, so how could it matter
>> whether they are added or multiplied (since "they" do not exist)?
>> Ultimately, the results of these operations say more about the operation in
>> question than about the empty set. For instance, notice that zero is the
>> identity element for addition, and one is the identity element for
>> multiplication.
>> Andy
>> From: Martin Morgan
>>> I guess I have to say yes, I'd expect
>>> x <- 1:10
>>> sum(x[x>10]) ==> numeric(0)
>>> this would be reinforced by recognizing that numeric(0) is not
>>> zero,
>>> but nothing. I guess the summation over an empty set is an empty set,
>>> rather than a set containing the number 0. Certainly these
>>> exp(x[x>10]) ==> numeric(0)
>>> numeric(0) + 1 ==> numeric(0)
>>> would give me pause.
>>> Gabor Grothendieck <[EMAIL PROTECTED]> writes:
>>> > The way to think about it is:
>>> >
>>> >prod(rep(x,n)) == x^n
>>> >
>>> > and that works for n=0 too.
>>> Hmm, Not sure what to put in for x and n? do you mean x ==
>>> numeric(0),
>>> n == 0 (0 copies of an empty set), x == ANY n == numeric(0) (an empty
>>> set of ANYthing), x == numeric(0), n == numeric(0) ? For all of these,
>>> x^n evaluates to numeric(0).
>>> Martin (Morgan)
>>> Duncan Murdoch <[EMAIL PROTECTED]> writes:
>>> > On 1/9/2006 12:40 PM, Martin Morgan wrote:
>>> >> I'm a little confused. I understand that numeric(0) means an empty
>>> >> numeric vector, not the number 0 expressed as numeric. As it is
>>> now,
>>> >> prod(numeric(0)) generates something -- a vector of length 1
>>> >> containing the number 1 -- from nothing. I would have expected
>>> >> prod(numeric(0)) ==> numeric(0)
>>> >> this is consistent with
>>> >> numeric(0) ==> numeric(0)
>>> >> numeric(0) * 1 ==> numeric(0)
>>> >> cumprod(numeric(0)) ==> numeric(0)
>>> >> and, because concatenation occurs before function evaluation,
>>> >> prod(c(numeric(0),1)) ==> prod( c(1) ) ==> 1
>>> >> I would expect sum() to behave the same way, e.g., sum(numeric(0))
>>> >> ==>
>>> >> numeric(0). From below,
>>> >>
>>> >
>>> > I think the code below works as I'd expect.  Would you really
>>> like the
>>> > last answer to be numeric(0)?
>>> >
>>> >  > x <- 1:10
>>> >  > sum(x)
>>> > [1] 55
>>> >  > sum(x[x>5])
>>> > [1] 40
>>> >  > sum(x[x>10])
>>> > [1] 0
>>> >
>>> > Duncan Murdoch
>>> >
>>> >>> >>>> consider exp(sum(log(numeric(0 ... ?)
>>> >>> >> >> That's a fairly standard mathematical convention,
>>> >>> which
>>> >>> >> is presumably why sum and prod work that way.
>>> >>> >> >> Duncan Murdoch
>>> >> I would have expected numeric(0) as the result (numeric(0) is the
>>> >> result from log(numeric(0)), etc).
>>> >> Martin (Morgan)
>>> >> Martin Maechler <[EMAIL PROTECTED]> writes:
>>> >>
>>> >>>>>>>> "Ben" == Ben Bolker <[EMAIL PROTECTED]>
>>> >>>>>>>> on Sun, 08 Jan 2006 21:40:05 -0500 writes:
>>> >>>
>>> >>> Ben> Duncan Murdoch wrote:
>>> >>> >>

Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally

2018-01-22 Thread Martin Morgan


On 01/22/2018 08:40 AM, Ulrich Bodenhofer wrote:
Thanks a lot, Iñaki, this is a perfect solution! I already implemented 
it and it works great. I'll wait for 2 more days before I submit the 
revised package to CRAN - in order to give others a chance to comment on it.


It's very easy for 'pictures of code' (unevaluated code chunks in 
vignettes) to drift from the actual implementation. So I'd really 
encourage your conditional evaluation to be as narrow as possible -- 
during CRAN or even CRAN fedora checks. Certainly trying to use 
uninstalled Suggest'ed packages in vignettes should provide an error 
message that is informative to users. Presumably the developer or user 
intends actually to execute the code, and needs to struggle through 
whatever issues come up. I'm not sure whether my comments are consistent 
with Writing R Extensions or not.
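[Editorial sketch, not part of the original message.] A narrow form of
conditional evaluation, with an informative message when a suggested package
is missing, might look like the following. The helper names
suggests_available() and run_if_available() are hypothetical; only
requireNamespace() is an existing base R function.

```r
# Hypothetical helper: TRUE only if every named package can be loaded.
suggests_available <- function(pkgs) {
  all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))
}

# Hypothetical helper: evaluate `code` only when the packages are present,
# otherwise tell the user what is missing instead of failing obscurely.
run_if_available <- function(pkgs, code) {
  if (suggests_available(pkgs)) {
    code  # lazily evaluated argument, forced only on this branch
  } else {
    message("Skipping: install ", paste(pkgs, collapse = ", "),
            " to run this code")
    invisible(NULL)
  }
}

# 'stats' ships with R, so this branch is evaluated.
result <- run_if_available("stats", stats::median(c(1, 5, 9)))

# In a knitr vignette the same test could be used as a per-chunk option,
# e.g. eval = suggests_available("kebabs"), so that only the chunks that
# really need the package are skipped.
```

Keeping the guard per chunk, rather than switching off evaluation for the
whole vignette, limits how much code can drift from the implementation.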


There is a fundamental tension between the CRAN and Bioconductor release 
models. The Bioconductor 'devel' package repositories and nightly builds 
are meant to be a place where new features and breaking changes can be 
introduced and problems resolved before being exposed to general users 
as a stable 'release' branch, once every six months. This means that the 
Bioconductor devel branch periodically (as it has recently, and I suspect 
will over the next several days) contains considerable carnage that 
propagates to CRAN devel builds, creating additional work for CRAN maintainers.


Martin Morgan
Bioconductor



Best regards,
Ulrich


On 01/22/2018 10:16 AM, Iñaki Úcar wrote:
Re-sending, since I forgot to include the list, sorry. I'm including 
r-package-devel too this time, as it seems more appropriate for this 
list.



On 22 Jan 2018 10:11, "Iñaki Úcar" <mailto:i.uca...@gmail.com>> wrote:




    On 22 Jan 2018 8:12, "Ulrich Bodenhofer"
    mailto:bodenho...@bioinf.jku.at>> wrote:


    Dear colleagues, dear members of the R Core Team,

    This was an issue raised by Prof. Brian Ripley and sent
    privately to all developers of CRAN packages that suggest
    Bioconductor packages (see original message below). As
    mentioned in my message enclosed below, it was easy for me to
    fix the error in examples (new version not submitted to CRAN
    yet), but it might turn into a major effort for the warnings
    raised by the package vignette. Since I have not gotten any
    advice yet, I take the liberty to post it here on this list -
    hoping that we reach a conclusion here how to deal with this
    matter.


    Just disable code chunk evaluation if suggested packages are
    missing (see [1]). As explained by Prof. Ripley, it will only
    affect Fedora checks on r-devel, i.e., your users will still see
    fully evaluated vignettes on CRAN.

    [1] https://www.enchufa2.es/archives/suggests-and-vignettes.html
    <https://www.enchufa2.es/archives/suggests-and-vignettes.html>

    Iñaki


    Thanks in advance for your kind assistance,
    Ulrich Bodenhofer



     Forwarded Message 
    Subject:        Re: CRAN packages not using Suggests conditionally
    Date:   Mon, 15 Jan 2018 08:44:40 +0100
    From:   Ulrich Bodenhofer mailto:bodenho...@bioinf.jku.at>>
    To:     Prof Brian Ripley mailto:rip...@stats.ox.ac.uk>>
    CC:     [...stripped for the sake of privacy ...]



    Dear Prof. Ripley,

    Thank you very much for bringing this important issue to my attention. I
    am the maintainer of the 'apcluster' package. My package refers to
    'Biostrings' in an example section of a help page (a quite insignificant
    one, by the way), which creates errors on some platforms. It also refers
    to 'kebabs' in the package vignette, which leads to warnings.

    I could fix the first, more severe, problem quite easily, (1) since it
    is relatively easy to wrap an entire examples section in a conditional,
    and (2), as I have mentioned, it is not a particularly important help
    page.

    Regarding the vignette, I want to ask for your advice now, since the
    situation appears more complicated to me. While it is, of course, only
    one code chunk that loads the 'kebabs' package, five more code chunks
    depend on the package (more specifically, the data objects created by a
    method implemented in the package) - with quite some text in between. So
    the handling of the conditional loading of the package would propagate
    to multiple code chunks and also affect the validity of the explanations
    in between. I would see the following options:

    1. Remove the entire section of the vignette. That would be a
    pity,
    sin

Re: [Rd] Why R should never move to git

2018-01-25 Thread Martin Morgan


On 01/25/2018 07:09 AM, Duncan Murdoch wrote:

On 25/01/2018 6:49 AM, Dirk Eddelbuettel wrote:


On 25 January 2018 at 06:20, Duncan Murdoch wrote:
| On 25/01/2018 2:57 AM, Iñaki Úcar wrote:
| > For what it's worth, this is my workflow:
| >
| > 1. Get a fork.
| > 2. From the master branch, create a new branch called fix-[something].
| > 3. Put together the stuff there, commit, push and open a PR.
| > 4. Checkout master and repeat from 2 to submit another patch.
| >
| > Sometimes, I forget the step of creating the new branch and I put my
| > fix on top of the master branch, which complicates things a bit. But
| > you can always rename your fork's master and pull it again from
| > upstream.
|
| I saw no way to follow your renaming suggestion.  Can you tell me the
| steps it would take?  Remember, there's already a PR from the master
| branch on my fork.  (This is for future reference; I already followed
| Gabor's more complicated instructions and have solved the immediate
| problem.)

1)  Via GUI: fork or clone at github so that you have URL to use in 2)


Github would not allow me to fork, because I already had a fork of the 
same repository.  I suppose I could have set up a new user and done it.


I don't know if cloning the original would have made a difference. I 
don't have permission to commit to the original, and the 
manipulateWidget maintainers wouldn't be able to see my private clone, 
so I don't see how I could create a PR that they could use.


Once again, let me repeat:  this should be an easy thing to do.  So far 
I'm pretty convinced that it's actually impossible to do it on the 
Github website without hacks like creating a new user.  It's not trivial 
but not that difficult for a git expert using command line git.


If R Core chose to switch the R sources to use git and used Github to 
host a copy, problems like mine would come up fairly regularly.  I don't 
think R Core would gain enough from the switch to compensate for the 
burden of dealing with these problems.


A different starting point gives R-core members write access to the 
R-core git, which is analogous to the current svn setup. A restricted 
set of commands are needed, mimicking svn


  git clone ...            # svn co
  git pull                 # svn up
  [...; git commit ...]
  git push ...             # svn ci

Probably this would mature quickly into a better practice where new 
features / bug fixes are developed on a local branch.


A subset of R-core might participate in managing pull requests on a 
'read only' Github mirror. Incorporating mature patches would involve 
git, rather than the Github GUI. In one's local repository, create a new 
branch and pull from the repository making the request


  git checkout -b a-pull-request master
  git pull https://github.com/a-user/their.git their-branch

Check and modify, then merge locally and push to the R-core git

  ## identify standard / best practice for merging branches
  git checkout master
  git merge ... a-pull-request
  git push ...

Creating pull requests is a problem for the developer wanting to 
contribute to R, not for the R-core developer. As we've seen in this 
thread, R-core would not need to feel responsible for helping developers 
create pull requests.


Martin Morgan



Maybe Gitlab or some other front end would be better.

Duncan Murdoch



2)  Run
   git clone giturl
 to fetch local instance
3)  Run
   git checkout -b feature/new_thing_a
 (this is 2. above by Inaki)
4)  Edit, save, compile, test, revise, ... leading to 1 or more commits

5)  Run
   git push origin
 standard configuration should have remote branch follow local branch, I
 think the "long form" is
   git push --set-upstream origin feature/new_thing_a

6)  Run
   git checkout -
 or
   git checkout master
 and you are back in master. Now you can restart at my 3) above for
 branches b, c, d and create independent pull requests

I find it really helpful to have a bash prompt that shows the branch:

 edd@rob:~$ cd git/rcpp
 edd@rob:~/git/rcpp(master)$ git checkout -b feature/new_branch_to_show
 Switched to a new branch 'feature/new_branch_to_show'
 edd@rob:~/git/rcpp(feature/new_branch_to_show)$ git checkout -
 Switched to branch 'master'
 Your branch is up-to-date with 'origin/master'.
 edd@rob:~/git/rcpp(master)$ git branch -d feature/new_branch_to_show
 Deleted branch feature/new_branch_to_show (was 5b25fe62).
 edd@rob:~/git/rcpp(master)$

There are a few tutorials out there about how to do it; I once got mine from
Karthik when we did a Software Carpentry workshop.  Happy to detail off-list;
it adds less than 10 lines to ~/.bashrc.

Dirk

|
| Duncan Murdoch
|
| > Iñaki
| >
| >
| >
| > 2018-01-25 0:17 GMT+01:00 Duncan Murdoch :
| >> Lately I've been doing so

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan




On 05/02/2018 03:21 PM, Joris Meys wrote:

Dear all,

I've noticed by trying to download gz files from here :
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907811

At the bottom one can download GSM907811.CEL.gz . If I download this
manually and try

oligo::read.celfiles("GSM907811.CEL.gz")

everything works fine. (oligo is a Bioconductor package)

However, if I download using

download.file("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907811&format=file&file=GSM907811%2ECEL%2Egz",
   destfile = "GSM907811.CEL.gz")


On Windows, the 'mode' argument to download.file() needs to be "wb" 
(write binary) for binary files.
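[Editorial sketch, not part of the original message.] The advice above can be
tried without network access by letting a local file:// URL stand in for the
remote server; the payload and file names below are made up for illustration.
With mode = "wb" the downloaded bytes should match the source byte for byte on
every platform.

```r
# A small binary payload that happens to contain newline bytes (0x0a).
payload <- as.raw(c(0x1f, 0x8b, 0x0a, 0x00, 0x0a, 0xff))
src <- tempfile(fileext = ".bin")
writeBin(payload, src)

# Assumption: a file:// URL is used here in place of the NCBI link.
src_url <- paste0(
  if (.Platform$OS.type == "windows") "file:///" else "file://",
  normalizePath(src, winslash = "/")
)

# mode = "wb" asks for the destination to be written in binary mode.
dest <- tempfile(fileext = ".bin")
status <- download.file(src_url, destfile = dest, mode = "wb", quiet = TRUE)

# Read the result back as raw bytes for comparison with the payload.
downloaded <- readBin(dest, what = "raw", n = file.size(dest))
```

For the CEL.gz file in this thread, the equivalent change is to add
mode = "wb" to the existing download.file() call.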


Martin



The file is downloaded, but oligo::read.celfiles() returns the following
error:

Error in checkChipTypes(filenames, verbose, "affymetrix", TRUE) :
   End of gz file reached unexpectedly. Perhaps this file is truncated.

Moreover, if I try to delete it after using download.file(), I get a
warning that permission is denied. I can only remove it using Windows file
explorer after I closed the R session, indicating that the connection is
still open. Yet, showConnections() doesn't show any open connections either.

Session info below. Note that I started from a completely fresh R session.
oligo is needed due to the specific file format of these gz files. They're
not standard tarred files.

Cheers
Joris

Session Info
-

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
 [1] pd.hugene.1.0.st.v1_3.14.1 DBI_0.8                    oligo_1.44.0
 [4] Biobase_2.39.2             oligoClasses_1.42.0        RSQLite_2.1.0
 [7] Biostrings_2.48.0          XVector_0.19.9             IRanges_2.13.28
[10] S4Vectors_0.17.42          BiocGenerics_0.25.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16                compiler_3.5.0
 [3] BiocInstaller_1.30.0        GenomeInfoDb_1.15.5
 [5] bitops_1.0-6                iterators_1.0.9
 [7] tools_3.5.0                 zlibbioc_1.25.0
 [9] digest_0.6.15               bit_1.1-12
[11] memoise_1.1.0               preprocessCore_1.41.0
[13] lattice_0.20-35             ff_2.2-13
[15] pkgconfig_2.0.1             Matrix_1.2-14
[17] foreach_1.4.4               DelayedArray_0.5.31
[19] yaml_2.1.18                 GenomeInfoDbData_1.1.0
[21] affxparser_1.52.0           bit64_0.9-7
[23] grid_3.5.0                  BiocParallel_1.13.3
[25] blob_1.1.1                  codetools_0.2-15
[27] matrixStats_0.53.1          GenomicRanges_1.31.23
[29] splines_3.5.0               SummarizedExperiment_1.9.17
[31] RCurl_1.95-4.10             affyio_1.49.2





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] download.file does not process gz files correctly (truncates them?)

2018-05-03 Thread Martin Morgan



On 05/03/2018 05:48 AM, Joris Meys wrote:

Dear all,

I've been diving a bit deeper into this per request of Tomas Kalibera, and
found the following :

- the lock on the file appears only after trying to read it using oligo, so
that's not an R problem in itself. The download problem is independent of
external packages.

- using Windows' fc utility and cygwin's cmp utility I found out that every
so often the download.file() function inserts an extra byte. There's no
real obvious pattern in how these bytes are added, but the file downloaded
using download.file() is actually larger (in this case by about 8 kb). The
file xxx_inR.CEL.gz is read in using:


I believe the difference between mode = "w" and "wb", and the reason this is 
restricted to Windows downloads, comes down to text file line endings: 
with mode="w", download.file (and many other utilities outside R) writes 
"foo\n" as "foo\r\n". Obviously this messes up binary files.


I guess in the CEL.gz file there are about 8k "\n" characters.

Henrik's suggestion (default = "wb") would introduce the complementary 
problem -- text files would have incorrect line endings.
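[Editorial sketch, not part of the original message.] The explanation above
predicts that a text-mode write grows the file by exactly one byte per "\n".
The function below simulates that translation on a raw vector; to_crlf() is a
hypothetical helper written for this illustration, not something
download.file() exposes.

```r
# Simulate text-mode output on Windows: each LF (0x0a) becomes CR LF (0x0d 0x0a).
to_crlf <- function(bytes) {
  is_lf <- bytes == as.raw(0x0a)
  out <- raw(length(bytes) + sum(is_lf))
  # Each original byte moves right by the number of LF bytes seen so far
  # (including itself), which leaves a free slot just before every LF.
  pos <- seq_along(bytes) + cumsum(is_lf)
  out[pos] <- bytes
  out[pos[is_lf] - 1L] <- as.raw(0x0d)
  out
}

# Six bytes, three of which are LF.
original   <- as.raw(c(0x41, 0x0a, 0x42, 0x0a, 0x0a, 0x43))
translated <- to_crlf(original)

# The growth in size equals the number of LF bytes in the input.
growth   <- length(translated) - length(original)
lf_count <- sum(original == as.raw(0x0a))
```

Under this reading, the 8,121-byte difference between the two CEL.gz files
reported in this thread would correspond to 8,121 LF bytes in the original
file.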


Martin





setwd("E:/Temp/genexpr/Compare")
id <- "GSM907854"
flink <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSM907854&format=file&file=GSM907854%2ECEL%2Egz")
fname <- paste0(id,"_inR.CEL.gz")
download.file(flink,
   destfile = fname)

The file xxx_direct.CEL.gz is downloaded from
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM907854 (download link
at the bottom of the page).

Output of dir in CMD:

05/03/2018  11:02 AM 4,529,547 GSM907854_direct.CEL.gz
05/03/2018  11:17 AM 4,537,668 GSM907854_inR.CEL.gz

or from R :


diff(file.size(dir())) # contains both CEL files.

[1] 8121

Strangely enough I get the following message from download.file() :

Content type 'application/octet-stream' length 4529547 bytes (4.3 MB)
downloaded 4.3 MB

So the reported length is exactly the same as if I would download the file
directly, but the file on disk itself is larger. So it seems
download.file() is adding bytes when saving the data on disk.  This
behaviour is independent of antivirus and/or firewalls turned on or off.

Also keep in mind that these are NOT standard gzipped files. These files
are a specific format for Affymetrix Human Gene 1.0 ST Arrays.

If I need to run other tests, please let me know.
Kind regards

Joris


Re: [Rd] length of `...`

2018-05-03 Thread Martin Morgan


nargs() provides the number of arguments without evaluating them

> f = function(x, ..., y) nargs()
> f()
[1] 0
> f(a=1, b=2)
[1] 2
> f(1, a=1, b=2)
[1] 3
> f(x=1, a=1, b=2)
[1] 3
> f(stop())
[1] 1
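[Editorial sketch, not part of the original message.] Note that nargs() counts
every supplied argument, including those matched to named formals such as x,
whereas ...length() (R >= 3.5.0) counts only what ends up in `...`. Neither
evaluates the arguments. The function names below are illustrative.

```r
count_all  <- function(x, ..., y) nargs()       # every supplied argument
count_dots <- function(x, ..., y) ...length()   # only those matched to ...

# The first unnamed argument is matched to x, so the two counts differ by one.
n_all  <- count_all(1, a = 1, b = 2)
n_dots <- count_dots(1, a = 1, b = 2)

# Nothing is evaluated in either case, so stop() is never triggered.
n_lazy_all  <- count_all(stop("not evaluated"))
n_lazy_dots <- count_dots(a = stop("not evaluated"), b = stop("not evaluated"))
```

So nargs() answers the question only for functions whose sole formal is `...`,
or after subtracting the named arguments that were supplied.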


On 05/03/2018 11:01 AM, William Dunlap via R-devel wrote:

In R-3.5.0 you can use ...length():
   > f <- function(..., n) ...length()
   > f(stop("one"), stop("two"), stop("three"), n=7)
   [1] 3

Prior to that substitute() is the way to go
   > g <- function(..., n) length(substitute(...()))
   > g(stop("one"), stop("two"), stop("three"), n=7)
   [1] 3

R-3.5.0 also has the ...elt(n) function, which returns
the evaluated n'th entry in ... , without evaluating the
other ... entries.
   > fn <- function(..., n) ...elt(n)
   > fn(stop("one"), 3*5, stop("three"), n=2)
   [1] 15

Prior to 3.5.0, eval the appropriate component of the output
of substitute() in the appropriate environment:
   > gn <- function(..., n) {
   +   nthExpr <- substitute(...())[[n]]
   +   eval(nthExpr, envir=parent.frame())
   + }
   > gn(stop("one"), environment(), stop("two"), n=2)
   <environment: R_GlobalEnv>




Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, May 3, 2018 at 7:29 AM, Dénes Tóth  wrote:


Hi,


In some cases the number of arguments passed as ... must be determined
inside a function, without evaluating the arguments themselves. I use the
following construct:

dotlength <- function(...) length(substitute(expression(...))) - 1L

# Usage (returns 3):
dotlength(1, 4, something = undefined)

How can I define a method for length() which could be called directly on
`...`? Or is it an intention to extend the base length() function to accept
ellipses?


Regards,
Denes

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] Compiler + stopifnot bug

2019-01-03 Thread Martin Morgan

For what it's worth this also introduced

> df = data.frame(v = package_version("1.2"))
> rbind(df, df)$v
 [[1]]
 [1] 1 2

 [[2]]
 [1] 1 2

instead of

> rbind(df, df)$v
[1] '1.2' '1.2'

which shows up in Travis builds of Bioconductor packages

  https://stat.ethz.ch/pipermail/bioc-devel/2019-January/014506.html

and elsewhere
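[Editorial sketch, not part of the original message.] A small regression check
for the behaviour shown above could look like this; it assumes an R build in
which rbind() keeps the package_version class, as in the second output.

```r
# A data frame with a single package_version column, as in the report above.
df <- data.frame(v = package_version("1.2"))
combined <- rbind(df, df)$v

# In an unaffected R build the column keeps its class and both versions.
is_version <- inherits(combined, "package_version")
as_text <- unname(as.character(combined))
```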

Martin Morgan

On 1/3/19, 7:05 PM, "R-devel on behalf of Duncan Murdoch" 
 wrote:

On 03/01/2019 3:37 p.m., Duncan Murdoch wrote:
> I see this too; by bisection, it seems to have first appeared in r72943.

Sorry, that was a typo.  I meant r75943.

Duncan Murdoch

> 
> Duncan Murdoch
> 
> On 03/01/2019 2:18 p.m., Iñaki Ucar wrote:
>> Hi,
>>
>> I found the following issue in r-devel (2019-01-02 r75945):
>>
>> `foo<-` <- function(x, value) {
>> bar(x) <- value * x
>> x
>> }
>>
>> `bar<-` <- function(x, value) {
>> stopifnot(all(value / x == 1))
>> x + value
>> }
>>
>> `foo<-` <- compiler::cmpfun(`foo<-`)
>> `bar<-` <- compiler::cmpfun(`bar<-`)
>>
>> x <- c(2, 2)
>> foo(x) <- 1
>> x # should be c(4, 4)
>> #> [1] 3 3
>>
>> If the functions are not compiled or the stopifnot call is removed,
>> the snippet works correctly. So it seems that something is messing
>> around with the references to "value" when the call to stopifnot gets
>> compiled, and the wrong "value" is modified. Note also that if "x <-
>> 2", then the result is correct, 4.
>>
>> Regards,
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] defining r audio connections

2020-05-06 Thread Martin Morgan

The public connection API is defined in

https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h

I'm not sure of a good pedagogic example; people who want to write their own 
connections usually want to do so for complicated reasons!

This is my own abandoned attempt 
https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192
 where connection_local_client() is called from R and _connection_local() 
creates and populates the appropriate structure. Probably I have done things 
totally wrong (e.g., by not checking the version of the API, as advised in the 
header file!)

Martin Morgan

On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" 
 wrote:

On 06/05/2020 1:09 p.m., frede...@ofb.net wrote:
> Dear R Devel,
> 
> Since Linux moved away from using a file-system interface for audio, I 
think it is necessary to write special libraries to interface with audio 
hardware from various languages on Linux.
> 
> In R, it seems like the appropriate datatype for a `snd_pcm_t` handle 
pointing to an open ALSA source or sink would be a "connection". Connection 
types are already defined in R for "file", "url", "pipe", "fifo", 
"socketConnection", etc.
> 
> Is there a tutorial or an example package where a new type of connection 
is defined, so that I can see how to do this properly in a package?
> 
> I can see from the R source that, for example, `do_gzfile` is defined in 
`connections.c` and referenced in `names.c`. However, I thought I should ask 
here first in case there is a better place to start, than trying to copy this 
code.
> 
> I only want an object that I can use `readBin` and `writeBin` on, to read 
and write audio data using e.g. `snd_pcm_writei` which is part of the 
`alsa-lib` package.

I don't think R supports user-defined connections, but probably writing 
readBin and writeBin equivalents specific to your library wouldn't be 
any harder than creating a connection.  For those, you will probably 
want to work with an "external pointer" (see Writing R Extensions). 
Rcpp probably has support for these if you're working in C++.

Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] defining r audio connections

2020-05-06 Thread Martin Morgan

Yep, you're right: after some initial clean-up, R CMD check (with or without 
--as-cran) gives a NOTE

  *  checking compiled code
  File ‘socketeer/libs/socketeer.so’:
Found non-API calls to R: ‘R_GetConnection’,
   ‘R_new_custom_connection’
   
  Compiled code should not call non-API entry points in R.
   
  See 'Writing portable packages' in the 'Writing R Extensions' manual.

Connections in general seem more useful than ad-hoc functions, though perhaps 
for Frederick's use case Duncan's suggestion is sufficient. For non-CRAN 
packages I personally would implement a connection.

(I mistakenly thought this was a more specialized mailing list; I wouldn't have 
posted to R-devel on this topic otherwise)

Martin Morgan

On 5/6/20, 4:12 PM, "Gábor Csárdi"  wrote:

AFAIK that API is not allowed on CRAN. It triggers a NOTE or a
WARNING, and your package will not be published.

    Gabor

On Wed, May 6, 2020 at 9:04 PM Martin Morgan  
wrote:
>
> The public connection API is defined in
>
> https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h
>
> I'm not sure of a good pedagogic example; people who want to write their
> own connections usually want to do so for complicated reasons!
>
> This is my own abandoned attempt
> https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192
> where connection_local_client() is called from R and _connection_local()
> creates and populates the appropriate structure. Probably I have done things
> totally wrong (e.g., by not checking the version of the API, as advised in the
> header file!)
>
> Martin Morgan
>
> On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" wrote:
>
> On 06/05/2020 1:09 p.m., frede...@ofb.net wrote:
> > Dear R Devel,
> >
> > Since Linux moved away from using a file-system interface for
> > audio, I think it is necessary to write special libraries to interface with
> > audio hardware from various languages on Linux.
> >
> > In R, it seems like the appropriate datatype for a `snd_pcm_t`
> > handle pointing to an open ALSA source or sink would be a "connection".
> > Connection types are already defined in R for "file", "url", "pipe", "fifo",
> > "socketConnection", etc.
> >
> > Is there a tutorial or an example package where a new type of
> > connection is defined, so that I can see how to do this properly in a package?
> >
> > I can see from the R source that, for example, `do_gzfile` is
> > defined in `connections.c` and referenced in `names.c`. However, I thought I
> > should ask here first in case there is a better place to start, than trying to
> > copy this code.
> >
> > I only want an object that I can use `readBin` and `writeBin` on,
> > to read and write audio data using e.g. `snd_pcm_writei` which is part of the
> > `alsa-lib` package.
>
> I don't think R supports user-defined connections, but probably writing
> readBin and writeBin equivalents specific to your library wouldn't be
> any harder than creating a connection.  For those, you will probably
> want to work with an "external pointer" (see Writing R Extensions).
> Rcpp probably has support for these if you're working in C++.
>
> Duncan Murdoch
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] format: different S4 behavior in a package

2010-06-23 Thread Martin Morgan

On 06/23/2010 12:16 AM, Daniel Murphy wrote:
> R-Devel-ers:
> 
> I have an S4 method that simply formats an object:
> 
> setGeneric("formatMe", function(x) standardGeneric("formatMe"))
> setMethod("formatMe", "ANY", function(x) format(x))
> 
> If I issue the above in an R session, then define an S4 class with its own
> format method, I get the desired result:
> 
>> setClass("A",contains="numeric")
> [1] "A"
>> setMethod("format","A", function(x, ...) "Hey Jude")
> Creating a new generic function for "format" in ".GlobalEnv"
> [1] "format"
>> a<-new("A",1968)
>> formatMe(a)
> [1] "Hey Jude"
> 
> 
> However, if I put the two "formatMe" definitions into a package ("Test"), I
> do not get the desired result.
> 
> 
>> library(Test)
>> setClass("A",contains="numeric")
> [1] "A"
>> setMethod("format","A", function(x, ...) "Hey Jude")
> Creating a new generic function for "format" in ".GlobalEnv"

This is the clue -- you're creating a new S4 generic, so there's a
base::format, and a .GlobalEnv::format. Test::formatMe respects its name
space, and sees base::format.

In the S3 case, base::format is already an S3 generic, and you're just
adding a method, so there's only base::format for everyone to find.

In Test, you could setGeneric(format) and then export(format). It might
also be enough to just export(format); I'm not sure.
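
A minimal sketch of the point above, run in a plain session rather than inside a package. The class "A" and the method body are taken from Dan's example. The call through base::format() stands in for what code inside a name space would see, which is an assumption made for illustration.

```r
library(methods)

setClass("A", contains = "numeric")

# setMethod() on a non-S4 function creates a new S4 generic "format"
# in the global environment; base::format itself is left untouched.
setMethod("format", "A", function(x, ...) "Hey Jude")

a <- new("A", 1968)

# Found via the search path: the new S4 generic in .GlobalEnv.
via_global <- format(a)

# What code that only sees the base function gets: S3 dispatch,
# which knows nothing about the S4 method defined above.
via_base <- base::format(a)

print(via_global)
print(via_base)
```

The two results differ because there are now two functions called format, and which one is found depends on where the caller looks.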

Martin

> [1] "format"
>> a<-new("A",1968)
>> formatMe(a)
> [1] "1968"
> 
> 
> The "disconnect" does not occur, however, if the S4 format method is an S3
> incarnation:
> 
>> setClass("B",contains="numeric",S3methods=TRUE)
> [1] "B"
>> format.B <- function(x, ...) "Don't make it bad"
>> b<-new("B",1968)
>> formatMe(b)
> [1] "Don't make it bad"
> 
> Could the problem be in Test's NAMESPACE file? There is only one line:
> exportMethods(formatMe)
> 
> Here is Test's DESCRIPTION file:
> Package: Test
> Type: Package
> Title: Testing format
> Version: 1.0
> Date: 2010-06-22
> Author: Dan Murphy
> Maintainer: Dan Murphy 
> Depends: methods
> Description: Does format in a package work with S4 format method?
> License: GPL (>= 2)
> LazyLoad: yes
> 
> (I would send the Help file, but I don't think that is the problem.)
> 
> I am using version 2.11.1 on a Windows Vista machine.
> 
> Any guidance would be appreciated. Thank you
> 
> Dan Murphy
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] how to define method for "+" function in a new class

2010-07-07 Thread Martin Morgan

On 07/07/2010 08:09 AM, james.fo...@diamond.ac.uk wrote:
> 
> 
> Dear R developers,
> I have a new class, which I called "Molecule", and have tried to define
> a "+" operation for 2 objects of this class.
> This is what I have written so far, although the method is not complete
> (I'm trying to look at it at intermediate stages):
> 
> setMethod(
>   f="+",
>   signature(x="Molecule",y="Molecule"),
>   definition=function(x,y,...)
>  {
>   # Check both objects are correct
>   checkMolecule(x)
>   checkMolecule(y)
> 
>   # Extract chains information
>   ch1 <- getMoleculeChains(x)
>   ch2 <- getMoleculeChains(y)
>   union_ch <- unique(c(ch1,ch2))
>   not_used <- .ALPHABET
>   for (i in seq(along = union_ch))
>   {
>    idx <- which(not_used != union_ch[i])
>    if (length(idx) != 0) not_used <- not_used[idx]
>   }
> 
>   return(not_used)
>  }
>  )
> 
> 
> The definition of class Molecule is included earlier in the same file.
> When I source it, I get the following error message:
> 
> Error in match.call(fun, fcall) :
>   unused argument(s) (x = "Molecule", y = "Molecule")

If I

> getGeneric("+")
standardGeneric for "+" defined from package "base"
  belonging to group(s): Arith

function (e1, e2)
standardGeneric("+", .Primitive("+"))

Methods may be defined for arguments: e1, e2
Use  showMethods("+")  for currently available ones.

I see that the generic is defined to take two arguments e1 and e2. So

setMethod("+", c("Molecule", "Molecule"), function(e1, e2) {
## ...
})
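
As a self-contained sketch of that signature, assuming a hypothetical Molecule class with a single character slot "chains" (the real class and its checkMolecule / getMoleculeChains helpers are not shown in the thread):

```r
library(methods)

# Hypothetical stand-in for the poster's class: just a vector of chain IDs.
setClass("Molecule", representation(chains = "character"))

# The formal arguments must be named e1 and e2, matching the generic for "+".
setMethod("+", signature(e1 = "Molecule", e2 = "Molecule"),
          function(e1, e2) {
              new("Molecule", chains = union(e1@chains, e2@chains))
          })

m <- new("Molecule", chains = c("A", "B")) +
     new("Molecule", chains = c("B", "C"))
print(m@chains)
```

Using x and y as argument names, as in the original post, is what triggers the "unused argument(s)" error.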

but actually here it might often pay to discover ?GroupGenericFunctions
and end up with something like


setClass("A", representation=representation(x="numeric"))

setMethod("Arith", c("A", "A"), function(e1, e2) {
   new(class(e1), x=callGeneric(e1@x, e2@x))
})

and then

> new("A", x=1:5) + new("A", x=5:1)
An object of class "A"
Slot "x":
[1] 6 6 6 6 6

but also

> new("A", x=1:5) * new("A", x=5:1)
An object of class "A"
Slot "x":
[1] 5 8 9 8 5

Martin

> 
> 
> I can't see what's wrong in my method definition. Can anyone help me
> with this?
> 
> J
> 
> Dr James Foadi PhD
> Membrane Protein Laboratory (MPL)
> Diamond Light Source Ltd
> Diamond House
> Harewell Science and Innovation Campus
> Chilton, Didcot
> Oxfordshire OX11 0DE
> 
> Email:  james.fo...@diamond.ac.uk
> Alt Email:  j.fo...@imperial.ac.uk
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Telling Windows how to find DLL's from R?

2010-07-09 Thread Martin Morgan

On 07/09/2010 11:38 AM, Dominick Samperi wrote:
> Is it possible to set Windows' search path from within R, or
> to tell Windows how to find a DLL in some other way from
> R? Specifically, if a package DLL depends on another DLL
> the normal requirement is that the second DLL be in the
> search path so Windows can find it (there are other tricks,
> but they apply at the Windows level, not at the R level).

This thread

  https://stat.ethz.ch/pipermail/r-devel/2008-January/047961.html

might be relevant, especially the DLLpath argument to dyn.load.
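
A rough sketch of how that might look. The function name load_with_deps and both of its arguments are made up for illustration. DLLpath is only meaningful on Windows, so the sketch falls back to a plain dyn.load() elsewhere.

```r
# Hypothetical helper: load a package DLL, telling Windows where to look
# for the DLLs it depends on. 'dll' and 'dep_dir' are placeholder paths.
load_with_deps <- function(dll, dep_dir) {
    if (.Platform$OS.type == "windows") {
        # DLLpath is the Windows-only argument documented in ?dyn.load
        dyn.load(dll, DLLpath = dep_dir)
    } else {
        dyn.load(dll)
    }
}
```

Nothing is loaded until the helper is called with real paths.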

Martin

> 
> Thanks,
> Dominick
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Severe memory problem using split()

2010-07-12 Thread Martin Morgan

On 07/12/2010 01:45 PM, cstrato wrote:
> Dear all,
> 
> With great interest I followed the discussion:
> https://stat.ethz.ch/pipermail/r-devel/2010-July/057901.html
> since I have currently a similar problem:
> 
> In a new R session (using xterm) I am importing a simple table
> "Hu6800_ann.txt" which has a size of 754KB only:
> 
>> ann <- read.delim("Hu6800_ann.txt")
>> dim(ann)
> [1] 7129   11
> 
> 
> When I call "object.size(ann)" the estimated memory used to store "ann"
> is already 2MB:
> 
>> object.size(ann)
> 2034784 bytes
> 
> 
> Now I call "split()" and check the estimated memory used which turns out
> to be 3.3GB:
> 
>> u2p  <- split(ann[,"ProbesetID"],ann[,"UNIT_ID"])
>> object.size(u2p)
> 3323768120 bytes

I guess things improve with stringsAsFactors=FALSE in read.delim?
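
A small sketch of why that would help, using made-up data in place of Hu6800_ann.txt. When the value being split is a factor, every piece keeps the full set of levels, and object.size() counts those levels once per piece.

```r
n <- 500

# Made-up stand-in for the annotation table: one probeset per unit.
ann <- data.frame(ProbesetID = paste0("probe", seq_len(n)),
                  UNIT_ID    = seq_len(n),
                  stringsAsFactors = TRUE)

# Each list element is a length-1 factor carrying all n levels.
u2p_factor <- split(ann[, "ProbesetID"], ann[, "UNIT_ID"])

# Each list element is a plain length-1 character vector.
u2p_char <- split(as.character(ann[, "ProbesetID"]), ann[, "UNIT_ID"])

print(nlevels(u2p_factor[[1]]))
print(object.size(u2p_factor))
print(object.size(u2p_char))
```

The reported size of the factor version grows roughly with the square of the number of distinct values, which is consistent with the 3.3GB estimate in the original report.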

Martin

> 
> During the R session I am running "top" in another xterm and can see
> that the memory usage of R increases to about 550MB RSIZE.
> 
> 
> Now I do:
> 
>> object.size(unlist(u2p))
> 894056 bytes
> 
> It takes about 3 minutes to complete this call and the memory usage of R
> increases to about 1.3GB RSIZE. Furthermore, during evaluation of this
> function the free RAM of my Mac decreases to less than 8MB free PhysMem,
> until it needs to swap memory. When finished, free PhysMem is 734MB but
> the size of R increased to 577MB RSIZE.
> 
> Doing "split(ann[,"ProbesetID"],ann[,"UNIT_ID"],drop=TRUE)" did not
> change the object.size, only processing was faster and it did use less
> memory on my Mac.
> 
> Do you have any idea what the reason for this behavior is?
> Why is the size of list "u2p" so large?
> Do I make any mistake?
> 
> 
> Here is my sessionInfo on a MacBook Pro with 2GB RAM:
> 
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
> 
> locale:
> [1] C
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> Best regards
> Christian
> _._._._._._._._._._._._._._._._._._
> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
> V.i.e.n.n.a   A.u.s.t.r.i.a
> e.m.a.i.l:cstrato at aon.at
> _._._._._._._._._._._._._._._._._._
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Severe memory problem using split()

2010-07-12 Thread Martin Morgan

On 07/12/2010 03:00 PM, cstrato wrote:
> Dear Martin,
> 
> Thank you, you are right, now I get:
> 
>> ann <- read.delim("Hu6800_ann.txt", stringsAsFactors=FALSE)
>> object.size(ann)
> 2035952 bytes
>> u2p  <- split(ann[,"ProbesetID"],ann[,"UNIT_ID"])
>> object.size(u2p)
> 1207368 bytes
>> object.size(unlist(u2p))
> 865176 bytes
> 
> Nevertheless, a size of 1.2MB for a list representing 2 of 11 columns of

but it's a list of length(unique(ann[["UNIT_ID"]])) elements, each of
which has a pointer to the element, a pointer to the name of the
element, and the element data itself. I'd guess it adds up in a
non-mysterious way. For a sense of it (and maybe only understandable if
you have a working understanding of how R represents data) see, e.g.,

> .Internal(inspect(list(x=1,y=2)))
@1a4c538 19 VECSXP g0c2 [ATT] (len=2, tl=0)
  @191cad8 14 REALSXP g0c1 [] (len=1, tl=0) 1
  @191caa8 14 REALSXP g0c1 [] (len=1, tl=0) 2
ATTRIB:
  @16fc8d8 02 LISTSXP g0c0 []
TAG: @60cf18 01 SYMSXP g0c0 [MARK,NAM(2),gp=0x4000] "names"
@1a4c500 16 STRSXP g0c2 [] (len=2, tl=0)
  @674e88 09 CHARSXP g0c1 [MARK,gp=0x21] "x"
  @728c38 09 CHARSXP g0c1 [MARK,gp=0x21] "y"
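
To get a feel for the per-element overhead without reading inspect() output, here is a small sketch with made-up data: the same strings held once as a single character vector and once as a named list of length-1 vectors.

```r
x <- as.character(seq_len(1000))

# One element per string, each with its own vector header and a name.
as_list <- split(x, seq_along(x))

print(object.size(x))
print(object.size(as_list))
```

The list version is larger because each element is a separate R object, and the names vector is stored on top of that.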

Martin

> a table of size 754KB seems still to be pretty large?
> 
> Best regards
> Christian
> 
> 
> On 7/12/10 11:44 PM, Martin Morgan wrote:
>> On 07/12/2010 01:45 PM, cstrato wrote:
>>> Dear all,
>>>
>>> With great interest I followed the discussion:
>>> https://stat.ethz.ch/pipermail/r-devel/2010-July/057901.html
>>> since I have currently a similar problem:
>>>
>>> In a new R session (using xterm) I am importing a simple table
>>> "Hu6800_ann.txt" which has a size of 754KB only:
>>>
>>>> ann<- read.delim("Hu6800_ann.txt")
>>>> dim(ann)
>>> [1] 7129   11
>>>
>>>
>>> When I call "object.size(ann)" the estimated memory used to store "ann"
>>> is already 2MB:
>>>
>>>> object.size(ann)
>>> 2034784 bytes
>>>
>>>
>>> Now I call "split()" and check the estimated memory used which turns out
>>> to be 3.3GB:
>>>
>>>> u2p<- split(ann[,"ProbesetID"],ann[,"UNIT_ID"])
>>>> object.size(u2p)
>>> 3323768120 bytes
>>
>> I guess things improve with stringsAsFactors=FALSE in read.delim?
>>
>> Martin
>>
>>>
>>> During the R session I am running "top" in another xterm and can see
>>> that the memory usage of R increases to about 550MB RSIZE.
>>>
>>>
>>> Now I do:
>>>
>>>> object.size(unlist(u2p))
>>> 894056 bytes
>>>
>>> It takes about 3 minutes to complete this call and the memory usage of R
>>> increases to about 1.3GB RSIZE. Furthermore, during evaluation of this
>>> function the free RAM of my Mac decreases to less than 8MB free PhysMem,
>>> until it needs to swap memory. When finished, free PhysMem is 734MB but
>>> the size of R increased to 577MB RSIZE.
>>>
>>> Doing "split(ann[,"ProbesetID"],ann[,"UNIT_ID"],drop=TRUE)" did not
>>> change the object.size, only processing was faster and it did use less
>>> memory on my Mac.
>>>
>>> Do you have any idea what the reason for this behavior is?
>>> Why is the size of list "u2p" so large?
>>> Do I make any mistake?
>>>
>>>
>>> Here is my sessionInfo on a MacBook Pro with 2GB RAM:
>>>
>>>> sessionInfo()
>>> R version 2.11.1 (2010-05-31)
>>> x86_64-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] C
>>>
>>> attached base packages:
>>> [1] stats graphics  grDevices utils datasets  methods   base
>>>
>>> Best regards
>>> Christian
>>> _._._._._._._._._._._._._._._._._._
>>> C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
>>> V.i.e.n.n.a   A.u.s.t.r.i.a
>>> e.m.a.i.l:cstrato at aon.at
>>> _._._._._._._._._._._._._._._._._._
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Precompiled vignette on CRAN

2010-07-14 Thread Martin Morgan

On 07/14/2010 01:04 PM, Prof Brian Ripley wrote:
> On Wed, 14 Jul 2010, Felix Schönbrodt wrote:
> 
>> Hello,
>>
>> my package passes R CMD check without any warnings on my local machine
>> (Mac OS), as well as on Uwe Ligges' Winbuilder. On RForge, however, we
>> sometimes run into problems building the Sweave vignettes.
> 
> Just 'problems' is not helpful.
> 
>> Now here's my question: is it necessary for a CRAN submission that the
>> Sweave vignettes can be compiled on CRAN, or is it possible to provide
>> the (locally compiled) pdf vignette to be included in the package?
> 
> This really is a question to ask the CRAN gatekeepers, but people are on
> vacation right now, so I'll give some indication of my understanding.
> 
> What does 'compiled' mean here?  (Run through LaTeX?  Run the R code?)
> There are examples on CRAN of packages which cannot re-make their
> vignettes without external files (e.g. LaTeX style files), or take hours
> (literally) to run the code.  The source package should contain the PDF
> versions of the vignettes as made by the author.
> 
> There is relevant advice in 'Writing R Extensions'.

While on this topic, how are non-Sweave files to be made accessible to
browseVignettes() and help(package=...)?

At one point 00Index.dcf files could be used to influence index
creation, and Writing R Extensions indicates that an inst/doc/index.html
file can also be used. The 00Index.dcf approach seems better (no need
for the user to track the appropriate html structure of R's help pages),
but regardless it seems 00Index.dcf is ignored (stronger than 'no longer
necessary' in Writing R Extensions) and index.html does not influence
what is displayed by help(package='...') or browseVignettes().

I'm basing this on the 'limma' package (which seems to have tried to be
a good citizen, with 00Index.dcf and index.html files) and running R in
a console with

> getOption('help_type')
NULL
> sessionInfo()
R version 2.12.0 Under development (unstable) (2010-07-14 r52526)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] limma_3.5.12

Martin


> 
> What the people who do the CRAN package checks do get unhappy about are
> packages which fail running the R code in their vignettes, since this
> often indicates a problem in the package which is not exercised by the
> examples nor tests.  This gives a warning, as you will see in quite a
> few CRAN package checks.
> 
> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Very slow subsetting by name

2010-07-15 Thread Martin Morgan

On 07/15/2010 01:12 AM, Hervé Pagès wrote:
> Hi,
> 
> I'm subsetting a named vector using character indices.
> My vector of indices (or keys) is 10x longer than the vector
> I'm subsetting. All my keys are distinct and only 10% of them
> are valid (i.e. match a name of the vector being subsetted).
> It is surprisingly slow:
> 
> x1 <- 1:1000
> names(x1) <- paste("a", x1, sep="")
> keys <- sample(c(names(x1), paste("b", 1:9000, sep="")))
>> system.time(y1 <- x1[keys])
>user  system elapsed
>   0.410   0.000   0.416
> 
> x2 <- 1:2000
> names(x2) <- paste("a", x2, sep="")
> keys <- sample(c(names(x2), paste("b", 1:18000, sep="")))
>> system.time(y2 <- x2[keys])
>user  system elapsed
>   1.730   0.000   1.736

For what it's worth, I think this comes about in the loop starting at
subscript.c:538, which seems to be there to allow [<-,*,character to
extend a vector with new named elements

> x=c(a=1)
> x["b"] = 2
> x
a b
1 2

It seems to be irrelevant (?) for sub-setting per se (though by analogy
one might expect x["c"] to return a length-1 vector NA with name "c",
whereas it returns a vector with names NA).

Seems like the O(n^2) loop through NonNullStringMatch could be replaced
by look-ups into a hash, or an additional argument could be propagated
to stringSubscript to exit early when names aren't required. Or the call
to makeSubscript at subset.c:164 could instead be made to matchE in
unique.c.
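
In the meantime, the match()-based form Hervé mentions can be used at the R level. A sketch with the same shape of data as his first example (the variable names by_name and by_match are only for illustration):

```r
x <- 1:1000
names(x) <- paste("a", x, sep = "")

# 10x more keys than names; only the "a..." keys are present in x.
keys <- sample(c(names(x), paste("b", 1:9000, sep = "")))

# Direct subsetting by name.
by_name <- x[keys]

# Same lookup through match(), which uses hashing internally.
by_match <- x[match(keys, names(x))]

print(sum(is.na(by_match)))
```

Keys with no match give NA in both forms, so the match() version can stand in where only the values are needed.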

Martin

> 
> x3 <- 1:4000
> names(x3) <- paste("a", x3, sep="")
> keys <- sample(c(names(x3), paste("b", 1:36000, sep="")))
>> system.time(y3 <- x3[keys])
>user  system elapsed
>   8.900   0.010   9.227
> 
> x4 <- 1:8000
> names(x4) <- paste("a", x4, sep="")
> keys <- sample(c(names(x4), paste("b", 1:72000, sep="")))
>> system.time(y4 <- x4[keys])
>user  system elapsed
> 130.390   0.000 132.316
> 
> And it's apparently worse than quadratic in time!
> 
> I'm wondering why this subsetting by name is so slow since it
> seems it could be implemented with x4[match(keys, names(x4))],
> which is very fast: only 0.012s!
> 
> This is with R-2.11.0 and R-2.12.0.
> 
> Thanks,
> H.
> 
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Very slow subsetting by name

2010-07-15 Thread Martin Morgan

On 07/15/2010 08:38 AM, Martin Morgan wrote:
> On 07/15/2010 01:12 AM, Hervé Pagès wrote:
>> Hi,
>>
>> I'm subsetting a named vector using character indices.
>> My vector of indices (or keys) is 10x longer than the vector
>> I'm subsetting. All my keys are distinct and only 10% of them
>> are valid (i.e. match a name of the vector being subsetted).
>> It is surprisingly slow:
>>
>> x1 <- 1:1000
>> names(x1) <- paste("a", x1, sep="")
>> keys <- sample(c(names(x1), paste("b", 1:9000, sep="")))
>>> system.time(y1 <- x1[keys])
>>user  system elapsed
>>   0.410   0.000   0.416
>>
>> x2 <- 1:2000
>> names(x2) <- paste("a", x2, sep="")
>> keys <- sample(c(names(x2), paste("b", 1:18000, sep="")))
>>> system.time(y2 <- x2[keys])
>>user  system elapsed
>>   1.730   0.000   1.736
> 
> For what its worth, I think this comes about in the loop starting at
> subscript.c:538, which seems to be there to allow [<-,*,character to
> extend a vector with new named elements
> 
>> x=c(a=1)
>> x["b"] = 2
>> x
> a b
> 1 2
> 
> It seems to be irrelevant (?) for sub-setting per se (though by analogy
> one might expect x["c"] to return a length-1 vector NA with name "c",
> whereas it returns a vector with names NA).
> 
> Seems like the O(n^2) loop through NonNullStringMatch could be replaced
> by look-ups into a hash, or an additional argument could be propagated

this passes make check and does

> x4 <- 1:8000
> names(x4) <- paste("a", x4, sep="")
> keys <- sample(c(names(x4), paste("b", 1:72000, sep="")))
> system.time(y4 <- x4[keys])
   user  system elapsed
  0.092   0.000   0.093
> identical(y4, x4[match(keys, names(x4))])
[1] TRUE

but uses some additional memory.

Martin

Index: src/main/subscript.c
===================================================================
--- src/main/subscript.c (revision 52526)
+++ src/main/subscript.c (working copy)
@@ -535,15 +535,17 @@
 }


+SEXP sindx = PROTECT(match(s, s, 0)); /* first match */
 for (i = 0; i < ns; i++) {
sub = INTEGER(indx)[i];
if (sub == 0) {
-   for (j = 0 ; j < i ; j++)
-   if (NonNullStringMatch(STRING_ELT(s, i), STRING_ELT(s, j))) {
-   sub = INTEGER(indx)[j];
-   SET_VECTOR_ELT(indexnames, i, STRING_ELT(s, j));
-   break;
-   }
+j = INTEGER(sindx)[i] - 1;
+if (NA_STRING != STRING_ELT(s, j) &&
+R_NilValue != STRING_ELT(s, j))
+{
+sub = INTEGER(indx)[j];
+SET_VECTOR_ELT(indexnames, i, STRING_ELT(s, j));
+}
}
if (sub == 0) {
if (!canstretch) {
@@ -561,7 +563,7 @@
setAttrib(indx, R_UseNamesSymbol, indexnames);
 if (canstretch)
*stretch = extra;
-UNPROTECT(4);
+UNPROTECT(5);
 return indx;
 }

> to stringSubscript to exit early when names aren't required. Or the call
> to makeSubscript at subset.c:164 could instead be made to matchE in
> unique.c.
> 
> Martin
> 
>>
>> x3 <- 1:4000
>> names(x3) <- paste("a", x3, sep="")
>> keys <- sample(c(names(x3), paste("b", 1:36000, sep="")))
>>> system.time(y3 <- x3[keys])
>>user  system elapsed
>>   8.900   0.010   9.227
>>
>> x4 <- 1:8000
>> names(x4) <- paste("a", x4, sep="")
>> keys <- sample(c(names(x4), paste("b", 1:72000, sep="")))
>>> system.time(y4 <- x4[keys])
>>user  system elapsed
>> 130.390   0.000 132.316
>>
>> And it's apparently worse than quadratic in time!
>>
>> I'm wondering why this subsetting by name is so slow since it
>> seems it could be implemented with x4[match(keys, names(x4))],
>> which is very fast: only 0.012s!
>>
>> This is with R-2.11.0 and R-2.12.0.
>>
>> Thanks,
>> H.
>>
>>
> 
> 


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] parent.frame(1) of a S4 method is not a calling environment.

2010-08-16 Thread Martin Morgan

On 08/15/2010 02:39 PM, Vitaly S. wrote:
> 
> Dear Developers,
> 
> I wonder what are the parent.frame rules for methods. For ordinary functions
> one can call parent.frame() and be sure that it is the environment of a calling
> function. With S4 apparently it is not the case.
> 
> Here is what I have discovered by trial and error so far:
> 
>> setClass("A", contains="vector")
> [1] "A"
>> setGeneric("foo", def=function(a, ...){standardGeneric("foo")})
> [1] "foo"
>>
>> setMethod("foo", signature("A"), 
> +   def=function(a, ...){
> + cat("--pf1--\n")
> + ls(parent.frame(1))
> + ## cat("--pf2--")
> + ## ls(parent.frame(2))
> +   })
> [1] "foo"
>>
>> tf <- function(){
> +   b <- 4
> +   foo(new("A")) #ok
> + }
>>
>> tf() #ok
> --pf1--
> [1] "b"
> 
> The above works like predicted.
> Now, a small change. The "b" argument which is not in the signature, but has a
> role of an additional parameter to the function:
> 
>> setMethod("foo", signature("A"), 
> +   def=function(a, b, ...){
> + cat("--pf1--\n")
> + print(ls(parent.frame(1)))
> + cat("--pf2--\n")
> + print(ls(parent.frame(2)))
> +   })
> [1] "foo"
>>
>> tf()  #oups
> --pf1--
> [1] "a"
> --pf2--
> [1] "b"
>>
> 
> So,  can I be sure that for such functions parent.frame(2) will always work?
> What are the additional rules?

callNextMethod() will cause additional problems; the idea that you'll
grab things from somewhere other than function arguments doesn't seem
like a robust design, even if it's used in some important parts of R.
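
One way to avoid counting frames, sketched under the assumption that the caller can be changed: pass the environment in as an ordinary argument. The argument name envir is an illustrative choice, and class "A" here extends numeric rather than vector to keep the sketch short.

```r
library(methods)

setClass("A", contains = "numeric")
setGeneric("foo", function(a, ...) standardGeneric("foo"))

# The method receives the environment explicitly, so it does not matter
# how many frames the S4 dispatch machinery puts in between.
setMethod("foo", "A", function(a, envir, ...) {
    ls(envir)
})

tf <- function() {
    b <- 4
    foo(new("A", 1), envir = environment())
}

print(tf())
```

The result no longer depends on whether the method has extra formals or is reached through callNextMethod().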

Martin

> 
> Many thanks,
> Vitaly.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Weird erratic error and illogical error message, could someone explain this?

2010-09-03 Thread Martin Morgan

On 09/03/2010 04:42 AM, Duncan Murdoch wrote:
> Philippe Grosjean wrote:
>> Hello,
>>
>> It's several days I try to track this bug, and even cannot cook a
>> reproducible example. Yet, it occurs consistently in a long-running
>> task after a variable period of time. Here is an example:
>>   
>
> I would look closely at the other software that is running in your
> long example.  Does it include C (or other external) code?  Look
> closely at that, it might be writing outside its own allocated
> memory.  Also check for correct protection of intermediate results, if
> you're producing SEXPs in the external code.  (Running under gctorture
> might flush out the bug more quickly if the latter is the problem.)
>
> If you're only running R code, then this looks like a bug in R, but it
> might still be worth trying gctorture to make it reproducible.  We
> won't be able to fix it if we can't reproduce it.
>
> Duncan Murdoch
>> ... my long-running code [as I said, cannot give something simple
>> that produces this bug in a reproducible manner]
>>
>> Error in match(x, table, nomatch = 0L) :
>>  formal argument "nomatch" matched by multiple actual arguments
>>  > traceback()
>> 6: match(x, table, nomatch = 0L)
>> 5: "factor" %in% attrib[["class", exact = TRUE]]
>> 4: structure(.Internal(Sys.time()), class = c("POSIXt", "POSIXct"))
>> 3: Sys.time()
>> 2: chemTrigger() at chemostat_1.0-1.R#1132
>> 1: chemRun()

I think this is a bug in R that has been fixed in the subversion commit
below, and so should be fixed in R-2.11.1.
What is your sessionInfo(), and does your error occur in the devel
version of R?
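
On Duncan's gctorture suggestion, a minimal sketch of how a suspect step could be run with garbage collection forced at every allocation. The wrapper name with_gctorture is made up; the expression passed to it would be the long-running code.

```r
# Hypothetical wrapper: evaluate 'expr' with gctorture switched on,
# restoring the previous setting afterwards even if 'expr' fails.
with_gctorture <- function(expr) {
    old <- gctorture(TRUE)
    on.exit(gctorture(old))
    expr   # the promise is forced here, while torture mode is active
}

res <- with_gctorture(sum(1:10))
print(res)
```

Code runs very slowly in this mode, so it is usually applied only to the smallest piece that still shows the problem.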

Martin

r51232 | falcon | 2010-03-09 13:59:48 -0800 (Tue, 09 Mar 2010) | 14 lines
Changed paths:
   M /trunk/src/main/match.c

Fix bug in matchArgs triggered by gc and finalizers

matchArgs was modifying the general purpose bits of SEXPs making up the
'formals' argument to keep track of ARGUSED.  When gc is triggered
inside matchArgs, finalizer code can end up calling matchArgs on the
same function (hence same formals) resulting in corruption of gp bits.

This patch uses an int array allocated on the stack to keep track of
ARGUSED and avoids modifying the SEXPs in formals.  In place modification
of SEXPs in supplied still occurs via ARGUSED/SET_ARGUSED, but should be
safe as long as no new allocating function calls are added to matchArgs.

A reproducible report of this bug was reported here:
https://stat.ethz.ch/pipermail/bioc-sig-sequencing/2010-March/000997.html


>>
>> So, the culprit is a call inside `%in%` (from within structure() in
>> Sys.time()). But I can run millions times `%in%`, or structure(), or
>> Sys.time() on my machine without producing this bug. Arguments at 5:
>> are simple character strings. They don't hurt!
>>
>> Also, I am lost because the message is totally illogical in the
>> context where it appears: I can understand this message here:
>>
>>  > match(1, 2, nomatch = 0L, nomatch = NA)
>> Error in match(1, 2, nomatch = 0L, nomatch = NA) :
>>formal argument "nomatch" matched by multiple actual arguments
>>
>> or here:
>>
>>  > test <- function (...) match(1, ..., nomatch = 0L)
>>  > test(2, nomatch = NA)
>> Error in match(1, ..., nomatch = 0L) :
>>formal argument "nomatch" matched by multiple actual arguments
>>
>> but in the call "match(x, table, nomatch = 0L)" where x is the
>> character string "factor" and table is another character string
>> ("numeric") extracted from a list, I don't understand why it produces
>> this error message. '.Internal(Sys.time())' uses do_systime c code
>> that returns a one-element double... not something that can hurt here?!
>>
>> Can someone explain me, or give me an example where an argument is
>> NOT duplicated in the call (well, as I understand it here) and where
>> one gets such an error message? And why?
>>
>> Many thanks, I am desperate :-(
>>
>> I got this error on R 2.11.1 on Mac OS X 10.6.4, and on R 2.10.1 on
>> Windows XP SP3 (but it does not matter, since I cannot cook a
>> reproducible example).
>>
>> Philippe
>>
>> P.S.: seems related to this:
>> http://finzi.psych.upenn.edu/Rhelp10/2008-June/165101.html
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Sexpr in package.Rd fails

2010-09-06 Thread Martin Morgan

If I

f = function() {}
package.skeleton("PkgA", "f", path="/tmp")

and then edit man/PkgA-package.Rd to read in part

\description{
More about what it does (maybe more than one line)
\Sexpr{1}
}

and then

R CMD build PkgA

I end up with

Saving output to ‘/tmp/PkgA/build/PkgA.pdf’ ...
Warning in file.create(to[okay]) :
cannot create file '/tmp/PkgA/build/PkgA.pdf', reason 'No such file or
directory'
Done

This is because the 'build' directory does not exist when the pdf is
being written.

Further, if I

\details{
\tabular{ll}{
Package: \tab PkgA \cr
Type: \tab Package\cr
Version: \tab \Sexpr{packageDescription("PkgA")[["Version"]]} \cr
Date: \tab 2010-09-06\cr
License: \tab What license is it under?\cr
LazyLoad: \tab yes\cr
}
}

I see

* installing the package to process help pages
* building the package manual
Hmm ... looks like a package
Converting Rd files to LaTeX Warning in packageDescription("PkgA") : no
package 'PkgA' was found
Error : /tmp/PkgA/man/PkgA-package.Rd:16: subscript out of bounds

The package appears to be installed for processing help pages without
Sexprs, but not with.

This is from a post on the Bioconductor 'devel' list.

https://stat.ethz.ch/pipermail/bioc-devel/2010-September/002296.html

with

> sessionInfo()
R version 2.12.0 Under development (unstable) (2010-09-05 r52874)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Martin


Re: [Rd] Dispatch method on S3 or S4 class

2010-09-07 Thread Martin Morgan

On 09/06/2010 10:00 PM, Dario Strbenac wrote:
> Hello,
>
> I've been attempting to make a generic method that dispatches on the first 
> argument, which can be either an S3 or an S4 class. This is as far as I've 
> gotten. Any suggestions about what to try next ?
>
> library(aroma.affymetrix)
> library(GenomicRanges)
>
> setGeneric("analyse", function(x, y, ...) standardGeneric("analyse"))
>
> setMethodS3("analyse", "AffymetrixCelSet", function(x, y, z, ...)
> {
>   x;
>   UseMethod("analyse")
> }
> )
>
> setGeneric("analyse")
>
> setMethod("analyse", "GRangesList", function(x, y, a, b, c)
> {
>   x;
> }
> )
I think (no testing on my end) you want

setOldClass("AffymetrixCelSet")

setGeneric("analyse", function(x, y, ...) standardGeneric("analyse"))

setMethod(analyse, "AffymetrixCelSet", function(x, y, z, ...)
{
cat("AffymetrixCelSet\n")
x
})

setMethod(analyse, "GRangesList", function(x, y, a, b, c)
{
cat("GRangesList\n")
x
})

and then by way of reproducible example

> x = analyse(structure(list(), class="AffymetrixCelSet"))
AffymetrixCelSet
> y = analyse(GRangesList())
GRangesList
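
A self-contained variant of the same idea, as a minimal sketch only: the class names "LegacySet" (standing in for an S3 class such as AffymetrixCelSet) and "ModernList" (standing in for an S4 class such as GRangesList) are made up here so that neither aroma.affymetrix nor GenomicRanges is needed.

```r
library(methods)

## Hypothetical stand-ins: "LegacySet" plays the S3 class,
## "ModernList" plays the S4 class.
setOldClass("LegacySet")                     # register the S3 class with S4
setClass("ModernList", representation(items = "list"))

setGeneric("analyse", function(x, y, ...) standardGeneric("analyse"))

## One S4 method per class; dispatch is on the first argument.
setMethod("analyse", "LegacySet", function(x, y, ...) "LegacySet method")
setMethod("analyse", "ModernList", function(x, y, ...) "ModernList method")

s3obj <- structure(list(), class = "LegacySet")
s4obj <- new("ModernList")

s3_result <- analyse(s3obj)
s4_result <- analyse(s4obj)
```

The setOldClass() call is what lets an S3 class name appear in an S4 method signature; no separate S3 method should be needed.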


Martin
>
> Thanks,
>Dario.
>
> --
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia

Re: [Rd] Recursion error after upgrade to R_2.11.1 [Sec=Unclassified]

2010-10-06 Thread Martin Morgan

On 10/06/2010 06:12 PM, Troy Robertson wrote:
> Hi all,
> 
> After no replies to my previous message I thought I might show some code to 
> demonstrate the change and again seek any explanation for the error now 
> thrown by my code after upgrading from 2.10.1 to 2.11.1.
> 
> Thanks
> Troy
> 
> setClass("PortableObject",
> representation(test1= "character"),
> 
> prototype(  test1   = ""),
>   contains = ".environment"
> )
> 
> setMethod("initialize", "PortableObject",
> function(.Object, ..., .xData=new.env(parent=emptyenv())) {
> .Object <- callNextMethod(.Object, ..., .xData=.xData)
> 
> .Object@test1 <- "Foo"
> # Following line works under 2.10.1 but now throws
> # Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> #.Object[["test2"]] <- "Bar"
> # The following does what I want though
> .Object$test3 <- "Baa"
> 
> return(.Object)
> }
> )
> 
> e <- new("PortableObject")

The explicit example does help -- it's clear what bug you are
encountering. Here's the code in R-2.10

> selectMethod("[[<-", ".environment")
Method Definition:

function (x, i, j, ..., value)
{
call <- sys.call()
call[[2]] <- x@.Data
eval.parent(call)
x
}


and 2.11.1

> selectMethod("[[<-", ".environment")
Method Definition:

function (x, i, j, ..., value)
{
.local <- function (x, i, j, ..., exact = TRUE, value)
{
call <- sys.call()
call[[2]] <- x@.Data
eval.parent(call)
x
}
.local(x, i, j, ..., value = value)
}

Apparently the 'exact' argument has been added, and because the method
signature differs from the generic, a .local function is created. That
'sys.call()' originally returned the call as made through the generic,
but now it returns the call to .local. And so eval.parent() keeps
evaluating .local(). This I think is a bug.
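
The .local wrapping can be seen on a toy generic. This is only a sketch: the generic "store" and its method are invented for illustration and are not the methods-package code shown above.

```r
library(methods)

## Hypothetical generic with formals (x, ...).
setGeneric("store", function(x, ...) standardGeneric("store"))

## The method adds 'exact', which the generic does not have, so the
## methods package wraps the method body in a .local function.
setMethod("store", "numeric", function(x, ..., exact = TRUE) {
    sys.call()   # report the call as seen from inside the method body
})

method_body <- deparse(body(getMethod("store", "numeric")))
has_local <- any(grepl(".local", method_body, fixed = TRUE))

## The call seen inside the method is now the call to .local,
## not the call the user made to store().
seen_call <- store(1)
called_fun <- deparse(seen_call[[1]])
```

Any method code that edits and re-evaluates sys.call() therefore re-enters .local rather than the generic.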

For what it's worth, if I were interested in minimizing copying I would
set up initialize so that it ended with callNextMethod(<...>), on the
hopes that the method eventually called would take care not to make too
many copies on slot assignment.

Martin

> 
> alterEGo <- function(o = "EPOCObject") {
> o@test1 <- "Boo"
> 
> # Following line works under 2.10.1 but now throws
> # Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
> o[["test2"]] <- "Who"
> # The following does what I want though
> o$test3 <- "Hoo"
> 
> # NOTE: No return
> }
> 
> alterEGo(e)
> e@test1
> e$test2
> e[["test3"]]
> e@.xData[["test3"]]


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


Re: [Rd] R 2.12.0 alpha/beta/rc warning: spurious or not?

2010-10-08 Thread Martin Morgan

On 10/08/2010 01:25 PM, Dirk Eddelbuettel wrote:
> 
> With several versions of R 2.12.0 I have been seeing this when running
> 
>   R CMD build Rcpp
> 
> from the SVN sources:
> 
>   [...]
>   Transcript written on Rd2.log.
>   Saving output to '/home/edd/svn/rcpp/pkg/Rcpp/build/Rcpp.pdf' ...
>   Warning in file.create(to[okay]) :
> cannot create file '/home/edd/svn/rcpp/pkg/Rcpp/build/Rcpp.pdf', reson 
> 'No such file or directory'
>   Done
>   [...]
> 
> which looks like a simple case of a missing 'mkdir build'.  Or is there
> something else at work I am overlooking?   NEWS has no hint about this as far
> as I can tell.
> 
> Also, the error disappears when I create a directory build/ inside of my
> package sources.
> 
> It is still present with the RC tarball wrapped up last night (r53227).

Similar to this

  https://stat.ethz.ch/pipermail/r-devel/2010-September/058399.html

I think.

Martin
> 
> Dirk
> 



Re: [Rd] data set mube S4 class to be submitted to Bioconductor

2010-10-12 Thread Martin Morgan

On 10/12/2010 07:38 AM, lan gao wrote:
> Dear R Developers,
> 
> I am developing a package to submit to bioconductor. Right now, a gene-set
> sample data is saved as a data frame. Each row contains a probe set and its
> corresponding gene-set name (multiple probe sets may map to a gene-set name).
> 
> Is this type of data format OK to be submitted to Bioconductor? Or do I have to
> make it a GeneCollectionSet class? (If so, I have to change a lot of my R
> functions.)
> 
> Your response is highly appreciated!

See the answer to your original post here

  https://stat.ethz.ch/pipermail/bioconductor/2010-October/035781.html

The 'Bioc-devel' list mentioned there is

  https://stat.ethz.ch/mailman/listinfo/bioc-devel

referenced from

  http://bioconductor.org/help/mailing-list/

Asking on the bioc-devel mailing list will not change the original
answer, but is the appropriate place to ask future questions about
developing Bioconductor packages.

Martin

> 
> 
> Lani
> 

Re: [Rd] Loading Cached Weaver Objects

2010-10-19 Thread Martin Morgan

On 10/19/2010 08:33 PM, Dario Strbenac wrote:
> Hello,
> 
> I'm looking for a way to extract objects from what gets created when I Sweave 
> with driver = weaver(). I found where the .Rdata objects are, but when I load 
> one into R, I don't see anything that looks like the objects that were 
> created in that code chunk.
> 
>> load("/home/darstr/r_env_cache/2.11.0/AbsMeth_5/ac047940aaa9cf1a1ec09f1628b13381.RData")
>> ls()
> [1] "cacheEnv" "DEPS" "SESSION"
> 
> What can I try next ?

Hi Dario

Probably useful to ask the maintainer
packageDescription('weaver')$Maintainer. But from looking at the package
code I see

eval_and_cache <- function(sexpr, deps, cacheEnv, cachefile, quiet) {
if (!quiet)
  cat("  COMPUTING... ", file=stderr())
log_debug("computing...")
## We want to pick up inherited stuff during the eval.  So no
## parent=emptyenv().
eval(sexpr, envir=cacheEnv)
DEPS <- deps
SESSION <- sessionInfo()
save(cacheEnv, DEPS, SESSION, file=cachefile)
if (!quiet)
  cat("done.\n", file=stderr())
}

load_from_cache_env <- function(fromEnv, toEnv, hash, sym2hash, updated) {
## The 'updated' arg is a logical flag.  TRUE indicates that
## syms in fromEnv were retrieved from cache but had to be
## recomputed because of a dependency mismatch.  This allows
## us to detect second order dependency mismatch where the
## expression doesn't change, but we've recomputed.
syms <- ls(fromEnv)
for (sym in syms) {
assign(sym, fromEnv[[sym]], envir=toEnv)
assign(sym, list(hash=hash, updated=updated), envir=sym2hash)
}
}

so would guess that 'cacheEnv' is an environment that contains the result
of evaluating the code in the chunk.
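
If that guess is right, the chunk's objects can be listed and fetched from the loaded 'cacheEnv'. A minimal sketch, assuming only the save()/load() pattern visible above; the temporary file and the object 'fit' are made up here, and nothing below touches weaver itself.

```r
cachefile <- tempfile(fileext = ".RData")

## Mimic the save step: evaluate a "chunk" inside an environment,
## then save the environment itself.
cacheEnv <- new.env()
eval(quote(fit <- 1:10 * 2), envir = cacheEnv)
DEPS <- character(0)
save(cacheEnv, DEPS, file = cachefile)
rm(cacheEnv, DEPS)

## Load into a scratch environment, then look inside cacheEnv.
scratch <- new.env()
load(cachefile, envir = scratch)
chunk_objects <- ls(scratch$cacheEnv)
fit <- get("fit", envir = scratch$cacheEnv)
```

With a real cache file, ls(cacheEnv) after load() would be the first thing to try.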

Hope that helps,

Martin

> 
> --
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
> 

Re: [Rd] library verbose option doesn't stop "Loading required package XYZ"

2010-10-20 Thread Martin Morgan

On 10/20/2010 04:25 PM, Dominick Samperi wrote:
> Thanks David, but this doesn't work. The "Loading required package"
> messages still appear under R 2.12.0 when I use, for example:
> 
> Rscript --vanilla -e "{ invisible(library(cxxPack)); cat(sqrt(2)) }"

?suppressPackageStartupMessages

though you'll still get warnings / errors.
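
A small sketch of the difference from invisible(). The function noisy_load() is a made-up stand-in for a library() call that emits a startup message; in real use the call would be something like suppressPackageStartupMessages(library(cxxPack)).

```r
## Hypothetical stand-in for a package load that emits a startup message.
noisy_load <- function() {
    packageStartupMessage("Loading required package: XYZ")
    invisible(TRUE)
}

## invisible() only hides the return value; the message still escapes.
leaked <- character(0)
withCallingHandlers(
    invisible(noisy_load()),
    message = function(m) {
        leaked <<- c(leaked, conditionMessage(m))
        invokeRestart("muffleMessage")
    }
)

## suppressPackageStartupMessages() muffles it before any outer handler runs.
escaped <- character(0)
withCallingHandlers(
    suppressPackageStartupMessages(noisy_load()),
    message = function(m) escaped <<- c(escaped, conditionMessage(m))
)
```

Ordinary warnings and errors are not package startup messages, so they pass through unchanged.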

Martin

> 
> On Wed, Oct 20, 2010 at 7:00 PM, David Winsemius 
> wrote:
> 
>>
>> On Oct 20, 2010, at 6:49 PM, Dominick Samperi wrote:
>>
>>  Hello,
>>>
>>> The library verbose option does not stop numerous messages
>>> of the form "Loading required package: XYZ". This is inconvenient
>>> when a package is loaded repeatedly using Rscript.
>>>
>>> Is there a way to turn off the "Loading required package"
>>> messages?
>>>
>>
>> ?invisible
>>> invisible(require(Hmisc))
>>> invisible(library(rms)  )  # no console messages
>>
>> I do not think this suppresses the messages when you attempt to load a
>> non-existent package. For that you may need to look at the options() setting
>> for warn and warning.expression
>>
>>>
>>> One possible work-around is to make packages Suggested
>>> instead of Required, but this introduces other issues.
>>>
>>> Thanks,
>>> Dominick
>>>
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
> 

[Rd] R-devel CMD build fails when vignette present

2010-10-24 Thread Martin Morgan

If I try to build a package with a vignette

  R CMD build Biobase

I see

* checking for file 'Biobase/DESCRIPTION' ... OK
* preparing 'Biobase':
* checking DESCRIPTION meta-information ... OK
* cleaning src
* installing the package to re-build vignettes
* creating vignettes ... OK
* cleaning src
Error in setwd(pkgname) : cannot change working directory
Execution halted

This is because prepare_pkg invoked at tools/R/build.R:605 changes the
directory (e.g., from /tmp/RtmprGGHhU/Rbuild257d6d93 to
/tmp/RtmprGGHhU/Rbuild257d6d93/Biobase), in turn because the on.exit at
build.R:223 is usurped by on.exit at build.R:256/259.

Martin

Re: [Rd] R-devel CMD build fails when vignette present

2010-10-25 Thread Martin Morgan

On 10/25/2010 03:21 AM, Prof Brian Ripley wrote:
> Biobase is not self-contained so I cannot easily test it, but as far as
> I know this is now resolved.

Thank you it is. Martin


Re: [Rd] S4 methods for rbind()

2010-10-26 Thread Martin Morgan

On 10/26/2010 03:53 AM, Robin Hankin wrote:
> Hello.
> 
> I am trying to write an S4 method for rbind(). I have a class of objects
> called 'mdm', and I want to be able to rbind() them to one another.
> 
> I do not want the method for rbind() to coerce anything to an mdm object.
> I want rbind(x1,x2,x1,x2) to work as expected [ie rbind() should take any
> number of arguments].
> 
> This is what I have so far:
> 
> 
> setGeneric(".rbind_pair", function(x,y){standardGeneric(".rbind_pair")})
> setMethod(".rbind_pair", c("mdm", "mdm"), function(x,y){.mdm.cPair(x,y)})
> setMethod(".rbind_pair", c("mdm", "ANY"),
> function(x,y){.mdm_rbind_error(x,y)})
> setMethod(".rbind_pair", c("ANY", "mdm"),
> function(x,y){.mdm_rbind_error(x,y)})
> setMethod(".rbind_pair", c("ANY", "ANY"),
> function(x,y){.mdm_rbind_error(x,y)})
> 
> ".mdm_rbind_error" <- function(x,y){
> stop("an mdm object may only be rbinded to another mdm object")
> }
> 
> ".mdm.rbind_pair" <- function(x,y){
> stopifnot(compatible(x,y))
> mdm(rbind(xold(x),xold(y)),c(types(x),types(y))) # this is the "meat" of
> the rbind functionality
> }
> 
> setGeneric("rbind")
> setMethod("rbind", signature="mdm", function(x, ...) {
> if(nargs()<3)
> .mdm_rbind_pair(x,...)
> else
> .mdm_rbind_pair(x, Recall(...))
> })
> 
> 
> But
> 
> 
> LE223:~/packages% sudo R CMD INSTALL ./multivator
> [snip]
> Creating a new generic function for "tail" in "multivator"
> Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
> in method for ‘rbind’ with signature ‘deparse.level="mdm"’: formal
> arguments (... = "mdm", deparse.level = "mdm") omitted in the method
> definition cannot be in the signature

Hi Robin

try getGeneric("rbind") and showMethods("rbind") after your setGeneric.
The generic is dispatching on 'deparse.level'. 'deparse.level' is
missing from your method definition, and so can't be used as the
signature for your method. Try to set the ... explicitly as the
signature to be used for dispatch.

setGeneric("rbind",
function(..., deparse.level=1) standardGeneric("rbind"),
signature = "...")
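
A sketch of a method written against that generic, with a toy class "Toy" invented here in place of 'mdm'. It assumes dispatch on "..." behaves as documented, i.e. a method is selected when all arguments in ... are of the given class.

```r
library(methods)

## Hypothetical class standing in for 'mdm': it just wraps a matrix.
setClass("Toy", representation(x = "matrix"))

setGeneric("rbind",
    function(..., deparse.level = 1) standardGeneric("rbind"),
    signature = "...")

## The method has the same formals as the generic and handles any
## number of Toy objects at once, so no pairwise recursion is needed.
setMethod("rbind", "Toy", function(..., deparse.level = 1) {
    parts <- lapply(list(...), function(obj) obj@x)
    new("Toy", x = do.call(base::rbind, parts))
})

a <- new("Toy", x = matrix(1:4, nrow = 2))
res <- rbind(a, a, a)
res_dim <- dim(res@x)
```

Calls that mix Toy objects with other classes would not reach this method, which matches the wish not to coerce anything to the class.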

Martin

> Error : unable to load R code in package 'multivator'
> ERROR: lazy loading failed for package ‘multivator’
> * removing ‘/usr/local/lib64/R/library/multivator’
> * restoring previous ‘/usr/local/lib64/R/library/multivator’
> LE223:~/packages%
> 
> 
> I can't understand what the error message is trying to say.
> 
> Can anyone advise on a fix for this?
> 
> 



Re: [Rd] SEXPs and slots

2010-11-12 Thread Martin Morgan

On 11/12/2010 02:31 PM, Patrick Leyshock wrote:
> Hello,
> 
> I've created this class:
> 
> setClass("example",
>representation  (
>   size = "numeric",
>   id= "character"
>)
> )
> 
> Suppose I create a new instance of this class:
> 
>> x <- new("example", 4, "id_value")
> 
> This creates an S4 object with two slots.  Am I correct in thinking that
> slots are "filled" by SEXPs?

Hi Patrick --

If I

> eg = new("example", size=4, id="id_value")

(note the named arguments) and take a peek at the str'ucture of eg, I see

> str(eg)
Formal class 'example' [package ".GlobalEnv"] with 2 slots
  ..@ size: num 4
  ..@ id  : chr "id_value"

so the @size slot is a numeric vector of length 1 containing the value
4. One doesn't really have to know the detailed representation, but one
can find out from

> .Internal(inspect(eg))
@df70e48 25 S4SXP g0c0 [OBJ,NAM(2),gp=0x10,ATT]
ATTRIB:
  @df70ef0 02 LISTSXP g0c0 []
TAG: @769258 01 SYMSXP g1c0 [MARK] "size"
@c0f6db8 14 REALSXP g0c1 [NAM(2)] (len=1, tl=0) 4
TAG: @15b0228 01 SYMSXP g1c0 [MARK,NAM(2)] "id"
@c0f6178 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
  @12341c80 09 CHARSXP g0c2 [gp=0x20] "id_value"
TAG: @607ce8 01 SYMSXP g1c0 [MARK,NAM(2),gp=0x4000] "class"
@c0f6d58 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
  @96ed08 09 CHARSXP g1c1 [MARK,gp=0x21] "example"
ATTRIB:
  @df70fd0 02 LISTSXP g0c0 []
TAG: @624f70 01 SYMSXP g1c0 [MARK,NAM(2)] "package"
@c0f6d88 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
  @67f5e0 09 CHARSXP g1c2 [MARK,gp=0x21,ATT] ".GlobalEnv"

that the 'eg' object is an S4SXP with an attribute that is a LISTSXP.
The LISTSXP has elements that are tagged with SYMSXP representing the
slot name, and values that are REALSXP (for 'size') or STRSXP (for
'id'). The LISTSXP attribute itself has an attribute, which contains
information about the package where the class is defined. With these
hints one can see through the S4 interface to the underlying implementation

> attributes(eg)
$size
[1] 4

$id
[1] "id_value"

$class
[1] "example"
attr(,"package")
[1] ".GlobalEnv"
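
The same point at the R level, as a short sketch re-using the class definition from the question: slot access and attribute access see the same data.

```r
library(methods)

setClass("example",
    representation(size = "numeric", id = "character"))

eg <- new("example", size = 4, id = "id_value")

## Slots are stored as attributes of the S4 object, so both routes agree.
same_size <- identical(attr(eg, "size"), eg@size)
same_id   <- identical(attr(eg, "id"), slot(eg, "id"))
attr_names <- names(attributes(eg))
```

Relying on attr() in real code would bypass validity checking, so this is for understanding the representation rather than a recommended access path.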

But probably you have a specific goal in mind, and this is too much
information...

Martin

> 
> Thanks, Patrick
> 

Re: [Rd] SEXPs and slots

2010-11-15 Thread Martin Morgan

On 11/15/2010 07:45 AM, Patrick Leyshock wrote:
> Very helpful, thank you.
> 
> A couple other questions, please:
> 
> 1.  I've got a function written in C, named "my_c_function".  In my R
> code I call this function, passing to it an INTSXP and a STRSXP,
> respectively:
> 
>result <- .Call("my_c_function", int_vector, str_vector)
> 
> The prototype of "my_c_function" is: 
> 
>SEXP my_c_function(SEXP int_vec, SEXP str_vec);
> 
> Within my_c_function I am able to extract the values within the integer
> vector, e.g. I can grab the first value with these lines of code:
> 
>int extracted_value;
>extracted_value = *INTEGER(int_vec);
> 
> What I cannot figure out how to do is extract the value from the
> STRSXP.  I'm assuming that I can create a pointer to a character array,
> then malloc enough memory to hold the value.  Is there an analogous
> operation on "INTEGER" for STRSXPs? 

STRING_ELT(str_vec, 0)

gets the 0th component of str_vec, which is a CHARSXP, i.e., an SEXP for
a character string. The char* can be retrieved with CHAR, so the usual
paradigm is

  const char *x = CHAR(STRING_ELT(str_vec, 0));

note the const-ness of the char* -- it's not mutable, because R is
managing char * memory.

The converse action, of assigning to an element, is

  SET_STRING_ELT(str_vec, 0, mkChar("foo"));

mkChar() is creating a copy (if necessary) of "foo", managing it, and
returning a CHARSXP. Working through protection (which will likely be
your next obstacle ;) in this last example is a good exercise.

There is a parallel operation VECTOR_ELT / SET_VECTOR_ELT for lists.

> 2.  Any good references/resources for developing R?  Nearly all the
> documents I've found are for programming R as a user, not as a
> developer.  I have copies of the documentation, which are very helpful,
> but it'd be helpful to have additional resources to fill in their gaps.

Chambers, 2008, Software for Data Analysis: Programming with R chapters
11 & 12,

Gentleman, 2008, R Programming for Bioinformatics chapter 6

might be helpful, but by the time they arrive you might find that you're
most of the way through the material covered...

I guess my opinion is that Rcpp would not be useful for understanding
R's C layer, whatever its merits for 'getting the job done'.

Martin

> 
> Thank you,
> 
> Patrick
> 
> 

Re: [Rd] Trying to understand the search path and namespaces

2010-11-15 Thread Martin Morgan

On 11/15/2010 04:56 PM, Hadley Wickham wrote:
>> Well, that's what I thought too.  But:
>>
>> parents <- function(x) {
>>  if (identical(x, emptyenv())) return()
>>  c(environmentName(x), parents(parent.env(x)))
>> }
>>> parents(as.environment("package:devtools"))
>> [1] "package:devtools" "package:methods"  "Autoloads" "base"
>>
>> And package:testthat isn't listed there.  (But Autoloads is suggestive...)
> 
> Hmmm, autoloads isn't it:
> 
>> parent.env(parent.env(as.environment("package:devtools")))
> 
> attr(,"name")
> [1] "Autoloads"
>> ls(parent.env(parent.env(as.environment("package:devtools"))))
> character(0)

Section 1.6 of Writing R Extensions says

Note that adding a name space to a package changes the search strategy.
The package name space comes first in the search, then the imports, then
the base name space and then the normal search path.

I'm not sure of the details, but I think

  parents(getNamespace("devtools"))

will give you what you want, with the gory details in loadNamespace and
makeNamespace
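
A sketch using a package that is always installed ('stats' is chosen here only so the example runs without devtools); the parents() helper is the one from the quoted message.

```r
parents <- function(x) {
    if (identical(x, emptyenv())) return(NULL)
    c(environmentName(x), parents(parent.env(x)))
}

## Walk up from the name space rather than from the package environment.
chain <- parents(getNamespace("stats"))

## Position of the base name space and of the global environment in the
## chain; the normal search path follows the global environment.
base_pos   <- which(chain == "base")[1]
global_pos <- which(chain == "R_GlobalEnv")[1]
```

The chain starts at the name space itself, passes through its imports and the base name space, and only then reaches the global environment and the attached packages.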

Martin


> 
> Hadley
> 



Re: [Rd] S4 setMethod, setGeneric and default arguments

2010-11-22 Thread Martin Morgan

On 11/22/2010 03:23 AM, evilphil wrote:
> 
> 
> 
> anyone?

Hi evilphil --

Your method signature doesn't have 'missing' for its third argument, and
hence isn't the target of dispatch when the generic is invoked with a
missing argument. I guess you'd figured that out and are really asking
whether it's consistent with the S4 design, and I think it is.

Why one might provide a default to an argument that is also dispatched
on seems like a design decision on your part. Maybe because missing
argument dispatch is a common (default) use case? Or to advertise via
the generic args what a typical value might be? But dispatch on multiple
arguments becomes complicated -- the number of combinations of possible
arguments is large, the 'next' method very difficult to reason about --
so might best be avoided if not necessary. The 'signature' argument of
setGeneric allows arguments to be included but not dispatched on.
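
A minimal sketch of that last point. The generic "rescale" and its argument 'by' are invented for illustration: 'by' has a default and is visible in the generic, but only 'x' takes part in dispatch.

```r
library(methods)

## Hypothetical generic: 'by' is advertised with a default value,
## but signature = "x" keeps it out of method dispatch.
setGeneric("rescale",
    function(x, by = 2, ...) standardGeneric("rescale"),
    signature = "x")

## The method is selected on the class of 'x' alone.
setMethod("rescale", "numeric", function(x, by = 2, ...) x * by)

default_result  <- rescale(c(1, 2, 3))            # 'by' missing
explicit_result <- rescale(c(1, 2, 3), by = 10)   # 'by' supplied
```

Because 'by' is not in the signature, no separate method for a "missing" third argument is needed.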

Martin

Re: [Rd] setGeneric for residuals, etc

2011-01-24 Thread Martin Morgan

On 01/24/2011 08:37 AM, Kasper Daniel Hansen wrote:
> Johann,
> 
> whether S4 is "better" than S3 is a heated subject.  No-one (I think)
> disputes that S4 is in some sense more flexible (for some suitable
> definition of flexible), but it does incur some performance overhead
> (how much is debatable) and some would argue that it also makes code
> more complicated and harder to debug.
> 
> But take a look at stats4.
> 
> Kasper
> 
> On Mon, Jan 24, 2011 at 11:01 AM, Johann Hibschman  
> wrote:
>> "Janko Thyson"  writes:
>>
>>>> I'm experimenting with a few model-fitting classes of my own.  I'm
>>>> leaning towards using S4 for my classes, but the R functions I'd want
>>>> to override (residuals, predict, etc.) are all S3 methods.
>>
>>> For example, inside your method for 'residuals()', you will
>>> probably just get some data out of a slot of your object and run the S3
>>> function 'residuals(your.slot.data)'. So there's nothing that should make
>>> you nervous in that respect, you're not overwriting anything with your
>>> method. Setting a generic for an existing function is just a necessary step
>>> in order to specify S4 methods for it.
>>
>> Yes, I understand that it's a necessary step in R, but I'm still puzzled
>> as to why it's necessary.  (And by "why", I don't mean the technical
>> point that 'residuals' is not an S4 generic function; I mean why isn't
>> it a S4 generic function already?)
>>
>> In principle, R could be shipped with all S3 functions replaced by S4
>> functions that default to the S3 implementation.  That would be a
>> benefit to everyone writing S4 objects.  The fact that it's not been
>> done seems to imply it would have a cost to people writing S3 objects,
>> so I'm trying to understand what that cost is.

As Kasper mentioned the current S4 implementation has costs in terms of
performance and usability that I suppose make it unappealing as a
'built-in' feature of R.

The current situation, where individual package developers promote a
function to an S4 generic, can lead to many issues. Some of these
represent bugs in the implementation of S4, e.g., incorrect dispatch
with complicated class hierarchies and package dependencies, that are
very challenging for the average developer (me, for instance) to fathom.
There are also more conceptual issues, e.g., two packages can each
create generics that are exported from separate package name spaces,
with the usual rules of the R search path determining which generic is
discovered. This is very confusing to the user, who has been told that
they are using a 'generic'. Documentation is also very confusing, as the
generic is documented in several different places.

These issues become increasingly important as package hierarchies
required for analysis become complicated; the average CRAN package has
relatively few dependencies and for many there will be no problem. The
class hierarchy and collection of packages for a typical Bioconductor
analysis can be quite large and complicated, leading to subtle problems.

Martin

>>
>> Perhaps I'm seeing an implied risk where I really should be seeing a
>> loose federation of developers with disparate interests, and the slow
>> pace of "global" change that implies.
>>
>> Thanks,
>> Johann
>>

Re: [Rd] Dealing with R list objects in C/C++

2011-01-26 Thread Martin Morgan

On 01/26/2011 02:56 PM, wayne.zh...@barclayscapital.com wrote:
> Hi,
> 
> I'd like to construct an R list object in C++, fill it with relevant data, 
> and pass it to an R function which will return a different list object back.  
> I have browsed through all the R manuals, and examples under tests/Embedding, 
> but can't figure out the correct way.  Below is my code snippet:
> 
> #include 
> // Rf_initEmbeddedR and other setups already performed
> 
> SEXP arg, ret;
> 
> // this actually creates a pairlist.  I can't find any API that creates a 
> list
> PROTECT(arg = allocList(3));

Allocate a list of length 3 via SEXPTYPE VECSXP

  PROTECT(arg = allocVector(VECSXP, 3));

> 
> // I want the first element to be type integer, second double, and third a 
> vector.
> INTEGER(arg)[0]  = 1;// <- runtime exception: "INTEGER() can 
> only be applied to a 'integer', not a 'pairlist'

set the first element of the list to an integer vector of length 1, and
assign a value

  SET_VECTOR_ELT(arg, 0, allocVector(INTSXP, 1));
  INTEGER(VECTOR_ELT(arg, 0))[0] = 1;

or more succinctly

  SET_VECTOR_ELT(arg, 0, ScalarInteger(1));

> REAL(arg)[1] = 2.5;   // control never reached here

and the second element

  SET_VECTOR_ELT(arg, 1, ScalarReal(2.5));

> VECTOR_PTR(arg)[2] = allocVector(REALSXP, 4);

and for the third allocate a REALSXP and then fill

  SET_VECTOR_ELT(arg, 2, allocVector(REALSXP, 4));

next lines should be ok as REAL(VECTOR_ELT(arg, 2))[0] = 10.0; or with
less typing as

  double *x = REAL(VECTOR_ELT(arg, 2));
  x[0] = 10.0; x[1] = 11.0; x[2] = 12.0; x[3] = 13.0;
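
Seen from the R side, the object assembled above is an ordinary list
(a VECSXP), which is a different type from the pairlist that allocList()
returns. A minimal sketch in R, assuming only base R; the variable name
is illustrative:

```r
# R-level equivalent of the VECSXP built in C above
arg <- list(1L, 2.5, c(10, 11, 12, 13))

typeof(arg)                  # "list", i.e. VECSXP
typeof(pairlist(1L, 2.5))    # "pairlist", i.e. LISTSXP (what allocList creates)

# Each element is itself a vector with its own type
sapply(arg, typeof)          # "integer" "double" "double"
```

This is why INTEGER(arg) fails: the container is not an integer vector,
only its first element is.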

> REAL(VECTOR_PTR(arg)[2])[0] = 10.0;
> REAL(VECTOR_PTR(arg)[2])[1] = 11.0;
> REAL(VECTOR_PTR(arg)[2])[2] = 12.0;
> REAL(VECTOR_PTR(arg)[2])[3] = 13.0;
> 
> PROTECT(call = lang2(install(entryPoint.c_str()), arg));

not sure where entryPoint.c_str() is coming from, but

 PROTECT(call = lang2(install("fun"), arg));

with some debate about whether install("fun") should be PROTECT'ed.

> 
> ret = R_tryEval(call, R_GlobalEnv, &errorOccurred);

likely PROTECT(ret = ...) while checking errorOccurred, etc.

Hope that helps,

Martin

> 
> 
> I'll be grateful if you can point me to any online docs/samples.
> 
> Thanks in advance,
> Wayne
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Dealing with R list objects in C/C++

2011-01-27 Thread Martin Morgan


On 1/27/2011 1:03 PM, wayne.zh...@barclayscapital.com wrote:

Many thanks for the quick reply Martin, your code works as expected.  Next I'd 
like to retrieve heterogeneous data from an SEXP object (let's just pretend 
it's the same type as the one what I'm constructing).  I'm sure the relevant 
APIs are defined in Rinternals.h, do we have API documentations for this header 
file somewhere?

Hi Wayne -- Your best bet might be sections 5 and 6 of

RShowDoc("R-exts")

or the books Dirk mentioned; see also Rdefines.h. Martin

@Dirk: thanks for your help too.  I'm doing something very simple at the 
moment, so I prefer not to bring in Rinside/Rcpp if possible.

Thanks again,
Wayne



--
Dr. Martin Morgan, PhD
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] help with S4 objects: trying to use a "link-glm" as a class in an object definition

2011-01-28 Thread Martin Morgan

On 01/27/2011 08:51 PM, Paul Bailey wrote:
> Hi,
> 
> I'm trying to make a new S4 object with a slot for a "link-glm" object. R 
> doesn't like me having a slot of class "link-glm":
> 
>> class(make.link("probit"))
> [1] "link-glm"

Tell the S4 system that you'd like to use this 'old' class

setOldClass("link-glm")

and things should be ok. Martin
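
A minimal sketch of that suggestion, reusing the class name "a" and the
slot name "item" from the post below; it assumes only the methods and
stats packages:

```r
library(methods)
library(stats)

# Register the S3 class "link-glm" so S4 definitions can refer to it
setOldClass("link-glm")

# Class and slot names follow the original post
setClass("a", representation(item = "link-glm"))

obj <- new("a", item = make.link("probit"))
inherits(obj@item, "link-glm")   # the slot holds the S3 object unchanged
obj@item$linkfun(0.5)            # probit link is qnorm(), so this is 0
```

With the class registered, neither the warning from setClass() nor the
validObject() error should appear.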

>> setClass("a",representation(item="link-glm"))
> [1] "a"
> Warning message:
> undefined slot classes in definition of "a": item(class "link-glm") 
>> fa <- function() {
> +   new("a",item=make.link("probit"))
> + }
>> fa()
> Error in validObject(.Object) : 
>   invalid class "a" object: undefined class for slot "item" ("link-glm")
> 
> # and a link-glm looks like a list to me, so I thought I would tell R it is a 
> list and see what happens
> 
>> setClass("b",representation(item="list"))
> [1] "b"
>> fb <- function() {
> +   new("b",item=make.link("probit"))
> + }
>> fb()
> Error in validObject(.Object) : 
>   invalid class "b" object: invalid object for slot "item" in class "b": got 
> class "link-glm", should be or extend class "list"
> 
> Any advice?
> 
> Regards,
> Paul Bailey
> Ph.D. candidate
> Department of Economics
> University of Maryland
> 
> ## raw code #
> setClass("a",representation(item="link-glm"))
> fa <- function() {
>   new("a",item=make.link("probit"))
> }
> fa()
> setClass("b",representation(item="list"))
> fb <- function() {
>   new("b",item=make.link("probit"))
> }
> fb()
> ###


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] S3 method for S4 object

2011-02-03 Thread Martin Morgan

On 02/03/2011 09:29 AM, Paul Gilbert wrote:
> I am trying to extend an S3 method to work with an S4 object as well
> as the S3 objects it works with, but UseMethod does not seem to
> recognize the S4 class and dispatches to the default method.  Is this
> to be expected or should I be looking for an error in my code?
> 
> If it is not an error in my code, is there an easy way to do this, or
> do I have to convert the generic to S4 and then make  those methods
> deal with the S3 objects?

Hi Paul

See ?Methods and the "Methods for S3 Generic Functions" section.
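
As a rough illustration of what that help section describes, the sketch
below defines an S3 method for an S4 class and also sets the same
function as an S4 method. The generic 'describe' and the class 'Track'
are made-up names, and the code is assumed to run at top level:

```r
library(methods)

# Hypothetical S4 class and S3 generic, for illustration only
setClass("Track", representation(pos = "numeric"))

describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default"

# S3 method named for the S4 class ...
describe.Track <- function(x, ...) "Track"

# ... and the same function set as an S4 method; this turns 'describe'
# into an S4 generic whose default is the original S3 generic
setMethod("describe", "Track", describe.Track)

describe(new("Track", pos = 1))   # uses the Track method
describe(1)                       # falls through to describe.default
```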

Martin

> 
> (Using R 2.12.1 on Ubuntu 10.10.)
> 
> Paul Gilbert 
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] S4 problems

2011-02-15 Thread Martin Morgan

On 02/14/2011 11:54 PM, robin hankin wrote:
> Hello everybody
> 
> [R-2.12.1]
> 
> I am having difficulty dealing with Oarray objects.
> I have a generic function, call it foo(), and I wish
> to define  a method for Oarray objects.
> 
> I do not have or want a method for regular arrays [actually,
> I want to coerce to an Oarray, and give a warning].
> 
> But setMethod() does not behave as desired, giving
> an error message when I try to define a method for
> Oarray objects.
> 
> Also, if I define a method for array objects, this does not
> give an error message, but neither does it behave as desired,
> as the method is not found when  passing an Oarray object
> to foo().
> 
> 
> LE110:~/packages% R --vanilla --quiet
>> library(Oarray)
>> setGeneric("foo",function(x){standardGeneric("foo")})
> [1] "foo"
>>  setMethod("foo","Oarray",function(x){x})
> in method for ‘foo’ with signature ‘"Oarray"’: no definition for class 
> "Oarray"
> [1] "foo"
>> setMethod("foo","array",function(x){x})
> [1] "foo"
>> a <- Oarray(0,2:3)
>> is.array(a)
> [1] TRUE
>> foo(a)
> Error in function (classes, fdef, mtable)  :
>   unable to find an inherited method for function "foo", for signature 
> "Oarray"
> 
> Three questions:
> 
> Why does the first call to setMethod() give an error message?

Oarray is an S3 class, so

setOldClass("Oarray")

before defining methods on the class.
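
A minimal sketch of this, which also covers the wish to coerce plain
arrays with a warning. It does not load the Oarray package: 'myOarray'
is a made-up S3 class standing in for it, and the coercion via
structure() is an assumption, not the package's own constructor.

```r
library(methods)

# Hypothetical S3 class standing in for "Oarray"
setOldClass("myOarray")

setGeneric("foo", function(x) standardGeneric("foo"))

# Method for the registered S3 class
setMethod("foo", "myOarray", function(x) "Oarray method")

# Method for plain arrays: warn, coerce, and dispatch again
setMethod("foo", "array", function(x) {
  warning("coercing array to 'myOarray'")
  foo(structure(x, class = "myOarray"))
})

a <- structure(array(0, c(2, 3, 2)), class = "myOarray")
res_a <- foo(a)
res_b <- suppressWarnings(foo(array(0, c(2, 3, 2))))
```

Both calls end up in the method registered for the S3 class.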

Hope that helps,

Martin

> Why does (a) not find the method defined for arrays, even though 'a'
> is an array?
> How can I make "foo(a)" behave as desired when 'a' is an object of
> class 'Oarray'?
> 
> 
> 
> 
> 
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] changes in recent R-devel revisions?

2011-03-01 Thread Martin Morgan

On 03/01/2011 03:19 PM, Benilton Carvalho wrote:
> Hi,
> 
> I have a BioC infra-structure package that works fine (I can build,
> check and load it successfully) on revision r53950. The very same
> package fails on r54591 with the error below:
> 
> Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source
> = keep.source) :
>   cyclic name space dependency detected when loading ‘oligoClasses’,
> already loading ‘oligoClasses’
> 
> I don't see anything obvious in the name space that would indicate
> cyclic dependency and I wonder:

For what it's worth, saying

> trace(stop, recover)

prior to library(oligoClasses) leads to

 7: loadNamespace(package, c(which.lib.loc, lib.loc), keep.source =
keep.source
 8: methods:::cacheMetaData(ns, TRUE, ns)
 9: getGeneric(f, FALSE, searchWhere, fpkg)
10: tryCatch(loadNamespace(package), error = function(e) e)

where 'package' is oligoClasses in lines 7 and 10, and the 'f' in 9 is
'relocateObject'. Line 10 is evaluated when methods:::.getGeneric
returns NULL.

In oligoClasses we have

oligoClasses/R> grep relocateObject *
AllGenerics.R:setGeneric("relocateObject", function(object, ...)
standardGeneric("relocateObject"))
methods-CNSet.R:relocateObject <- function(object, to){

which I guess is not as intended.

My guess is that setGeneric adds the generic to a cache of some sort
when the name space is created, but doesn't remove it when the generic
is overwritten by a plain function.
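
The overwriting itself is easy to reproduce outside a package. The
sketch below reuses the name from the grep output above with
placeholder bodies; it shows only that the later assignment replaces
the generic, and says nothing about the cache:

```r
library(methods)

setGeneric("relocateObject",
           function(object, ...) standardGeneric("relocateObject"))
was_generic <- is(relocateObject, "standardGeneric")

# A later plain assignment with the same name replaces the generic
relocateObject <- function(object, to) object
still_generic <- is(relocateObject, "standardGeneric")

c(was_generic, still_generic)
```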

No idea why this shows up in the current R revision.

Martin

> 
> a) if there were changes that were meant to affect this;
> 
> b) what is the recommended strategy to solve this issue.
> 
> Thank you very much for any suggestion,
> 
> benilton
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Problem with defining new method for residuals()

2011-03-10 Thread Martin Morgan

On 03/10/2011 08:21 AM, ONKELINX, Thierry wrote:
> Dear all,
> 
> I'm writing a package and I would like to reuse the residuals()
> function. When I use a function which calls the redefined residuals
> (for my custom class) I get an error (see below). It looks like the
> wrong method is used. The strange thing is that when I execute the
> code manually I get no error.

Hi Thierry --

I think this is a bug in the methods package.

Your package promotes stats::residuals to an S4 generic. Normally this
should work.

However, your package Depends: on lme4, which also promotes
stats::residuals to an S4 generic. Apparently, this interferes with your
own attempt to make a generic, even though you do not import lme4 into
your name space. As a consequence of the (putative) bug, your package
sees the original stats::residuals.

A work-around seems to be to importFrom(lme4, residuals) and do NOT
import residuals from stats, so that your own method is associated with
the generic defined in lme4.
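
Applied to the NAMESPACE quoted below, the work-around might look like
the following sketch. Only the handling of residuals comes from the
advice above; leaving resid with stats is an assumption.

```r
# NAMESPACE (sketch): take the residuals generic from lme4, not stats
importFrom(lme4, residuals)
importFrom(stats, resid, hclust, princomp)  # residuals deliberately omitted
exportPattern(".")
```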

Martin

> 
> Any suggestions?
> 
> Best regards,
> 
> Thierry
> 
> The entire source code is at
> svn://scm.r-forge.r-project.org/svnroot/aflp
> 
> The code with the error.
> 
>> normalise(dummy)
> Error in object$na.action : $ operator not defined for this S4 class
>> traceback()
> 5: naresid(object$na.action, object$residuals) at normalise.R#30
> 4: residuals.default(outliers(data)) at normalise.R#30
> 3: residuals(outliers(data)) at normalise.R#30
> 2: nrow(residuals(outliers(data))) at normalise.R#30
> 1: normalise(dummy)
> #This works fine
>> data <- dummy
>> nrow(residuals(outliers(data)))
> [1] 0
> 
> NAMESPACE
> importFrom(stats, residuals, resid, hclust, princomp)
> exportPattern(".")
> 
> METHODS
> setMethod("residuals", signature(object = "AFLP.outlier"),
>   function(object, ...){ object@Residual } )
> 
> setMethod("resid", signature(object = "AFLP.outlier"),
>   function(object, ...){ object@Residual } )
> 
> 
>
> 
ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek team Biometrie &
> Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium
> 
> Research Institute for Nature and Forest team Biometrics & Quality
> Assurance Gaverstraat 4 9500 Geraardsbergen Belgium
> 
> tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be
> 
> To call in the statistician after the experiment is done may be no
> more than asking him to perform a post-mortem examination: he may be
> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> 
> The plural of anecdote is not data. ~ Roger Brinner
> 
> The combination of some data and an aching desire for an answer does
> not ensure that a reasonable answer can be extracted from a given
> body of data. ~ John Tukey
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Using missing() in a S4 method with extra arguments

2011-03-11 Thread Martin Morgan

On 03/11/2011 02:07 AM, Andreas Borg wrote:
> Hi all,
> 
> I have a function which makes use of missing() to determine which
> arguments are provided in the call - basically, there are two sets of
> arguments that map to different strategies the function uses to fulfill
> its task. After conversion to an S4 generic I've run into the problem
> that if a method uses extra arguments that are not in the signature of
> the generic, usage of missing() fails. The following example exemplifies
> this:
> 
>    setGeneric("fun", function(x=0, y=0, ...) standardGeneric("fun"))
>    # both methods should output if the second argument is missing
>    setMethod("fun", "character", function(x=0, y=0, ...) missing(y))
>    setMethod("fun", "numeric", function(x=0, y=0, z=0, ...) missing(y))
> 
>    fun("a") # this works fine
>    fun(1) # this gives "FALSE"

Hi Andreas --

if you're testing for the missing-ness of y, and y is in the function
signature, then use that for dispatch

   setMethod(fun, c("character", "missing"),
             function(x=0, y=0, z=0, ...) "missing")
   setMethod(fun, c("character", "ANY"),
             function(x=0, y=0, z=0, ...) "not missing")

Since you're dispatching on x and y, it doesn't really make sense (to me
;) to assign default values to them. Testing for missing-ness of z would
I think have to rely on NA / NULL or other sentinel.
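
A self-contained sketch of dispatching on "missing", here for the
"numeric" case that returned FALSE in the original post. The generic is
redefined without defaults for x and y, as suggested above; the extra
argument z is kept to show that it does not interfere:

```r
library(methods)

setGeneric("fun", function(x, y, ...) standardGeneric("fun"))

# y absent from the call: the "missing" signature is selected
setMethod("fun", c("numeric", "missing"),
          function(x, y, z = 0, ...) "missing")

# y supplied with any value
setMethod("fun", c("numeric", "ANY"),
          function(x, y, z = 0, ...) "not missing")

fun(1)       # "missing"
fun(1, 2)    # "not missing"
```

The decision is made by method selection, so it no longer depends on
what missing() reports inside the rewritten .local function.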

Martin
> 
> I've understood so far that this is due to the fact that the "numeric"
> method in this example is rewritten to:
> 
>    function (x = 0, y = 0, ...)
>    {
>        .local <- function (x = 0, y = 0, z = 0, ...)
>            missing(y)
>        .local(x, y, ...)
>    }
> 
> The call to .local evaluates y and it is no more missing.
> 
> Is there any alternative that works in this case? Or is there a chance
> that missing() might be changed to work in this case in the near future?
> 
> Of course I know I could set NA or NULL as default values and check for
> these, but there are reasons I want to have legal default values for all
> arguments.
> 
> Best regards,
> 
> Andreas
> 
> Andreas Borg
> Medizinische Informatik
> 
> UNIVERSITÄTSMEDIZIN
> der Johannes Gutenberg-Universität
> Institut für Medizinische Biometrie, Epidemiologie und Informatik
> Obere Zahlbacher Straße 69, 55131 Mainz
> www.imbei.uni-mainz.de
> 
> Telefon +49 (0) 6131 175062
> E-Mail: b...@imbei.uni-mainz.de
> 
> Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
> Informationen. Wenn Sie nicht der
> richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben,
> informieren Sie bitte sofort den
> Absender und löschen Sie diese Mail. Das unerlaubte Kopieren sowie die
> unbefugte Weitergabe
> dieser Mail und der darin enthaltenen Informationen ist nicht gestattet.
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R_HOME path getting munged in inst/doc/Makefile on Windows

2011-03-21 Thread Martin Morgan


On 03/21/2011 07:22 PM, Simon Urbanek wrote:


On Mar 21, 2011, at 9:07 PM, Dan Tenenbaum wrote:


Hello,

I have come across two separate packages that have a Makefile in inst/doc
which use the R_HOME variable.

In both cases, the path to R_HOME gets munged in such a way that commands
that include R_HOME fail on Windows:

For example, one Makefile, for the xmapcore package
(https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/xmapcore/,
username/password: readonly), has this:

R=${R_HOME}/bin/R
SUITE=../cookbook/delia.R
[...]
${R} --vanilla --verbose < ${SUITE}

the output of trying to build this package includes:

* creating vignettes ... ERROR
E:\biocbld\BBS-2~1.8-B\R/bin/R --vanilla --verbose<  ../cookbook/delia.R
E:biocbldBBS-2~1.8-BR/bin/R: not found
make: *** [pdf] Error 127
Error in tools::buildVignettes(dir = ".") : running 'make' failed
Execution halted

It seems R_HOME is not getting resolved to a valid path. That's strange
because R CMD echo shows the right thing:

E:\sandbox>\biocbld\bbs-2.8-bioc\R\bin\R CMD echo %R_HOME%
e:/biocbld/bbs-2.8-bioc/R

That's a nice path with all forward slashes and no funny 8.3 paths with
tildes.  But it looks like when R_HOME is invoked in a Makefile, the
resulting path has a mix of forward and backslashes,


Nope, at least not in R from CRAN:

Makevars:
all:
echo R_HOME: $(R_HOME)

[...]
echo R_HOME: c:/PROGRA~1/R/R-212~1.2
R_HOME: c:/PROGRA~1/R/R-212~1.2

But I see that you have a custom rhome setting (BBS...), so chances are that 
is the culprit - the rhome for that R build is set incorrectly to contain 
backslashes.


Not sure about the Makefile, but see

https://stat.ethz.ch/pipermail/r-devel/2011-March/060260.html

also on my own machine

Z:\> R CMD INSTALL Biobase
* installing to library 'C:\Users\User\Documents/R/win-library/2.13'
* installing *source* package 'Biobase' ...
...

** testing if installed package can be loaded
Error: '\U' used without hex digits in character string starting "C:\U"
Execution halted
ERROR: loading failed
* removing 'C:\Users\User\Documents/R/win-library/2.13/Biobase'
* restoring previous 'C:\Users\User\Documents/R/win-library/2.13/Rsamtools'

whereas under 2.12

Z:\>R CMD INSTALL Biobase
* installing to library 'C:\Users\User\Documents/R/win-library/2.12'
* installing *source* package 'Biobase' ...

and everything is fine.

Z:\> R --version
R version 2.13.0 alpha (2011-03-17 r54849)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
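
The '\U' message is what the R parser reports when a Windows path with
single backslashes ends up inside R source text. A small sketch that
reproduces it with base R only; the path is a shortened form of the one
in the log above and the variable names are illustrative:

```r
# 'src' holds the R source text  lib <- "C:\Users\User"  (single backslashes)
src <- 'lib <- "C:\\Users\\User"'
res <- tryCatch(parse(text = src), error = function(e) conditionMessage(e))
is.character(res)   # TRUE: parsing failed and res holds the error message

# Converting to forward slashes before the path reaches R code avoids it
path <- "C:\\Users\\User\\Documents/R/win-library/2.13"
fixed <- gsub("\\", "/", path, fixed = TRUE)
fixed               # "C:/Users/User/Documents/R/win-library/2.13"
```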



Cheers,
Simon





and gets translated
into 8.3 style, and the resulting path is not valid for finding R
executables.

Note that R_HOME is defined within R; I don't also have it defined at the
shell level:

E:\sandbox>echo %R_HOME%
%R_HOME%

Any ideas?
Thanks,
Dan


sessionInfo()

R version 2.13.0 alpha (2011-03-18 r54865)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

[[alternative HTML version deleted]]


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] another import puzzle

2011-03-26 Thread Martin Morgan


On 03/26/2011 11:14 AM, Ben Bolker wrote:


   Dear list,

   I have another (again possibly boneheaded) puzzle about importing,
again encapsulated in a nearly trivial package.  (The package is posted
at <http://www.math.mcmaster.ca/bolker/misc/coefsumtest_0.001.tar.gz>.)

   The package consists (only) of the following S3 method definitions:

coeftab <- function(object, ...) UseMethod("coeftab",object)
coeftab.default <- function(object,...) {
   print(class(summary(object)))
   coef(summary(object))
}

   The NAMESPACE tries to pull in the necessary bits and pieces from lme4
to extract summaries and coefficients:

export("coeftab","coeftab.default")
importClassesFrom(lme4,"mer","summary.mer")
importMethodsFrom(lme4,"coef","summary","show","print")


It 'turns out' that base::summary is an S3 generic. Matrix creates an S4 
generic that is distinct from base::summary (e.g., so that the default 
behavior of summary isn't altered for packages that want to have nothing 
to do with Matrix). Dispatch needs to go through the generic. lme4 has 
methods on Matrix::summary, not on base::summary, so without the 
Matrix::summary generic your object never sees the summary method for 
lme4 objects.


So you need to list Matrix under Imports: in DESCRIPTION and add 
importFrom(Matrix, summary) to NAMESPACE.
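
In terms of the NAMESPACE quoted above, that amounts to one extra
directive. This is a sketch only; the remaining lines are taken from
the original post and the export lines are left out:

```r
# NAMESPACE (sketch); DESCRIPTION would also list Matrix under Imports:
export("coeftab", "coeftab.default")
importClassesFrom(lme4, "mer", "summary.mer")
importMethodsFrom(lme4, "coef", "summary", "show", "print")
importFrom(Matrix, summary)   # the S4 generic that lme4's methods attach to
```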

Martin Morgan


exportMethods("coef","summary","show","print")
exportClasses("mer","summary.mer")
S3method(coeftab,default)

   The package passes the routine parts of R CMD check.  The following
test shows that, with lme4 loaded, coef(summary([object of class
"mer"])) works in the global environment, but not in a function defined
inside the namespace of the package.

   The output ends with:


coeftab.default(gm1)

[1] "summaryDefault" "table"
Error in object$coefficients : $ operator is invalid for atomic vectors
Calls: coeftab.default ->  coef ->  coef ->  coef.default

   which indicates that inside the function, summary() is calling
summary.default instead of seeing the summary method for "mer" objects ...


   I have (re-re-re-)read the appropriate R-exts section, without luck,
and tried various minor variations (e.g. import()ing all of lme4,
changing the order of the directive, ...).

   Help ... ?

   sincerely
 Ben Bolker

=
test.R
=

library(coefsumtest)
library(lme4)

gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
  family = binomial, data = cbpp)

coef(summary(gm1)) ## works

f <- function(g) {
   coef(summary(g))
}
f(gm1)  ## works

coeftab.default(gm1) ##


--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R-2-13-alpha invalid subscript when checking for unstated dependencies

2011-03-27 Thread Martin Morgan


R version 2.13.0 alpha (2011-03-27 r55091)

This error occurs when R CMD check'ing a Bioconductor package:

* checking for unstated dependencies in R code ... WARNING
Error in e[keep] : invalid subscript type 'list'
Execution halted
See the information on DESCRIPTION files in the chapter 'Creating R
packages' of the 'Writing R Extensions' manual.

It is because the author has a sub-expression to 'require' that exceeds 
the width.cutoff=60L default argument of deparse, e.g.,


require(gsub("onereallyquitelongstring",
 "anotherreallyquitelongstring",
  variablename),
character.only=TRUE)

This results in a list rather than vector in (one of) tools::QC.R:4106 
or 4258.
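
The effect of width.cutoff can be seen directly with deparse(). A small
sketch using the sub-expression from the example above; collapsing with
paste() is shown as one common way to get a single string, not
necessarily what the tools code does:

```r
e <- quote(gsub("onereallyquitelongstring",
                "anotherreallyquitelongstring",
                variablename))

pieces <- deparse(e)      # default width.cutoff = 60L
length(pieces)            # more than one string for this call

# One string per expression: widen the cutoff and collapse
one <- paste(deparse(e, width.cutoff = 500L), collapse = " ")
length(one)
```

Code that assumes one string per deparsed expression, for example
inside sapply(), gets a list back as soon as one element has several
pieces.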

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Problem with dyn.load in R 2.13.0

2011-04-13 Thread Martin Morgan


On 04/13/2011 11:34 AM, Dirk Eddelbuettel wrote:


On 13 April 2011 at 13:00, Terry Therneau wrote:
| I have a test directory for the survival suite, and dyn.load has ceased
| to work in it.  Below shows the log:
|
| tmt1075% R --vanilla
|
| R version 2.12.2 (2011-02-25)
| Copyright (C) 2011 The R Foundation for Statistical Computing
| ISBN 3-900051-07-0
| Platform: x86_64-unknown-linux-gnu (64-bit)
[...]
|>  dyn.load('survival.so')
|>  q()
[...]
|
| tmt1076% R13 --vanilla
|
| R version 2.13.0 RC (2011-04-11 r55409)
| Copyright (C) 2011 The R Foundation for Statistical Computing
| ISBN 3-900051-07-0
| Platform: x86_64-unknown-linux-gnu (64-bit)
[...]
|>  dyn.load('survival.so')
| Error in dyn.load("survival.so") :
|   unable to load shared object
| '/people/biostat2/therneau/research/surv/Rtest/survival.so':
|   libR.so: cannot open shared object file: No such file or directory
|>  q()
|
| --
|
|  Is the issue that the .so file must have been created with the R2.13
| script?  That's not what the error message says, however.  It almost
| looks like it is ignoring my first argument and looking instead for
| "libR".

What does 'ldd /path/to/your/survival.so' say?  Does the system find libR.so?


Maybe R CMD ldd /path/to/your/survival.so to pick up whatever 
environment R configures.


Martin



I have no issues whatsoever on my Ubuntu box with the packages distributed
via CRAN (and now also a PPA if you want alpha/beta/rc builds) based on the
underlying Debian package I maintain.

I have a (pretty visible) hourly cronjob that drives our littler /usr/bin/r
frontend to do the CRANberries summaries---and even while /usr/bin/r was last
built under R 2.11.1, it continued to work merrily under R 2.12.0, 2.12.1,
2.12.2, and 2.13.0 prereleases:

edd@max:~$ r --version | head -4
r ('littler') version 0.1.3
 svn revision 178 as of 2010-01-05 20:57:41
 built at 20:08:17 on Oct 11 2010
 using GNU R Version 2.11.1 (2010-05-31)
edd@max:~$

edd@max:~$ R --version | head -1
R version 2.13.0 RC (2011-04-07 r55373)
edd@max:~$

Both use the same shared library version libR.so from whichever current
r-base-core is installed.

So I see no particular breakage.   Recompiling your project may also be a start.

Hth, Dirk




--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] DESCRIPTION file and Rd examples

2011-04-15 Thread Martin Morgan


On 04/15/2011 11:18 AM, Simon Urbanek wrote:


On Apr 14, 2011, at 11:00 PM, Dario Strbenac wrote:


I have a confusing error from R CMD check that I don't get when running the 
example manually by hand.

In the \examples section of an Rd file, I create a GRanges object, then I call 
a function with the GRanges object, whose first 2 lines are

require(GenomicRanges)


require() doesn't guarantee that the package will load, so I think what you 
meant to write was more like

if (require(GenomicRanges, quietly=TRUE)) {
  ...
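
A runnable sketch of that pattern. "notAnInstalledPackage" is a made-up
name used only to show the failure path, and stats stands in for the
real dependency:

```r
# require() warns and returns FALSE when the package cannot be loaded
ok <- suppressWarnings(
  require("notAnInstalledPackage", character.only = TRUE, quietly = TRUE)
)
ok

# Guard the dependent code on the return value
if (require("stats", character.only = TRUE, quietly = TRUE)) {
  med <- median(c(1, 3, 5))
} else {
  med <- NA_real_
}
med
```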


annoDF <- as.data.frame(anno) # anno is the GRanges object.

and that second line gives:

Error in as.data.frame.default(anno) :
  cannot coerce class 'structure("GRanges", package = "GenomicRanges")' into a 
data.frame
Calls: annoGR2DF ... annoGR2DF ->  .local ->  as.data.frame ->  
as.data.frame.default


Try IRanges::as.data.frame(anno)

I'm guessing that your call finds base::as.data.frame, perhaps because 
some earlier example has already require'd GenomicRanges, and that its 
definition of as.data.frame has been masked some time in between.


It's too complicated to debug in detail; R CMD check produces a file 
.Rcheck/-Ex.R that contains the compiled example code, and it 
is in the evaluation of this file that the error occurs. So you could 
dissect it to discover the gory details.
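
The masking described above can be reproduced without any Bioconductor packages. This is only a sketch: the masking function is deliberately defined in the global environment to stand in for whatever shadowed the intended definition in the real check run, and base:: stands in for the IRanges:: prefix suggested in the thread.

```r
# A plain list that base::as.data.frame can convert.
x <- list(id = 1:3, label = c("a", "b", "c"))

# Simulate masking: a later definition on the search path shadows the one we want.
as.data.frame <- function(x, ...) stop("masked definition was called")

# The unqualified call now reaches the masking definition and fails.
masked_error <- tryCatch(as.data.frame(x), error = function(e) conditionMessage(e))

# A namespace-qualified call ignores the search path and finds the intended function.
res <- base::as.data.frame(x)

# Remove the masking definition again.
rm(as.data.frame)
```

Qualifying the call with pkg:: makes the example independent of the order in which earlier examples attached packages.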


Martin



I have GenomicRanges listed in my Imports: field, and IRanges in the Suggests: 
of the DESCRIPTION file (it's require()d elsewhere). I'm trying to avoid 
putting packages in Depends: , so my package loads fast. Any tips of what I'm 
not understanding properly ?

Thanks.

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel







--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Package Name Not Found Warning

2011-04-19 Thread Martin Morgan


On 04/19/2011 05:00 PM, Dario Strbenac wrote:

Hello,

I've got a DESCRIPTION file with the first line:

Package: Repitools

But, when I run R CMD INSTALL Repitools I get:

* installing *source* package Repitools ...
** R
** data
** inst
** preparing package for lazy loading
Warning in FUN(X[[1L]], ...) :
   Created a package name, "2011-04-20 09:05:40", when none found


For what it's worth, this comes up when a class is being created in an 
environment that is not the global environment or does not have a 
variable .packageName, apparently added early in the name space creation 
process. You can mimic this with


  setClass("A", where=new.env())

or

  local({ setClass("A", where=environment()) })

Kind of doubt whether you've actually done something like that in your 
package, but maybe it twigs something...


Also, if you add

  trace(methods::getPackageName, quote(print(where)))

or

  trace(warning, quote(print(sys.calls())))

somewhere early in your package (the top of the first file to be 
collated) you'll get messages that might point to where things are going 
wrong.
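
A related way to find where a warning originates, sketched here as an alternative to trace(), is a calling handler that records the call stack. The functions inner() and outer() are hypothetical stand-ins for package code; the warning text is only borrowed from the report above.

```r
# Hypothetical functions standing in for package code that warns deep in a call chain.
inner <- function() warning("Created a package name when none found")
outer <- function() inner()

captured <- NULL
withCallingHandlers(
  outer(),
  warning = function(w) {
    # Record the call stack at the point the warning was signalled.
    captured <<- sys.calls()
    # Continue execution without printing the warning.
    invokeRestart("muffleWarning")
  }
)

# The recorded stack includes outer() and inner(), which locates the source.
print(captured)
```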


Hope that helps,

Martin


** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded

* DONE (Repitools)

It looks like it knows about the package name at the start and end of the 
process, but not in the middle of it.

Loading the package in an R session and looking at the sessionInfo shows the 
package name was properly processed. Is this a spurious warning ?

I'm using:
R version 2.13.0 (2011-04-13)
Platform: x86_64-unknown-linux-gnu (64-bit) (actually Ubuntu 10.10)

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Package Name Not Found Warning

2011-04-19 Thread Martin Morgan


On 04/19/2011 08:00 PM, Dario Strbenac wrote:

Ah, I think it's happening because I have

setOldClass(AffymetrixCelSet) in my package.


if AffymetrixCelSet is from aroma.affymetrix, then the line above results in

> setOldClass(AffymetrixCelSet)
Error in x[length(x):1L] : object of type 'closure' is not subsettable

maybe you meant

setOldClass("AffymetrixCelSet")

but if I put that in a package that either Depends: or not on 
aroma.affymetrix then I don't see the warning in your original report. 
I'm not sure you've diagnosed the problem correctly?
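
For reference, a minimal sketch of the quoted form in use. The class name "myS3" and the generic describe() are made up for illustration; nothing here depends on aroma.affymetrix.

```r
library(methods)

# Register an S3 class name (as a character string) so S4 dispatch can see it.
setOldClass("myS3")

# A hypothetical S4 generic with a method for the registered S3 class.
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "myS3", function(x) "an S3 object seen by S4 dispatch")

# An ordinary S3 object of that class.
obj <- structure(list(), class = "myS3")
out <- describe(obj)
```

Passing the bare symbol instead of the string makes setOldClass() try to treat the object itself as a vector of class names, which is consistent with the 'closure' error shown above.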


Martin



I guess I need to use the where argument. But how do I have this call outside 
any S4 functions, but without having to load aroma.affymetrix when my package 
loads ?


--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Make as.factor an S3 generic?

2011-04-20 Thread Martin Morgan

as.factor / as.ordered is not written as a generic. This differs from 
as.numeric, as.matrix, and other as.*. The following seems to address 
this and does not break make check-all.


FWIW, the patch is against r55563, because with r55564 I see

/home/mtmorgan/src/R-devel/src/main/dounzip.c:75:15: error: storage size 
of ‘dt’ isn’t known
/home/mtmorgan/src/R-devel/src/main/dounzip.c:88:5: warning: implicit 
declaration of function ‘mktime’

make[3]: *** [dounzip.o] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: Leaving directory `/home/mtmorgan/bin/R-devel/src/main'
make[2]: *** [R] Error 2
make[2]: Leaving directory `/home/mtmorgan/bin/R-devel/src/main'
make[1]: *** [R] Error 1
make[1]: Leaving directory `/home/mtmorgan/bin/R-devel/src'
make: *** [R] Error 1


Index: src/library/base/R/factor.R
===
--- src/library/base/R/factor.R (revision 55563)
+++ src/library/base/R/factor.R (working copy)
@@ -45,7 +45,9 @@
 }

 is.factor <- function(x) inherits(x, "factor")
-as.factor <- function(x) if (is.factor(x)) x else factor(x)
+as.factor.default <- function(x, ...)
+if (is.factor(x)) x else factor(x, ...)
+as.factor <- function(x, ...) UseMethod("as.factor")

 ## Help old S users:
 category <- function(...) .Defunct()
@@ -245,7 +247,10 @@
 ordered <- function(x, ...) factor(x, ..., ordered=TRUE)

 is.ordered <- function(x) inherits(x, "ordered")
-as.ordered <- function(x) if(is.ordered(x)) x else ordered(x)
+as.ordered.default <- function(x, ...)
+if(is.ordered(x)) x else ordered(x, ...)
+as.ordered <- function(x, ...)
+UseMethod("as.ordered")

 Ops.ordered <- function (e1, e2)
 {
Index: src/library/base/man/factor.Rd
===
--- src/library/base/man/factor.Rd  (revision 55563)
+++ src/library/base/man/factor.Rd  (working copy)
@@ -10,7 +10,9 @@
 \alias{is.factor}
 \alias{is.ordered}
 \alias{as.factor}
+\alias{as.factor.default}
 \alias{as.ordered}
+\alias{as.ordered.default}
 \alias{is.na<-.factor}
 \alias{Math.factor}
 \alias{Ops.factor}
@@ -40,8 +42,8 @@
 is.factor(x)
 is.ordered(x)

-as.factor(x)
-as.ordered(x)
+as.factor(x, \dots)
+as.ordered(x, \dots)

 addNA(x, ifany=FALSE)
 }
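
The pattern used in the patch can be tried outside of base with a differently named function. This is a sketch only: as_factor2 and the "temperature" class are invented names, chosen so that nothing in base is overwritten.

```r
# Generic plus default method, mirroring the structure of the patch above.
as_factor2 <- function(x, ...) UseMethod("as_factor2")
as_factor2.default <- function(x, ...) if (is.factor(x)) x else factor(x, ...)

# A class-specific method becomes possible once the function is generic.
as_factor2.temperature <- function(x, ...) {
  factor(ifelse(unclass(x) > 20, "warm", "cold"), levels = c("cold", "warm"))
}

f1 <- as_factor2(c("b", "a", "b"))                              # default method
f2 <- as_factor2(structure(c(15, 25), class = "temperature"))   # class method
```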

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Make as.factor an S3 generic?

2011-04-20 Thread Martin Morgan

On 04/20/2011 10:13 AM, William Dunlap wrote:

-Original Message-
From: r-devel-boun...@r-project.org
[mailto:r-devel-boun...@r-project.org] On Behalf Of Martin Morgan
Sent: Wednesday, April 20, 2011 9:56 AM
To: R-devel@r-project.org
Subject: [Rd] Make as.factor an S3 generic?

as.factor / as.ordered is not written as a generic. This differs from
as.numeric, as.matrix, and other as.*. The following seems to address
this and does not break make check-all.

Why did you decide to make as.factor() generic instead of factor()?

Hi Bill --

short-sighted consistency with other as.*; implied simplicity of 
coercion rather than construction.

Martin

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Make as.factor an S3 generic?

2011-04-20 Thread Martin Morgan


On 04/20/2011 11:38 AM, Prof Brian Ripley wrote:

Well, lots of functions are not generic. We do ask you to give a case
for such changes ... where is it?


The specific need started with base::lapply, which calls base::as.list. 
An S4 method "as.list,A-method" defined in a name space isn't seen by 
base::as.list, whereas as.list.A is (see discussion in ?Methods around 
/f3.myClass). The question is from this post on a Bioconductor mailing list


https://stat.ethz.ch/pipermail/bioc-sig-sequencing/2011-April/001979.html

[partly answering Bill's question here] A list() constructor could be 
tricky to implement (dealing with variable numbers of arguments and the 
S4 rules for dispatch on ...), whereas as.list.A is trivial (slot 
extraction, in my case). Having arrived at an easy solution, I marched 
through the other coercion functions with only minor set-backs 
(as.double.A instead of as.numeric.A) until factor.
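
A small sketch of the as.list.A approach described above, run at top level rather than inside a package name space. The slot name elts is an assumption made for the example.

```r
library(methods)

# An S4 class whose payload is a plain list held in one slot.
setClass("A", representation(elts = "list"))

# S3 method for the S4 class: coercion is just slot extraction.
as.list.A <- function(x, ...) x@elts

a <- new("A", elts = list(1, 2, 3))

# base::lapply calls base::as.list, which dispatches to as.list.A.
res <- lapply(a, function(v) v * 2)
```

In a package the S3 method would also need to be registered in the NAMESPACE file so that base::as.list can find it.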


Martin



On Wed, 20 Apr 2011, Martin Morgan wrote:


as.factor / as.ordered is not written as a generic. This differs from
as.numeric, as.matrix, and other as.*. The following seems to address
this and does not break make check-all.

FWIW, the patch is against r55563, because with r55564 I see


OS-specific 


/home/mtmorgan/src/R-devel/src/main/dounzip.c:75:15: error: storage
size of ‘dt’ isn’t known
/home/mtmorgan/src/R-devel/src/main/dounzip.c:88:5: warning: implicit
declaration of function ‘mktime’
make[3]: *** [dounzip.o] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: Leaving directory `/home/mtmorgan/bin/R-devel/src/main'
make[2]: *** [R] Error 2
make[2]: Leaving directory `/home/mtmorgan/bin/R-devel/src/main'
make[1]: *** [R] Error 1
make[1]: Leaving directory `/home/mtmorgan/bin/R-devel/src'
make: *** [R] Error 1


Index: src/library/base/R/factor.R
===
--- src/library/base/R/factor.R (revision 55563)
+++ src/library/base/R/factor.R (working copy)
@@ -45,7 +45,9 @@
}

is.factor <- function(x) inherits(x, "factor")
-as.factor <- function(x) if (is.factor(x)) x else factor(x)
+as.factor.default <- function(x, ...)
+ if (is.factor(x)) x else factor(x, ...)
+as.factor <- function(x, ...) UseMethod("as.factor")

## Help old S users:
category <- function(...) .Defunct()
@@ -245,7 +247,10 @@
ordered <- function(x, ...) factor(x, ..., ordered=TRUE)

is.ordered <- function(x) inherits(x, "ordered")
-as.ordered <- function(x) if(is.ordered(x)) x else ordered(x)
+as.ordered.default <- function(x, ...)
+ if(is.ordered(x)) x else ordered(x, ...)
+as.ordered <- function(x, ...)
+ UseMethod("as.ordered")

Ops.ordered <- function (e1, e2)
{
Index: src/library/base/man/factor.Rd
===
--- src/library/base/man/factor.Rd (revision 55563)
+++ src/library/base/man/factor.Rd (working copy)
@@ -10,7 +10,9 @@
\alias{is.factor}
\alias{is.ordered}
\alias{as.factor}
+\alias{as.factor.default}
\alias{as.ordered}
+\alias{as.ordered.default}
\alias{is.na<-.factor}
\alias{Math.factor}
\alias{Ops.factor}
@@ -40,8 +42,8 @@
is.factor(x)
is.ordered(x)

-as.factor(x)
-as.ordered(x)
+as.factor(x, \dots)
+as.ordered(x, \dots)

addNA(x, ifany=FALSE)
}

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R CMD check and Suggests Packages

2011-04-29 Thread Martin Morgan


On 04/29/2011 07:36 AM, Simon Urbanek wrote:


On Apr 29, 2011, at 9:52 AM, Steve Lianoglou wrote:


Hi,

On Fri, Apr 29, 2011 at 5:29 AM, Prof Brian Ripley
  wrote:

On Fri, 29 Apr 2011, Dario Strbenac wrote:


Hello,

In my description file, I have an example data package in Suggests: that
I've deleted from my library to test what the user who doesn't have it will
experience.

However, R CMD check won't even pass my package :

* checking package dependencies ... ERROR
Package required but not available: RepitoolsExamples

Why would it have to be installed, if it's only a data package, that isn't
needed in any of my code ? The manual also says "In particular, large
packages providing “only” data for examples or vignettes should be listed in
‘Suggests’ rather than ‘Depends’ in order to make lean installations
possible."


Why suggest a package that 'isn't needed in any of my code'?

I suspect that is a lie, and some of your code does use it: if so, it is
needed to fully check the package.   There is an option to 'R CMD check' to
enable the check to go ahead without all the dependencies, so please do
re-read the manuals to find it.


Here's a stab in the dark:

Perhaps the OP has code in some \examples{} section for some help
(*.Rd) file that then tries to load data from the "suggested" package?



But that is a valid use of Suggests: as long as it is guarded against that 
package not being present.

The point here is that the default is to check those packages yet Dario doesn't 
like it and thus should turn it off if he feels so inclined. (That's my 
interpretation without knowing the package).

Cheers,
Simon




The code in the \examples{} sections of *.Rd files are run during R
CMD check ... so, if you're trying to load data from a suggested
package that may not be installed, perhaps you can wrap those code
blocks with \dontrun{}.



Not really - that defeats the purpose - it will never be checked in that case!


Maybe also useful to distinguish between package check and INSTALL, 
where during the former the Suggests: package must be present (modulo 
setting additional flags) but in the latter may not be. The Bioconductor 
build system, for instance, would install RepitoolsExamples  before 
running R CMD check Repitools. A user might successfully install 
Repitools without installing RepitoolsExamples.


"if (require(RepitoolsExamples))" in an example on a man page is much 
preferred to \dontrun{}, as Simon mentions.
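
A sketch of such a guard as it might appear in example code. It uses requireNamespace() rather than require(), which is an assumption about preference, not something stated in the thread; the package name is the one from the report and is probably not installed where this is run.

```r
pkg <- "RepitoolsExamples"   # package named in the thread; may well be absent

if (requireNamespace(pkg, quietly = TRUE)) {
  # Code that needs the suggested package goes here; it runs during
  # R CMD check on machines that have the package installed.
  message("running the example that needs ", pkg)
  ran <- TRUE
} else {
  # Users without the suggested package get a message instead of an error.
  message("skipping example: ", pkg, " is not installed")
  ran <- FALSE
}
```

Unlike \dontrun{}, the guarded code is still exercised whenever the suggested package is available.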


Martin Morgan






For more info: this is covered in the "Writing R Extensions," but is
also described here:
http://stackoverflow.com/questions/1454211/what-does-not-run-mean-in-r-help-pages

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel







--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Windows Rzlib.dll gzopen and friends

2011-04-29 Thread Martin Morgan

Several Bioconductor packages were expecting Windows Rzlib.dll to 
provide gzopen / gzread / gzseek / gzgets / gzrewind / gzclose. Are 
these gone for good, viz., r55624 ?


Martin
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Reference Classes: Accessing methods via [[...]], bug?

2011-05-01 Thread Martin Morgan


On 05/01/2011 03:09 PM, John Chambers wrote:

Yes, as presented on that site it makes a little more sense:

"While experimenting with the new reference classes in R I noticed some
odd behaviour if you use the "[[ ]]" notation for methods
(X[["doSomething"]] instead of X$doSomething). This notation works for
fields, but I initially thought it wouldn't work for methods until I
found that if you execute "class(X$doSomething)" you can then use "[[
]]" afterwards. The simple example below illustrates the point."

For reference classes, "[[" is not meant to be used either for fields or
methods. That it "works" at all is an artifact of the implementation
using environments. And arguably the failure to throw an error in that
circumstance is a bug.

Please use the API as described in the ?ReferenceClasses documentation.
These are encapsulated methods, in the usual terminology, with the
operator "$" playing the role normally assigned to "." in other languages.

A separate but related issue: It is possible to define S4 methods for
reference classes, as discussed in a previous thread, arguably also an
artifact in that a reference class is implemented as an S4 class of the
same name. These are functional methods, associated with a generic
function, and so outside the encapsulation paradigm.

It would be interesting to get some experience and opinions on whether
this is a good idea or not. It breaks encapsulation, in that the
behavior of the class can no longer be inferred from the class
definition alone. On the other hand, it is convenient and relates to
"operator overloading" in some other languages.


I have written 'show' methods for reference classes (is there another 
way to pretty-print them?) and S4 methods that dispatch to reference 
methods (in particular, yield(x) on connection-like classes dispatching 
to x$yield()). The latter partly to provide end-user familiarity 
(limiting need for the beleaguered user to have to learn yet another 
syntax for invoking methods, though maybe hiding hints about important 
differences in object behavior -- I am dreading the introductory class 
where one tries to explain S3, S4, and reference classes), and partly to 
provide a distinction between a 'developer' API and a user API (again 
with questionable merits).
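
A minimal sketch of both ideas, with an invented "Counter" class: a show() reference method for printing, and an S4 generic yield() that forwards to the reference method of the same name.

```r
library(methods)

# Hypothetical reference class with a field, a mutating method and a show method.
Counter <- setRefClass("Counter",
  fields = list(n = "numeric"),
  methods = list(
    yield = function() {
      n <<- n + 1   # modifies the object in place
      n
    },
    show = function() cat("<Counter at", n, ">\n")
  ))

# User-facing S4 generic that dispatches to the reference method.
setGeneric("yield", function(x, ...) standardGeneric("yield"))
setMethod("yield", "Counter", function(x, ...) x$yield())

x <- Counter$new(n = 0)
first  <- yield(x)
second <- yield(x)
x   # printed via the show() reference method
```

The functional call syntax hides the fact that x changes between calls, which is the trade-off mentioned above.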


Martin



John

On 4/30/11 7:54 PM, Hadley Wickham wrote:

If this message is garbled for anyone else, the original question on
stackoverflow is here:
http://stackoverflow.com/questions/5841339/using-notation-for-reference-class-methods


Hadley

On Sat, Apr 30, 2011 at 11:35 AM, Chad
Goymer wrote:


I've been trying to use methods for reference classes via the
notation "[[...]]" (X[["doSomething"]] rather than X$doSomething),
but it failed to work. However, I did find that if you use the usual
"$" notation first, "[[...]]" can be used afterwards. The following
simple example illustrates the point:

> setRefClass("Number",
+   fields = list(
+     value = "numeric"
+   ),
+   methods = list(
+     addOne = function() {
+       value <<- value + 1
+     }
+   )
+ )
> X <- new("Number", value = 1)
> X[["value"]]
[1] 1
> X[["addOne"]]()
Error: attempt to apply non-function
> class(X[["addOne"]]) # NULL
[1] "NULL"
> class(X$addOne)
[1] "refMethodDef"
attr(,"package")
[1] "methods"
> X[["addOne"]]()
> X[["value"]]
[1] 2
> class(X[["addOne"]])
[1] "refMethodDef"
attr(,"package")
[1] "methods"

Is this a bug?
Chad Goymer


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel










--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Windows Rzlib.dll gzopen and friends

2011-05-06 Thread Martin Morgan

This change has significant consequences for Windows packages using R's 
zlib, including packages providing core Bioconductor functionality. Are 
the changes in r55624 meant to be long-term?


Martin

On 04/29/2011 03:13 PM, Martin Morgan wrote:

Several Bioconductor packages were expecting Windows Rzlib.dll to
provide gzopen / gzread / gzseek / gzgets / gzrewind / gzclose. Are
these gone for good, viz., r55624 ?

Martin



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Devel version of R CMD build may create packages with incomplete doc/ directories

2011-05-08 Thread Martin Morgan

An .Rnw file in the vignettes/ directory might reasonably \usepackage or 
\include .sty, .bib, or other (e.g., image) files in the same directory 
or a sub-directory. The .Rnw file is copied to inst/doc and hence 
installed, but the additional files are not. This means that the 
installed .Rnw files are not useful. What can the package author do to 
ensure that the installed .Rnw file is fully functional? Or is it more 
appropriate to add only Stangled and pdf versions of .Rnw to the 
installed doc/ directory?


R version 2.14.0 Under development (unstable) (2011-05-07 r55798)
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Windows Rzlib.dll gzopen and friends

2011-05-17 Thread Martin Morgan


R developers,

If the absence of a more complete zlib is a permanent change, then is 
the best strategy to build and distribute a package that wraps a 
functional zlib? This seems like it will create problems with symbol 
resolution at the least. Including a zlib in each package that uses it 
also seems like a poor choice. What other strategies are recommended?


Martin

On 05/06/2011 09:58 AM, Martin Morgan wrote:

This change has significant consequences for Windows packages using R's
zlib, including packages providing core Bioconductor functionality. Are
the changes in r55624 meant to be long-term?

Martin

On 04/29/2011 03:13 PM, Martin Morgan wrote:

Several Bioconductor packages were expecting Windows Rzlib.dll to
provide gzopen / gzread / gzseek / gzgets / gzrewind / gzclose. Are
these gone for good, viz., r55624 ?

Martin






--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Reference Classes/S4 Classes: can method dispatch check superclasses BEFORE resorting to method for "ANY"?

2011-05-27 Thread Martin Morgan

On 05/27/2011 06:13 AM, Janko Thyson wrote:

Dear list,

is it possible that method dispatch checks for superclasses/virtual
classes before checking "ANY"?

I'd like to build a generic initialization method for all my Reference
Class (say "MyDataFrame") objects by having them inherit from class, say
"MyRefClassVirtual" (which would have to be a virtual S4 class; there
are no virtual Reference Classes, are there?)

Reference classes can be virtual; 'initialize' is a reference method, 
not an S4 method.

.A <- setRefClass("A", contains="VIRTUAL",
methods=list(
  initialize=function(..., msg="initialize,AA") {
  callSuper(...)
  message(msg)
  .self
  }))
.AA <- setRefClass("AA", contains="A")

> .A$new()
Error in methods::new(def, ...) :
  trying to generate an object from a virtual class ("A")
> .AA$new()
initialize,AA
An object of class "AA"

Martin

The problem is that 'getRefClass("MyDataFrame")$new' calls (I think) the
method that was written for "ANY". Thus even though I write an explicit
initialize method for class "MyRefClassVirtual" which should be called
for "MyDataFrame" as it inherits from this class, this method will never
be called because "ANY beats anything else".

So, I think I'd like to tell the method somehow to check for
superclass/virtual classes *before* resorting to "ANY".

Is that possible?

Regards,
Janko

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Query super- and subclasses of a class: is there a better way than to use 'completeClassDefinition()'

2011-05-31 Thread Martin Morgan


On 05/30/2011 08:54 AM, Janko Thyson wrote:

Dear List,

when I first started to use S4 classes, I used the function
'completeClassDefinition()' in order to see the super- and subclasses of a
certain class:


Hi Janko -- I think 'complete' is meant as an adverb here; what you 
might want is names(getClassDef("A")@subclasses) (see 
slotNames(class(getClassDef("A"))) for other useful info).
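
A short sketch of that suggestion, using the same two classes as in the question. It assumes both classes are defined in the same session so that the cached definition of "A" already knows about "B".

```r
library(methods)

setClass("A", representation(a = "numeric"))
setClass("B", contains = "A", representation(b = "character"))

# Subclasses recorded on the definition of the parent class.
subs <- names(getClassDef("A")@subclasses)

# Superclasses recorded on the definition of the child class.
supers <- names(getClassDef("B")@contains)

# Other slots of a class definition that may be worth inspecting.
slotNames(class(getClassDef("A")))
```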


Martin



setClass(Class="A", representation=list(a="numeric"))
setClass(Class="B", contains="A", representation=list(b="character"))
# Super
x<- completeClassDefinition("B")
attributes(x)
names(x@contains)
# Sub
x<- completeClassDefinition("A")
attributes(x)
names(x@subclasses)

This also does the trick for Reference Classes for me. However, I
re-read the respective section on the help page and wondered if I should
be more careful about using this function for this purpose as the page says:
"completeClassDefinition: Completes the definition of Class,
relative to the class definitions visible from environment where. If
doExtends is TRUE, complete the super- and sub-class information.
This function is called when a class is defined or re-defined."

So here are my questions:
1) Is it safe to call 'completeClassDefinition()' explicitly or can anything be
"overwritten" by this?
2) Is there a better way to query the super-/subclasses of a certain
S4/Reference Class?

Best regards and I'd like to take this opportunity to express my
gratitude to everyone on this list who takes the time to provide such
great help!
Janko




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] S4 Slot assignment within function

2011-06-04 Thread Martin Morgan


On 06/03/2011 02:03 PM, mcguirebc wrote:

Is there a simple way to assign values to S4 slots from within a function?

Doing this doesn't work:


assign_slot<-function(x){


assign("OBJECT@slot",x,envir=parent.env(environment()))

}


assign_slot(x)


All I get from this is a new object with the name OBJECT@slot, the slot
assignment of OBJECT doesn't change.

I have thought about solutions such as eval(parse()) to pull this off, but
would prefer not to ugly up the code.

Thoughts??

I have searched rather thoroughly, but it is possible I overlooked
something. If I did, apologies.


Maybe you're looking for ?ReferenceClasses rather than S4 classes?
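
To make the difference concrete, here is a sketch with two invented classes, "Box" (S4) and "RBox" (reference class). With S4 the function has to return the modified copy and the caller reassigns it; with a reference class the change made inside the function is visible to the caller.

```r
library(methods)

# S4: copy semantics. The function returns a modified copy.
setClass("Box", representation(value = "numeric"))
set_value <- function(obj, v) {
  obj@value <- v
  obj
}
b <- new("Box", value = 1)
b <- set_value(b, 2)          # caller must reassign

# Reference class: the object is modified in place.
RBox <- setRefClass("RBox", fields = list(value = "numeric"))
set_value_ref <- function(obj, v) {
  obj$value <- v
  invisible(NULL)
}
r <- RBox$new(value = 1)
set_value_ref(r, 2)           # no reassignment needed
```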

Martin



-Brian







--
View this message in context: 
http://r.789695.n4.nabble.com/S4-Slot-assignment-within-function-tp3572077p3572077.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] S4 class, passing argument names to function, modify original

2011-06-04 Thread Martin Morgan


On 06/04/2011 03:07 AM, soeren.vo...@uzh.ch wrote:

Hello, an S4 class "Foo" is defined with a setter, $. For several reasons, the 
setter calls a function, .foo.update(). However, passing the argument names of the 
setter on to that function does not work. Question 1: Why not and how can I fix this? Question 2: What is the 
way to define either the function or the setter to modify the original object (not 
returning the modified copy of it and overwriting the original by assignment)? Thanks, Sören

setClass("Foo",
   representation(
 N = "numeric"
   ),
   prototype(
 N = 1
   )
)

.foo.update<- function(object, ...) {
   args<- list(...)
   for (i in slotNames("Foo")[pmatch(names(args), slotNames("Foo"), 
nomatch=0)]) {
 slot(object, i)<- args[[i]]
 # indeed more to do here
 return(object)
   }
}


Since names(args) is 'name', and 'name' is not a slot of 'Foo', the 
return of pmatch is 0 and .foo.update returns NULL. Put return(object) 
outside the for loop.



setReplaceMethod("$", "Foo",
   function(x, name, value) {
 x<- .foo.update(x, name=value)


here your intention is for name=value to be substituted with N=99, but 
you end up with name=99. You could arrange to parse this correctly, but 
this isn't usually what you _want_ to do and I don't really understand 
what you're trying to accomplish. Maybe


.foo.update <- function(object, name, value, ...)
{
slot(object, name) <- value
## some other stuff
object
}
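
Put together with the class from the question, the suggestion might look like the following sketch. It still uses copy semantics: the replacement method returns the modified object and R assigns it back to x.

```r
library(methods)

setClass("Foo", representation(N = "numeric"), prototype(N = 1))

# Helper taking the slot name as a character string, as suggested above.
.foo.update <- function(object, name, value, ...) {
  slot(object, name) <- value
  ## some other stuff could go here
  object
}

# Setter: 'name' arrives as the character string "N" for x$N <- 99.
setReplaceMethod("$", "Foo",
  function(x, name, value) .foo.update(x, name, value))

x <- new("Foo")
x$N <- 99
x@N
```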

Hope that helps a bit.

Martin


 x
   }
)

x<- new("Foo")
x
x$N<- 99
x # NULL

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
