[Rd] Misuse of get(getOption("device"))() in packages

2008-04-26 Thread Prof Brian Ripley
Quite a few packages have used this construct to launch a device, but it 
has several flaws.  It's not clear in most cases that a package really 
needs to launch a new device (R will do so if needed), but 2.7.0 provides 
a function dev.new() to do so.  (If you really need this in a GPL-ed 
package that must run in R < 2.7.0, consider copying dev.new.)

You cannot assume any arguments for the default device, and lines like

   get(getOption("device"))(width=8,height=8)

may throw an error or give you an 8x8 pixel device.  Even worse is

   get(getOption("device"))(8, 8)

as for e.g. X11() and quartz() the first argument is not a dimension, and

   get(getOption("device"))(width=3.55,height=3.55,rescale="fixed")

applies to just one device (and this user at least does not want to get 
his magnifying glass out to examine plots forced into 6% of his screen 
area).

It is better to adapt your plots to the user's environment than to attempt 
to adapt his/her environment to your plots.  (dev.size(), also new in 
2.7.0, may help you do so.)


The more important flaws are:

1) As from R 2.5.0, options("device") can be a character string or a 
function.  If the latter, this construct does not work.

2) ?options carefully describes the search path for the device if 
specified by name.  get() in a package will not do the same thing, and is 
more likely to pick up another object of the same name.

3) What most of these packages do is to open another device when one is 
already open.  If there is no screen device, this can be disastrous as 
e.g. there may already be a pdf() device open that is writing to 
Rplots.pdf, and  get(getOption("device"))() will start another pdf() device 
writing to Rplots.pdf: the result is almost certainly a corrupt PDF file.

The intention is that in R 2.8.0 the out-of-the-box default devices will 
all be functions, so point 1 will apply.  (dev.new will be more 
sophisticated there.)  Many of the packages concerned are now showing 
ERROR in the 'r-devel Linux x86_64' section of

http://cran.r-project.org/web/checks/check_summary.html


-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug in R 2.7 for over long lines (crasher+proposed fix!) (PR#11281)

2008-04-26 Thread Peter Dalgaard
[EMAIL PROTECTED] wrote:
> OK, I am just sending it here too as it looks like r-devel@r-project.org
> is not the right place:
>   
I think it was seen there too, just that no one got around to replying. In 
R-bugs, there's a filing system so that it won't be completely forgotten...

However, your mail seems to have been encoded in quoted-printable; you 
might want to follow up with a cleaned version. (Just keep the 
(PR#11281) in the header.)
> On Fri, 2008-04-25 at 08:48 +0200, Soeren Sonnenburg wrote:
>> While trying to fix swig & R2.7 I actually discovered that there is a
>> bug in R 2.7 causing a crash (so R & swig might actually work):
>>
>> the bug is in ./src/main/gram.c  line 3038:
>>
>>             } else { /* over-long line */
>> fixthis -->     char *LongLine = (char *) malloc(nc);
>>                 if(!LongLine)
>>                     error(_("unable to allocate space for source line %d"), xxlineno);
>>                 strncpy(LongLine, (char *)p0, nc);
>> bug -->         LongLine[nc] = '\0';
>>                 SET_STRING_ELT(source, lines++,
>>                                mkChar2((char *)LongLine));
>>                 free(LongLine);
>>
>> note that LongLine is only nc chars long, so the LongLine[nc]='\0' might
>> be an out of bounds write. the fix would be to do
>>
>>     char *LongLine = (char *) malloc(nc+1);
>>
>> in line 3034
>>
>> Please fix and thanks to dirk for the debian r-base-dbg package!
>
> Looking at the code again there seems to be another bug above this for
> the MAXLINESIZE test too:
>
>          if (*p == '\n' || p == end - 1) {
>              nc = p - p0;
>              if (*p != '\n')
>                  nc++;
>              if (nc <= MAXLINESIZE) {
>                  strncpy((char *)SourceLine, (char *)p0, nc);
> bug2 -->         SourceLine[nc] = '\0';
>                  SET_STRING_ELT(source, lines++,
>                                 mkChar2((char *)SourceLine));
>              } else { /* over-long line */
>                  char *LongLine = (char *) malloc(nc+1);
>                  if(!LongLine)
>                      error(_("unable to allocate space for source line %d"),
>                            xxlineno);
> bug1 -->         strncpy(LongLine, (char *)p0, nc);
>                  LongLine[nc] = '\0';
>                  SET_STRING_ELT(source, lines++,
>                                 mkChar2((char *)LongLine));
>                  free(LongLine);
>              }
>              p0 = p + 1;
>          }
>
>
> So I guess the test would be for nc < MAXLINESIZE above or to change
> SourceLine to have MAXLINESIZE+1 size.
>
> Alternatively as the strncpy manpage suggests do this for all
> occurrences of strncpy
>
>     strncpy(buf, str, n);
>     if (n > 0)
>         buf[n - 1]= '\0';
>
> this could even be made a macro / helper function ...
>
> And another update: This does fix the R+swig crasher for me (tested)!
>
> Soeren
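
For readers following the off-by-one in the quoted report, here is a minimal
sketch of the allocation pattern being proposed: reserve nc + 1 bytes so that
the terminator written at index nc stays inside the buffer. copy_line() is a
hypothetical helper written for this note, not code from gram.c, and it uses
memcpy() rather than strncpy() on the assumption that the number of bytes to
copy is already known.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of R's sources): copy exactly n bytes of
 * src into a freshly allocated buffer of n + 1 bytes and terminate it.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *copy_line(const char *src, size_t n)
{
    assert(src != NULL);

    char *buf = malloc(n + 1);   /* one extra byte for the terminator */
    if (buf == NULL)
        return NULL;             /* caller reports the allocation failure */

    memcpy(buf, src, n);         /* src need not be NUL-terminated */
    buf[n] = '\0';               /* index n is valid: buffer holds n + 1 bytes */
    return buf;
}
```

Note that the manpage idiom quoted above, buf[n - 1] = '\0', assumes n is the
size of the destination buffer. Applied with n equal to the number of
characters copied, it would overwrite the last character of the line instead
of appending a terminator.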


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [Rd] returning vectors of unknown size from C (with .C)

2008-04-26 Thread Ramon Diaz-Uriarte
On Sat, Apr 26, 2008 at 3:19 AM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> Ramon Diaz-Uriarte wrote:
>
> > Dear All,
> >
> > In a package, I am using ".C" to call some C functions. In one case,
> > the number of elements of the return vectors is not known in R before
> > the C call. (Two of the vectors are integers, the third is a vector of
> > character strings.)
> >
> > Passing from R a vector of the maximum possible size would be a huge
> > waste. I understand one alternative is to use ".Call", but I'd rather
> > avoid it if I can (all of the code seems to work except for the return
> > of values into R). Another would be to write to a file from C and then
> > read that into R, but this looks very ugly. Are there any other
> > reasonable alternatives, or should I just use .Call?
> >
> >
>
>  .Call is usually easiest, but another possibility is to have two entry
> points:  one to calculate how much space you need, a second to pass in a
> vector that's the right size to hold the result.
>


You mean making two successive calls to the C code? The problem is
that the size of the result is not known until the result is obtained
(in my C code, the underlying structure is a linked list that gets
stretched as needed as the computation proceeds). So I would not know
"where to leave the result from C" in between the two calls to C.

Best,

R.




>  Duncan Murdoch
>
> > Thanks,
> >
> > R.
> >
> >
> >
> >
> >
>
>



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz



Re: [Rd] returning vectors of unknown size from C (with .C)

2008-04-26 Thread Prof Brian Ripley
On Sat, 26 Apr 2008, Ramon Diaz-Uriarte wrote:

> On Sat, Apr 26, 2008 at 3:19 AM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
>> Ramon Diaz-Uriarte wrote:
>>
>>> Dear All,
>>>
>>> In a package, I am using ".C" to call some C functions. In one case,
>>> the number of elements of the return vectors is not known in R before
>>> the C call. (Two of the vectors are integers, the third is a vector of
>>> character strings.)
>>>
>>> Passing from R a vector of the maximum possible size would be a huge
>>> waste. I understand one alternative is to use ".Call", but I'd rather
>>> avoid it if I can (all of the code seems to work except for the return
>>> of values into R). Another would be to write to a file from C and then
>>> read that into R, but this looks very ugly. Are there any other
>>> reasonable alternatives, or should I just use .Call?
>>>
>>>
>>
>>  .Call is usually easiest, but another possibility is to have two entry
>> points:  one to calculate how much space you need, a second to pass in a
>> vector that's the right size to hold the result.
>>
>
>
> You mean making two successive calls to the C code? The problem is
> that the size of the result is not known until the result is obtained
> (in my C code, the underlying structure is a linked list that gets
> stretched as needed as the computation proceeds). So I would not know
> "where to leave the result from C" in between the two calls to C.

But that is possible (you malloc the memory for a local copy in the rist 
call), and rpart does something like it.
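
A minimal sketch of this two-call idea, under stated assumptions:
demo_compute() and demo_fetch() are hypothetical names, the "computation" is
a stand-in that collects multiples of 3, and a growable array replaces the
linked list for brevity. It is not a description of how rpart is written.
Both functions take only pointer arguments and return void, as .C expects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Result kept between the two calls (file scope, so both entry points see it). */
static int *stash = NULL;
static int stash_len = 0;

/* First entry point: run the computation, keep a malloc'ed copy of the
 * result, and report its length through n_out (-1 on allocation failure). */
void demo_compute(int *limit, int *n_out)
{
    free(stash);                      /* drop any result left from an earlier call */
    stash = NULL;
    stash_len = 0;

    int cap = 4, len = 0;
    int *buf = malloc(cap * sizeof *buf);
    if (buf == NULL) { *n_out = -1; return; }

    /* Stand-in computation: the result size is only known once it finishes. */
    for (int i = 1; i <= *limit; i++) {
        if (i % 3 != 0)
            continue;
        if (len == cap) {             /* grow the buffer as needed */
            int *tmp = realloc(buf, 2 * cap * sizeof *buf);
            if (tmp == NULL) { free(buf); *n_out = -1; return; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = i;
    }

    stash = buf;
    stash_len = len;
    *n_out = len;
}

/* Second entry point: copy the stashed result into a caller-supplied vector
 * of the length reported by demo_compute(), then release the local copy. */
void demo_fetch(int *out)
{
    assert(stash != NULL);            /* demo_compute() must have succeeded first */
    memcpy(out, stash, stash_len * sizeof *out);
    free(stash);
    stash = NULL;
    stash_len = 0;
}
```

From R, one would presumably call the first entry point, allocate an integer
vector of the reported length, and pass that vector to the second entry
point. The file-scope stash makes the pair non-reentrant, which is the price
of staying with .C in this sketch.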


-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] returning vectors of unknown size from C (with .C)

2008-04-26 Thread Ramon Diaz-Uriarte
On Sat, Apr 26, 2008 at 11:38 AM, Prof Brian Ripley
<[EMAIL PROTECTED]> wrote:
> On Sat, 26 Apr 2008, Ramon Diaz-Uriarte wrote:
>
>
> > On Sat, Apr 26, 2008 at 3:19 AM, Duncan Murdoch <[EMAIL PROTECTED]>
> wrote:
> >
> > > Ramon Diaz-Uriarte wrote:
> > >
> > >
> > > > Dear All,
> > > >
> > > > In a package, I am using ".C" to call some C functions. In one case,
> > > > the number of elements of the return vectors is not known in R before
> > > > the C call. (Two of the vectors are integers, the third is a vector of
> > > > character strings.)
> > > >
> > > > Passing from R a vector of the maximum possible size would be a huge
> > > > waste. I understand one alternative is to use ".Call", but I'd rather
> > > > avoid it if I can (all of the code seems to work except for the return
> > > > of values into R). Another would be to write to a file from C and then
> > > > read that into R, but this looks very ugly. Are there any other
> > > > reasonable alternatives, or should I just use .Call?
> > > >
> > > >
> > > >
> > >
> > >  .Call is usually easiest, but another possibility is to have two entry
> > > points:  one to calculate how much space you need, a second to pass in a
> > > vector that's the right size to hold the result.
> > >
> > >
> >
> >
> > You mean making two successive calls to the C code? The problem is
> > that the size of the result is not known until the result is obtained
> > (in my C code, the underlying structure is a linked list that gets
> > stretched as needed as the computation proceeds). So I would not know
> > "where to leave the result from C" in between the two calls to C.
> >
>
>  But that is possible (you malloc the memory for a local copy in the rist
> call), and rpart does something like it.
>

Aha, thanks, I didn't know it was doable (or easy). I'll look at the
rpart code. One further question, though, what is "the rist call"?

Thanks,

R.


>
>  --
>  Brian D. Ripley,                  [EMAIL PROTECTED]
>  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>  University of Oxford,             Tel:  +44 1865 272861 (self)
>  1 South Parks Road,                     +44 1865 272866 (PA)
>  Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>



-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz



[Rd] median methods

2008-04-26 Thread Rob Hyndman
Can we please have a ... argument in median() to make it possible to pass
arguments to specific methods.


_
Rob J Hyndman
Professor of Statistics, Monash University
Editor-in-Chief, International Journal of Forecasting
http://www.robhyndman.info/

[[alternative HTML version deleted]]



Re: [Rd] returning vectors of unknown size from C (with .C)

2008-04-26 Thread David Henderson
Hola Ramon!

>>  But that is possible (you malloc the memory for a local copy in the rist
>> call), and rpart does something like it.
>> 
> 
> Aha, thanks, I didn't know it was doable (or easy). I'll look at the
> rpart code. One further question, though, what is "the rist call"?

I think Brian meant to type "first call".

Thanks!!

Dave H
-- 
David Henderson, Ph.D.
1535 NW 51st ST
Seattle, WA 98107
206-794-8552
[EMAIL PROTECTED]



Re: [Rd] median methods

2008-04-26 Thread Prof Brian Ripley
On Sat, 26 Apr 2008, Rob Hyndman wrote:

> Can we please have a ... argument in median() to make it possible to pass
> arguments to specific methods.

Not without a reasoned case -- see 'Writing R Extensions' as to why it is 
a non-trivial change that affects all existing methods (and there are 
some) -- also S4 setMethod("median") calls (which there are too).

There is also an argument as to where the ... should come -- probably

function (x, ..., na.rm = FALSE)

(but that could break some existing calls), and why should na.rm be on the 
generic as well (it is not for mean nor quantile, for example)?

So there have to be some pretty compelling arguments in favour.


> _
> Rob J Hyndman
> Professor of Statistics, Monash University
> Editor-in-Chief, International Journal of Forecasting
> http://www.robhyndman.info/
>
>   [[alternative HTML version deleted]]

Hmm, we do explicitly ask you not to send that.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] RFC: What should ?foo do?

2008-04-26 Thread Duncan Murdoch
On 25/04/2008 2:47 PM, Prof Brian Ripley wrote:
> On Fri, 25 Apr 2008, Deepayan Sarkar wrote:
> 
>> For what it's worth, I use ?foo mostly to look up usage of functions
>> that I know I want to use, and find it perfect for that (one benefit
>> over help() is that completion works for ?). The only thing I miss is
>> the ability to do the equivalent of help("foo", package = "bar");
>> ?bar::foo gives the help page for "::". Perhaps that would be
>> something to consider for addition.
> 
> That fits most naturally with the (somewhat technical) idea that bar::foo 
> becomes a symbol and not a function call.  I believe that several of us think 
> that is in principle a better idea, but no one has as yet (AFAIK) explored 
> the ramifications.
> 
> However, 5 mins looking at the sources suggests that it is easy to do.


And you already did.  Thanks!

I'm going to make the following change soon (in R-devel).

??foo

will now be like help.search("foo").  This will work with your change, 
so ??utils::foo will limit the search to the utils package.  This is 
also quite easy.  A more difficult thing I'd like to do is to broaden 
the search to look outside the man pages, but that's a lot harder, and I 
haven't started on it.

I will also follow Hadley's suggestion and change the format of the 
help.search results, so you can just cut and paste after a question mark 
to look up the particular topic, e.g.  ??foo gives

utils::citEntry Writing Package CITATION Files

Type '?PKG::FOO' to inspect entry 'PKG::FOO TITLE'.

I haven't touched the case of ?foo failing; I'll want to try it for a 
while to decide whether I like it best as is:

 > ?foo
No documentation for 'foo' in specified packages and libraries:
you could try '??foo'

or whether it should just automatically call help.search, or something 
in between.

Duncan Murdoch
