[Rd] Postscript can be very slow in R 2.13-0

2011-04-14 Thread Stephen Milborrow

Viewing certain postscript files under R 2.13-0 is very slow, at least using
GSview on my 64 bit Windows 7 system.  To see this, compare how long it 
takes to display these two files (fast.ps was created by removing the /srgb 
stuff in .ps.prolog):


   www.milbo.org/postscript/slow.ps

   www.milbo.org/postscript/fast.ps

The code to produce the above two files is here:

   www.milbo.org/postscript/slow-postscript.R

This is not exactly a bug, but is it a known issue?  The hack used to
produce fast.ps above is fragile and not suitable for regular use.  Is there
a option to the postscript() function (or could one be added) that would
sacrifice sRGB support for speed?

Stephen Milborrow
www.milbo.users.sonic.net

Version info:

I'm using GSview 4.9 and Ghostscript 8.64.


R.version

  _
platform   x86_64-pc-mingw32
arch   x86_64
os mingw32
system x86_64, mingw32
status
major  2
minor  13.0
year   2011
month  04
day13
svn rev55427
language   R
version.string R version 2.13.0 (2011-04-13)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Guidelines for S3 regression models

2015-06-30 Thread Stephen Milborrow

Given how much documentation is available on R coding in general, it is
surprising how little is available specifically on writing model code.
Researchers who come up with a new method of regression, and who want to
write an S3 model for that method, must currently go all the way back to the 
Venables and Ripley S programming book.



On 26.06.2015 14:09, Stephen Milborrow wrote:
> Once we have built a regression model, we typically want to use the
> model for further processing, such as making predictions from the model
> or plotting the residuals.  Unfortunately, for many packages on CRAN
> this can be difficult.
>
> For example, some models don't have a residuals method and don't save
> the call or data --- so you can't tell how to generate the residuals
> from the model object itself.
>
> A common snag is that for some models the new data for predict() has to
> be a matrix; for others it has to be a data.frame.  This places an
> unnecessary burden on the user when both data.frames and matrices can
> easily be supported by predict.
>
> To mitigate such issues, I'm going out on a limb and presenting some
> guidelines for writers of S3 regression model functions (this document
> is currently part of the plotmo package):
> http://www.milbo.org/doc/modguide.pdf

On 26.06.2015 16:41, Achim Zeileis wrote:
I think this is a nice and useful starting point. It's probably not
comprehensive (yet) but will surely help.

You could add something more about writing the formula interface and the
correct processing of model.frame, terms, model.response, model.matrix,
model.weights, model.offset. Especially for models with linear predictors
the latter two can be very useful and are often not hard to implement. In
case the model has multiple parts or multiple responses, the "Formula"
package (and its vignette) might also be helpful.

As for the S3 methods, I would omit coefficients, fitted.values, and resid
from the list. These dispatch to coef, fitted, and residuals anyway. For
inference it would also be very useful to add nobs(), df.residual(),
vcov(), and logLik() and/or deviance() where applicable. An overview which
lists some (but not all) useful methods is in Table 1 of
vignette("betareg", package = "betareg").

For coef() and vcov() it is useful/important that the names and dimension
match. Then Wald tests can be easily computed in functions like
car::linearHypothesis(), car::deltaMethod(), lmtest::waldtest(), or
lmtest::coeftest().


Thanks for these, I'll update the document.

Stephen Milborrow

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] legitimate use of :::

2013-08-23 Thread Stephen Milborrow

To avoid the NOTEs (which often triggers a 'pls fix' upon submission to
CRAN), I simply copied/pasted these functions to my package, but this
seems wasteful.


An issue is how one acknowledges the author of the cut and pasted code.

Assume that for one reason or another the original function can't easily be
made available via NAMESPACEs and so to avoid CRAN submission complaints you
are forced into cut and copying.  If the copied code is somewhat substantial
a short acknowledgment on a man page is insufficient.  As far as I know the
only alternative to list the author of the copied code in the Author field
of your DESCRIPTION file.  This makes the Author field ungainly (especially
if the copied-from package has itself a lengthy Author field) and gives
disproportionate credit to the author of the copied code over those in
Depends.

It would be good if there was a better mechanism in the DESCRIPTION file for
acknowledging authors you copied code from.  Tricky,  lots of grey areas.
On balance I think I'm in the camp of those who say ::: should allowed.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Modifying R_CheckStack for a speed increase

2007-08-29 Thread Stephen Milborrow
Greetings R developers,

R will run a little faster when executing "pure R" code if the function
R_CheckStack() is modified.

With the modification, the following code for example runs 15% faster
(compared to a virgin R-2.5.1 on my Windows XP machine):

  N = 1e7
  foo <- function(x)
  {
   for (i in 1:N)
x <- x + 1
  x
  }
  foo(0)

The crux of the modification is to change the following line in 
R_CheckStack()

  if(R_CStackLimit != -1 && usage > 0.95 * R_CStackLimit) {...

to

  if(usage > R_CStackLen) { ...

Details and modified sources can be found at
ftp://ftp.sonic.net/pub/users/milbo.

Regards,
Stephen

http://milbo.users.sonic.net

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Just-in-time compiler for R

2008-01-16 Thread Stephen Milborrow
Greetings R developers!

I have been working on a just-in-time compiler for R arithmetic expressions.
An example:


jit(1)   # turn on just-in-time compilation
for(i in 1:na)# example convolution code
for(j in 1:nb)
  ab[i + j] <- ab[i + j] + a[i] * b[j]


The loop will run about 30% faster. That's not much of a speedup, but the
code is still in early development and the figure will get much better.

If you are interested there is more information at
www.milbo.users.sonic.net/ra.




Stephen Milborrow
www.milbo.users.sonic.net

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gctorture and proc.time (PR#10600)

2008-01-22 Thread Stephen Milborrow
I'm not sure if this is connected but in R2.6.1 do_proctime is missing some
PROTECTs. The current code is

 SEXP ans = allocVector(REALSXP, 5), nm = allocVector(STRSXP, 5);

and should be

SEXP ans, nm;
PROTECT(ans = allocVector(REALSXP, 5));
PROTECT(nm = allocVector(STRSXP, 5));

A good way of finding these missing PROTECTs is to write say 0xEE to memory
as it is freed (i.e. one must modify the source of memory.c). Then with gc
torture enabled it more obvious if  someone uses a just freed SEXP.  There
is a similar issue in a few other places which I haven't tracked down yet
which I will submit to r-bugs in due course.  These were found using the
0xEE trick and running make check-all.

Steve
www.milbo.users.sonic.net

- Original Message - 
From: "Peter Dalgaard" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, January 22, 2008 9:56 AM
Subject: Re: [Rd] gctorture and proc.time (PR#10600)


[EMAIL PROTECTED] wrote:
> In R version 2.6.1 (2007-11-26)
> and R version 2.6.1 Patched (2008-01-19 r44061)
> on openSUSE 10.2 (X86-64)
>
>
>> gctorture()
>> proc.time()
>>
> Error: protect(): protection stack overflow
>
> The problem with this is that then
>
> R CMD check --use-gct foo
>
> ALWAYS FAILS with
>
>
>> cat("Time elapsed: ", proc.time() - get("ptime", pos =
>> 'CheckExEnv'),"\n")
>>
> Error in proc.time() - get("ptime", pos = "CheckExEnv") :
>   non-numeric argument to binary operator
>
> This does not happen in R version 2.4.1 (2006-12-18)
>
> I was going to have my computing class try out --use-gct.
> I guess not until this is fixed.
>
>
I can reproduce this on SUSE 10.2 64 bit and Fedora 7 64 bit, but not on
SUSE 10.3 32 bit and Fedora 8 32 bit. (The OS versions are likely not
relevant, I bet it is a 64 bit issue somewhere).


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__ NOD32 2813 (20080122) Information __

This message was checked by NOD32 antivirus system.
http://www.eset.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Just-in-time compiler for R

2008-04-04 Thread Stephen Milborrow
There is a new version of the just-in-time compiler for R at
www.milbo.users.sonic.net/ra/index.html

With just-in-time compilation enabled, the convolution example from the
"Extending R" manual now runs about 30 times faster. The web page has more 
information.

Stephen Milborrow
www.milbo.users.sonic.net

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] HOW TO AVOID LOOPS

2008-04-14 Thread Stephen Milborrow

Le sam. 12 avr. à 12:47, carlos martinez a écrit :
Looking for a simple, effective a minimum execution time solution.

For a vector as:

c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)

To transform it to the following vector without using any loops:

(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)


Here is a fast solution using the Ra just-in-time compiler
www.milbo.users.sonic.net/ra.

jit(1)
if (length(x) > 1)
   for (i in 2:length(x))
   if (x[i])
   x[i] <- x[i-1] + 1

The times in seconds for various solutions mailed to r-devel are listed
below. There is some variation between runs and with the contents of x. The
times shown are for

set.seed(1066);  x <- as.double(runif(1e6) > .5)

This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R 2.6.2).
The code to generate these results is attached.

vin 24
greg   11
had3.9
dan1.4
dan2  1.4
jit   0.25# code is shown above, 7 secs with standard R 2.6.2>

Stephen Milborrow
www.milbo.users.sonic.net

# cm-post.R: compare solutions to the following post to
#r-devel from carlos martinez 12 apr 2008:
# Looking for a simple, effective a minimum execution time solution.
# For a vector as:
# c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
# To transform it to the following vector without using any loops:
# c(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)

set.seed(1066) # for reproducibility
N <- 1e6
x <- as.double(runif(N) > .5)
x[1] <- 0  # seems to be needed for fhad (and fvin?)

fvin <- function(x) {
   ind <- which(x == 0)
   unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, length(x))),
 function(y) cumsum(x[y])))
}
fdan <- function(x) {
d <- diff(c(0,x,0))
starts <- which(d == 1)
ends <- which(d == -1)
x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
x
}
fdan2 <- function(x) {
   runs <- rle(x)
   runlengths <- runs$lengths[runs$values == 1]
   x[x == 1] <- unlist(lapply(runlengths, function(n) 1:n))
   x
}
fhad <- function(x)
   unlist(lapply(split(x, cumsum(x == 0)), seq_along)) - 1

# following requires "ra" for fast times www.milbo.users.sonic.net/ra
library(jit)
fjit <- function(x) { 
   jit(1)

   if (length(x) > 1)
   for (i in 2:length(x))
   if (x[i])
   x[i] <- x[i-1] + 1
   x
}
fgreg <- function(x)
  Reduce( function(x,y) x*y + y, x, accumulate=TRUE )

fanon <- function(x)
x * unlist(lapply(rle(x)$lengths, seq))

cat("times with N =", N, "\n")
cat("dan",  system.time(ydan  <- fdan(x))[3],  "\n")
cat("dan2", system.time(ydan2 <- fdan2(x))[3], "\n")
cat("had",  system.time(yhad  <- fhad(x))[3],  "\n")
cat("vin",  system.time(yvin  <- fvin(x))[3],  "\n")
cat("jit",  system.time(yjit  <- fjit(x))[3],  "\n")
cat("greg", system.time(ygreg <- fgreg(x))[3], "\n")
# very slow cat("anon", system.time(yanon <- fanon(x))[3], "\n")

stopifnot(identical(ydan2, ydan))
stopifnot(identical(as.numeric(yhad), ydan))
stopifnot(identical(yvin, ydan))
stopifnot(identical(yjit, ydan))
stopifnot(identical(ygreg, ydan))
# stopifnot(identical(yanon, ydan))
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 0 ^ NaN == Inf, why?

2008-10-25 Thread Stephen Milborrow
In R, 0 ^ NaN yields Inf.  I would have expected NaN or perhaps 0. Is this 
behaviour intended?

>sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
States.1252;LC_MONETARY=English_United 
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Steve Milborrow
www.milbo.users.sonic.net

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] 0 ^ NaN == Inf, why?

2008-10-26 Thread Stephen Milborrow
It may be too late to make such changes now, but would it not be simpler if
R ^ and R pow (and related functions like log) when the return type is
double simply call the standard C library routine pow?  That way we would be
compatible with IEEE 754, with the fair assumption that the C library's pow
is compatible with IEEE 754 (this I believe is a fair assumption in 2008,
may not have been historically).  Having special cases in the code for x==0
etc. is unnecessarily complicated, and returning NA in some cases and NaN in
others introduces further complication. But I may be missing something.

I think this is basically what you are saying with "Other things being
equal, `^` should follow the C pow() function for numeric arguments."

When I was building my just-in-time compiler for R, I looked into some
discrepancies between R and IEEE 754.  My tests were not exhaustive because
the jit compiler in many cases uses the same C math routines as standard R,
so I didn't bother to check those for compatibility. But if you are
interested, the results are summarized in tables in the jit man page
www.milbo.users.sonic.net/ra/jit.html.

I would be prepared to sign up for doing a full test of incompatiblities 
between R math and IEEE 754, if people think time spent doing that is worth 
it.

Steve
www.milbo.users.sonic.net

- Original Message - 
From: "John Chambers" <[EMAIL PROTECTED]>
To: "Stephen Milborrow" <[EMAIL PROTECTED]>
Cc: 
Sent: Saturday, October 25, 2008 7:55 PM
Subject: Re: [Rd] 0 ^ NaN == Inf, why?


A small PS:

John Chambers wrote:
>
> Along the line, notice that both R_pow and pow give 0^0 as 1.  (Just at
> a guess, C might give 0^-0 as Inf, but I don't know how to test that in
> R.)
>
I tried a little harder, and apparently the guess is wrong.  It seems
that pow(0, -0) is 1 in C.   Would seem better to either have pow(0,0)
and pow(0,-0) both be NaN or else 1 and Inf, but ...
> John


----- Original Message - 
From: "John Chambers" <[EMAIL PROTECTED]>
To: "Stephen Milborrow" <[EMAIL PROTECTED]>
Cc: 
Subject: Re: [Rd] 0 ^ NaN == Inf, why?

Stephen Milborrow wrote:
In R, 0 ^ NaN yields Inf.  I would have expected NaN or perhaps 0. Is this
behaviour intended?

Well, it certainly follows from the implementation.  In the R_pow C routine,

double R_pow(double x, double y) /* = x ^ y */
{
if(x == 1. || y == 0.)
  return(1.);
if(x == 0.) {
  if(y > 0.) return(0.);
  /* y < 0 */ return(R_PosInf);
}

It does seem, however, that NaN is the logical result here, which I think
results from changing the implementation to:

if(x == 0.) {
  if(y > 0.) return(0.);
  else if(y < 0) return(R_PosInf);
  else return(y); /* NA or NaN, we assert */
}

Other things being equal, `^` should follow the C pow() function for numeric
arguments.  After writing pow() as an R function that calls this C function:

> pow(0,NaN)
[1] NaN
> pow(0,NA)
[1] NA
> pow(0,0)
[1] 1

The second example presumably falls out because the C function returns its
second argument if that is a NaN (which a numeric NA is, in C but not in R).
The modified code in R_pow mimics that behavior.

Along the line, notice that both R_pow and pow give 0^0 as 1.  (Just at a
guess, C might give 0^-0 as Inf, but I don't know how to test that in R.)

Other opinions?

John

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel