[Rd] Strange code in `?`

2009-11-03 Thread Philippe Grosjean

Hello,

In R 2.10, looking at:

> `?`
function (e1, e2)
{
    if (missing(e2)) {
        type <- NULL
        topicExpr <- substitute(e1)
    }
    else {
        type <- substitute(e1)
        topicExpr <- substitute(e2)
    }
    if (is.call(topicExpr) && topicExpr[[1L]] == "?") {
        search <- TRUE
        topicExpr <- topicExpr[[2L]]
        if (is.call(topicExpr) && topicExpr[[1L]] == "?" &&
            is.call(topicExpr[[2L]]) && topicExpr[[2L]][[1L]] == "?") {
            cat("Contacting Delphi...")
            flush.console()
            Sys.sleep(2 + rpois(1, 2))
            cat("the oracle is unavailable.\nWe apologize for any inconvenience.\n")
            return(invisible())
        }
    }

[...]

I am especially puzzled by this part:

cat("Contacting Delphi...")
flush.console()
Sys.sleep(2 + rpois(1, 2))
cat("the oracle is unavailable.\nWe apologize for any inconvenience.\n")

So now we have jokes in R code? Why not? ;-)
Best,

Philippe
--
..<°}))><
 ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons University, Belgium
( ( ( ( (
..

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Request: bring back windows chm help support (PR#14034)

2009-11-03 Thread Prof Brian Ripley
Duncan gave the definitive answer in an earlier reply: the active R 
developers are no longer willing to support CHM help.  It is not open 
for discussion, period.


But three comments to ponder (but not discuss).

(a) CHM is unusable for many of us.  A year or two ago Microsoft 
disabled it on non-local drives via a Windows patch because of 
security risks -- overnight it simply failed to work (no error, no 
warning, just no response) in our computer labs.  And this year CERT 
issued a serious advisory on the CHM compiler that Microsoft has not 
fixed (and apparently is not going to) -- so many of us are banned 
from having it on a networked machine by company policy.


(b) CHM support was in the sources at the beginning of the 2.10.0 
alpha testing.  Not one user asked for it at that point, let alone 
compiled it up and tested it.  Since no one asked for it (not what we 
had anticipated), the sources were cleaned up.


The main consultation over R development is the making available of 
development versions for users to test out.


(c) We did ask for support for cross-compilation before removing it 
(https://stat.ethz.ch/pipermail/r-devel/2009-January/051864.html): no 
one responded but two shameless users months later whinged about its 
removal on this list.  That has left a sour taste and zero enthusiasm 
for supporting things that no one is prepared even to write in about 
when asked.  Ask not what the R developers can do for you, but what 
you can do for R development (and faithful alpha/beta testing would be 
a start).


One more comment inline.

On Mon, 2 Nov 2009, John Fox wrote:


Dear Alexios and Duncan,

I think that there's one more thing to be said in favour of chm help, and
that's that its format is familiar to Windows users. I've been using html
help on Windows myself for a long time, but before R 2.10.0 recommended chm
help to new Windows users of R. That said, I expect that retaining chm help
just isn't worth the effort.

Regards,
John


-Original Message-
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org]
On Behalf Of alexios
Sent: November-01-09 6:05 PM
To: Duncan Murdoch
Cc: r-de...@stat.math.ethz.ch
Subject: Re: [Rd] Request: bring back windows chm help support (PR#14034)

Duncan Murdoch wrote:

On 01/11/2009 5:47 PM, alexios wrote:

Peter Ehlers wrote:

Duncan Murdoch wrote:

On 31/10/2009 6:05 PM, alex...@4dscape.com wrote:

Full_Name: alex galanos
Version: 2.10.0
OS: windows vista
Submission from: (NULL) (86.11.78.110)


I respectfully request that the chm help support for windows, which was very
convenient, be reinstated...couldn't an online poll have been conducted to
gauge the support of this format by Windows users?

First, I don't think that complaints are bugs.
Secondly, why not give the new format a chance. Personally, I
like it. Thanks, Duncan.

 -Peter Ehlers


It was not a complaint but a simple request, which given the presence
of a wishlist subdirectory I thought was appropriate to post.
Apologies if it came across as such.


It did, because you did not follow the instructions: from the FAQ:

 There is a section of the bug repository for suggestions for
 enhancements for R labelled 'wishlist'.  Suggestions can be
 submitted in the same ways as bugs, but please ensure that the
 subject line makes clear that this is for the wishlist and not a bug
 report, for example by starting with 'Wishlist:'.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[Rd] likely bug in 'serialize' or please explain the memory usage

2009-11-03 Thread Sklyar, Oleg (London)
Hi all,

assume the following problem: a function call takes a function object
and a data variable and calls this function with this data on a remote
host. It uses serialization to pass both the function and the data via a
socket connection to a remote host. The problem is that depending on the
way we call the same construct, the function may be serialized to
include the data, which was not requested as the example below
demonstrates (runnable). This is a problem for parallel computing. The
problem described below is actually a problem for Rmpi and any other
parallel implementation we tested leading to endless executions in some
cases, where the total data passed is huge.

Assume the below 'mycall' is the function that takes data and a function
object, serializes them and calls the remote host. To make it runnable I
just print the size of the serialized objects. In a parallel apply
implementation it would serialize individual list elements and a function
and pass those over. Assuming 1 element is 1 Mb and having 100 elements
and a function as simple as function(z) z, we would expect to pass around
100 Mb of data, 1 Mb to each individual process. However, what happens is
that in some situations all 100 Mb of data are passed to all the slaves,
as the function is serialized to include all of the data! This always
happens when we make such a call from an S4 method when the function
is defined inline; see the last example.

Can anybody explain this, and possibly suggest a solution? Well, one is:
do not define functions to call in the same environment as the caller
:(

I do not have immediate access to the newest version of R, so I would be
grateful if somebody could test it there and let me know if the problem
is still present. The example is runnable.

Thanks,
Oleg

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3803
oskl...@maninvestments.com


---

mycall = function(x, fun) {
    FUN = serialize(fun, NULL)
    DAT = serialize(x, NULL)
    cat(sprintf("length FUN=%d; length DAT=%d\n", length(FUN), length(DAT)))
    invisible(NULL) ## return results of a call on a remote host with FUN and DAT
}

## the function variant I  will be passing into mycall
innerfun = function(z) z
x = runif(1e6)

## test run from the command line
mycall(x, innerfun)
# output: length FUN=106; length DAT=822

## test run from within a function
outerfun1 = function(x) mycall(x, innerfun)
outerfun1(x)
# output: length FUN=106; length DAT=822

## test run from within a function, where function is defined within
outerfun2 = function(x) {
    nestedfun = function(z) z
    mycall(x, nestedfun)
}
outerfun2(x)
# output: length FUN=253; length DAT=822

setGeneric("outerfun3", function(x) standardGeneric("outerfun3"))
## define a method

## test run from within a method
setMethod("outerfun3", "numeric",
    function(x) mycall(x, innerfun))
outerfun3(x)
# output: length FUN=106; length DAT=822

## test run from within a method, where function is defined within
setMethod("outerfun3", "numeric",
    function(x) {
        nestedfun = function(z) z
        mycall(x, nestedfun)
    })
## THIS WILL BE WRONG!
outerfun3(x)
# output: length FUN=8001680; length DAT=822
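One way to see where the extra bytes come from (a sketch of my own, not part of the original message): serialize() writes out a closure's environment, so the serialized size tracks what that environment holds.

```r
## Sketch: the serialized size of a closure tracks its environment.
f_top   <- function(z) z                                # at top level: env is globalenv,
                                                        # which is stored as a reference
f_heavy <- local({ big <- runif(1e6); function(z) z })  # env holds 'big' (~8 Mb of doubles)

length(serialize(f_top,   NULL))   # small when defined at top level
length(serialize(f_heavy, NULL))   # over 8e6: 'big' is written out along with the function
```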


--
R version 2.9.0 (2009-04-17) 
x86_64-unknown-linux-gnu 

locale:
C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base





Re: [Rd] likely bug in 'serialize' or please explain the memory usage

2009-11-03 Thread Duncan Murdoch
I haven't had a chance to look really closely at this, but I would guess 
the problem is that in R functions are "closures".  The environment 
attached to the function will be serialized along with it, so if you 
have a big dataset in the same environment, you'll get that too.


I vaguely recall that the global environment and other system 
environments are handled specially, so that's not true for functions 
created at the top level, but I'd have to do some experiments to confirm.


So the solution to your problem is to pay attention to the environment 
of the functions you create.  If they need to refer to local variables 
in the creating frame, then you'll get all of them, so be careful about 
what you create there.  If they don't need to refer to the local frame, 
you can just attach a new, smaller environment after building the function.
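Duncan's last suggestion can be sketched like this (my own illustration, assuming the function does not actually need anything from the creating frame):

```r
## Sketch: detach a closure from a heavy frame before serializing it.
make_worker <- function() {
    big <- runif(1e6)                  # local data we do NOT want shipped
    f <- function(z) z                 # f's environment currently contains 'big'
    environment(f) <- new.env(parent = globalenv())  # swap in a lean environment
    f                                  # note: f can no longer see 'big'
}
length(serialize(make_worker(), NULL))  # stays small: 'big' is left behind
```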


Duncan Murdoch

Sklyar, Oleg (London) wrote:

Hi all,

assume the following problem: a function call takes a function object
and a data variable and calls this function with this data on a remote
host. [...]


Re: [Rd] likely bug in 'serialize' or please explain the memory usage

2009-11-03 Thread Sklyar, Oleg (London)
Duncan,

thanks for suggestions, I will try attaching a new environment.

However, this still does not explain the behaviour, nor confirm that it
is correct. What puzzles me most is that if I define a function within
another function, then only the function gets serialized, yet when this
is within an S4 method definition, the arguments are serialized as well.
Both have their own environments, so I do not see why it should be
different. As an interim measure I removed all the inline function
definitions from these 'parallel' calls, defining the functions as
hidden functions outside of the caller; a bit ugly, but it works. I'd be
thankful if you could look at the examples when you get some more time.

My main problem is less ensuring that my own code works than ensuring
that when users use these parallel functionalities with their own code,
they do not get stuck transferring data for ages simply because every
function drags all the data along with it.

Best,
Oleg

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3803
oskl...@maninvestments.com 

> -Original Message-
> From: Duncan Murdoch [mailto:murd...@stats.uwo.ca]
> Sent: 03 November 2009 11:59
> To: Sklyar, Oleg (London)
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] likely bug in 'serialize' or please explain
> the memory usage
>
> I haven't had a chance to look really closely at this, but I
> would guess the problem is that in R functions are "closures". [...]

Re: [Rd] Strange code in `?`

2009-11-03 Thread Martin Becker

I don't know if this is really a joke. It is certainly not easy to answer
  `?`(`?`(`?`(`?`(`?`
and spending some time trying to contact Delphi (maybe in order to 
permit cosmic radiation to feed the solution into the computer's RAM) is 
possibly one of the most promising approaches ;-)
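For what it's worth, the condition guarding the Delphi branch can be checked by hand (my own illustration; 'topic' is just a placeholder): the operand seen by `?` must itself be three nested `?` calls, i.e. roughly what typing `? ? ? ? topic` produces.

```r
## Sketch: the call structure that trips the "Contacting Delphi" branch.
topicExpr <- quote(`?`(`?`(`?`(topic))))          # what substitute(e1) would capture
is.call(topicExpr) && topicExpr[[1L]] == "?"      # outer test: TRUE
topicExpr <- topicExpr[[2L]]                      # strip one level, as the source does
is.call(topicExpr) && topicExpr[[1L]] == "?" &&
    is.call(topicExpr[[2L]]) && topicExpr[[2L]][[1L]] == "?"  # Delphi branch: TRUE
```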


Best,
 Martin

Philippe Grosjean wrote:

Hello,

In R 2.10, looking at:

> `?`
[...]

We now got jokes in R code? Why not? ;-)
Best,

Philippe


--
Dr. Martin Becker
Statistics and Econometrics
Saarland University
Campus C3 1, Room 206
66123 Saarbruecken
Germany



Re: [Rd] likely bug in 'serialize' or please explain the memory usage

2009-11-03 Thread Duncan Murdoch

On 03/11/2009 7:29 AM, Sklyar, Oleg (London) wrote:

Duncan,

thanks for suggestions, I will try attaching a new environment.

However this still does not explain the behaviour and does not confirm
that it is correct. What puzzles me most is that if I define a function
within another function then only the function gets serialized, yet when
this is within an S4 method definition, then also the arguments. 



Okay, I've taken a look at your code.  I think what you're seeing is 
lazy evaluation.  S4 generics evaluate their arguments when they dispatch 
to a method, but normal functions don't.  So the increase from 106 bytes 
to 253 bytes when the function was nested in a regular function was to 
hold the promise to evaluate x, whereas in the method, x had already been 
evaluated to determine that it was numeric and that your particular 
method should be dispatched to.


So if in your nested case you add a line

force(x)

I think you'll see the size balloon up.
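Duncan's point can be checked directly (a sketch of my own, assuming promise serialization behaves as described): forcing the argument pulls the evaluated value into the frame that gets serialized.

```r
## Sketch: an unforced promise serializes small; a forced one drags the value along.
lazy_maker  <- function(x) function(z) z                 # x stays an unevaluated promise
eager_maker <- function(x) { force(x); function(z) z }   # x is evaluated into the frame

f_lazy  <- lazy_maker(runif(1e6))
f_eager <- eager_maker(runif(1e6))

length(serialize(f_lazy,  NULL))   # modest (at top level): just the promise's expression
length(serialize(f_eager, NULL))   # over 8e6: the evaluated vector sits in the environment
```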

Now, it might be a problem that you're serializing a promise, because I 
think you'd likely get trouble with something like this:


outerfun2 = function(x) {
    nestedfun = function() x
    mycall(x, nestedfun)
}

If you serialize nestedfun and it only saves the promise to evaluate x, 
then unserialize it somewhere else, the promise probably won't evaluate 
to what you expected.  But you often get problems when you create 
functions that depend on unevaluated promises, and there might be a 
valid reason to want to serialize one, so I wouldn't call it a bug.


Duncan Murdoch

Both have
their own environments, so I do not see why it should be different. [...]

Re: [Rd] Request: bring back windows chm help support (PR#14034)

2009-11-03 Thread Michael Dewey

At 10:02 03/11/2009, Prof Brian Ripley wrote:

Comment in line below


Duncan gave the definitive answer in an earlier reply: the active R 
developers are no longer willing to support CHM help. [...] Ask not what 
the R developers can do for you, but what you can do for R development 
(and faithful alpha/beta testing would be a start).


What would be involved in testing such versions? Do you want people 
who are just regular users with limited computing skills like me, or 
users living on the cutting edge of computational statistics? At what 
stage in the process do pre-compiled versions (for Windows) come in? 
Is there somewhere I should have looked this up?





Michael Dewey
http://www.aghmed.fsnet.co.uk



[Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
Hi all,

What's the procedure for removing out-of-date CRAN mirrors?  I've just
discovered the fantastic http://cran.r-project.org/mirmon_report.html,
but there are 4 mirrors that have not been updated in over a month, and
quite a few others that have a chequered past.  Given that there are
many additional mirrors available, and the poor user experience if a
bad mirror is selected, perhaps these poorly functioning mirrors could
be removed from the list?

Hadley

-- 
http://had.co.nz/



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
Reading the documentation for mirmon, it also looks like it can output
a machine-readable state file.  It would be really useful if this were
published on the r-help page, as then we could construct tools to
automatically recommend mirrors based on average age and download
speed.

Hadley

On Tue, Nov 3, 2009 at 8:54 AM, Hadley Wickham  wrote:
> What's the procedure for removing out of date cran mirrors? [...]


-- 
http://had.co.nz/



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Uwe Ligges



Hadley Wickham wrote:

What's the procedure for removing out of date cran mirrors? [...] perhaps
these poorly functioning mirrors could be removed from the list?

Fritz does that from time to time. Note that this is always rather 
cumbersome: trying to contact maintainers whose addresses do not exist 
any more and so on.


Best,
Uwe







Re: [Rd] Request: bring back windows chm help support (PR#14034)

2009-11-03 Thread Duncan Murdoch

On 11/3/2009 9:49 AM, Michael Dewey wrote:

At 10:02 03/11/2009, Prof Brian Ripley wrote:

Comment in line below


Duncan gave the definitive answer in an earlier reply: the active R 
developers are no longer willing to support CHM help.  It is not 
open for discussion, period.


But three comments to ponder (but not discuss).

(a) CHM is unusable for many of us.  A year or two ago Microsoft 
disabled it on non-local drives via a Windows patch because of 
security risks -- overnight it simply failed to work (no error, no 
warning, just no response) in our computer labs.  And this year CERT 
issued a serious advisory on the CHM compiler that Microsoft has not 
fixed (and apparently is not going to) -- so many of us are banned 
from having it on a networked machine by company policy.


(b) CHM support was in the sources at the beginning of the 2.10.0 
alpha testing.  Not one user asked for it at that point, let alone 
compiled it up and tested it.  Since no one asked for it (not what 
we had anticipated), the sources were cleaned up.


The main consultation over R development is the making available of 
development versions for users to test out.


(c) We did ask for support for cross-compilation before removing it 
(https://stat.ethz.ch/pipermail/r-devel/2009-January/051864.html): 
no one responded but two shameless users months later whinged about 
its removal on this list.  That has left a sour taste and zero 
enthusiasm for supporting things that no one is prepared even to 
write in about when asked.  Ask not what the R developers can do for 
you, but what you can do for R development (and faithful alpha/beta 
testing would be a start).


What would be involved in testing such versions? Do you want people 
who are just regular users with limited computing skills like me, or 
users living on the cutting edge of computational statistics? 


I don't know what Brian would say, but I would like to see both of the 
above groups testing, and familiar with the changes that are coming. 
All you need to do is to download and install a test version and see if 
anything goes wrong on your system.


At what stage in the process do pre-compiled versions (for Windows) come in? Is there somewhere I should have looked this up?


There are announcements in the r-announce group when alpha or beta 
versions are about to be released, but you can download the r-devel 
builds any time to see what sort of things are going on, or subscribe to 
the RSS feed of NEWS changes to it.


The main problem with watching R-devel is that often it contains 
incomplete code, and some decisions aren't finalized until the end of 
the alpha testing period.  So please don't report things as bugs or 
expect everything to be in its final state, but do point out things that 
are causing problems.


Duncan Murdoch




One more comment inline.

On Mon, 2 Nov 2009, John Fox wrote:


Dear Alexios and Duncan,

I think that there's one more thing to be said in favour of chm help, and
that's that its format is familiar to Windows users. I've been using html
help on Windows myself for a long time, but before R 2.10.0 recommended chm
help to new Windows users of R. That said, I expect that retaining chm help
just isn't worth the effort.

Regards,
John


-Original Message-
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org]

On

Behalf Of alexios
Sent: November-01-09 6:05 PM
To: Duncan Murdoch
Cc: r-de...@stat.math.ethz.ch
Subject: Re: [Rd] Request: bring back windows chm help support (PR#14034)

Duncan Murdoch wrote:

On 01/11/2009 5:47 PM, alexios wrote:

Peter Ehlers wrote:

Duncan Murdoch wrote:

On 31/10/2009 6:05 PM, alex...@4dscape.com wrote:

Full_Name: alex galanos
Version: 2.10.0
OS: windows vista
Submission from: (NULL) (86.11.78.110)


I respectfully request that the chm help support for windows, which
was very
convenient, be reinstated...couldn't an online poll have been
conducted to gauge
the support of this format by Windows users?

First, I don't think that complaints are bugs.
Secondly, why not give the new format a chance. Personally, I
like it. Thanks, Duncan.

 -Peter Ehlers

It was not a complaint but a simple request, which given the presence
of a wishlist subdirectory I thought was appropriate to post.
Apologies if it came across as such.


It did, because you did not follow the instructions: from the FAQ:

 There is a section of the bug repository for suggestions for
 enhancements for R labelled 'wishlist'.  Suggestions can be
 submitted in the same ways as bugs, but please ensure that the
 subject line makes clear that this is for the wishlist and not a bug
 report, for example by starting with 'Wishlist:'.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +4

Re: [Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
> Fritz does that from time to time. Note that this is always rather
> cumbersome: trying to contact maintainers whose addresses do not exist any
> more and so on.

I'd be more draconian - if the mirror doesn't update for two weeks,
just cut it off automatically.  If they want to get back on the list,
it's their problem.  It shouldn't be Fritz's problem!

Hadley


-- 
http://had.co.nz/



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Uwe Ligges



Hadley Wickham wrote:

Fritz does that from time to time. Note that this is always rather
cumbersome: trying to contact maintainers whose addresses do not exist any
more and so on.


I'd be more draconian - if the mirror doesn't update for two weeks,
just cut it off automatically.  If they want to get back on the list,
it's their problem.  It shouldn't be Fritz's problem!


I fully agree with the latter. Anyway, I fear it is his problem, because 
we may also (and particularly) want to keep some mirrors in regions of 
the world where connectivity and CRAN bandwidth are an issue.



Note that with such a policy, half of the CRAN packages wouldn't work 
anymore, because then we'd have "cut off" several packages that are 
dependencies for others, etc.
Examples of packages we probably should "cut off" now (since they have 
already cost too much time looking at them) are:


   clusterfly, ggplot2, lvplot, plyr

all of them giving WARNINGs under R-2.10.0, and I guess you know the 
maintainer who has been asked by automated notification to fix the issues.


Best wishes,
Uwe






Hadley






Re: [Rd] Removing cran mirrors

2009-11-03 Thread Friedrich Leisch
> On Tue, 3 Nov 2009 09:46:11 -0600,
> Hadley Wickham (HW) wrote:

  >> Fritz does that from time to time. Note that this is always rather
  >> cumbersome: trying to contact maintainers whose addresses do not exist any
  >> more and so on.  

  > I'd be more draconian - if the mirror doesn't update for two weeks,
  > just cut it off automatically.  If they want to get back on the list,
  > it's their problem.  It shouldn't be Fritz's problem!

Well, then we would have to create "two classes" of mirrors: If ETH
Zürich goes down I'd prefer to inform Martin Mächler that there is a
problem rather than cutting off automatically (same for a lot of
others). 

But note that the list of "monitored mirrors" is a superset of the list
of "listed mirrors": The first real problem mirror.cricyt.edu.ar is not
in the list of mirrors at http://cran.r-project.org/mirrors.html (and
what you get when you are online and select a mirror via GUI). Same
for all the other very bad ones. It is a simple flag "use mirror" in the
master table, which is used both by R itself and the scripts creating
the webpages (which run fully automated).

Best,
Fritz



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
> Note that with such a policy, half of the CRAN packages wouldn't work anymore,
> because then we'd have "cut off" several packages that are dependencies
> for others, etc.
> Examples of packages we probably should "cut off" now (since they have
> already cost too much time looking at them) are:
>
>   clusterfly, ggplot2, lvplot, plyr
>
> all of them giving WARNINGs under R-2.10.0, and I guess you know the
> maintainer who has been asked by automated notification to fix the issues.

Hopefully this doesn't sound too self-serving, but:

 * Keeping a mirror up-to-date is _much_ easier than keeping a package
up-to-date (i.e. you just have to run rsync regularly).

 * I don't recall receiving any automated notices about problems, and
I've just tried searching for combinations of R 2.10, warning and
ggplot2 as well as several other attempts and haven't been able to
find anything.  I normally try and keep on top of issues like this.
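Hadley's first point can be made concrete with a sketch of what keeping a mirror current involves (the rsync module name `cran.r-project.org::CRAN` and the destination path follow the CRAN mirror HOWTO, but treat both as assumptions here):

```shell
# Hypothetical crontab entry: refresh the local CRAN mirror every six hours.
# The destination path is an example; adjust to the web server's document root.
0 */6 * * *  rsync -rtlzv --delete cran.r-project.org::CRAN /srv/www/cran/
```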

Hadley

-- 
http://had.co.nz/



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
> Well, then we would have to create "two classes" of mirrors: If ETH
> Zürich goes down I'd prefer to inform Martin Mächler that there is a
> problem rather than cutting off automatically (same for a lot of
> others).

Oh, good point.

> But note that the list of "monitored mirrors" is a superset of the list
> of "listed mirrors": The first real problem mirror.cricyt.edu.ar is not
> in the list of mirrors at http://cran.r-project.org/mirrors.html (and
> what you get when you are online and select a mirror via GUI). Same
> for all the other very bad ones. It is a simple Flag "use mirror" in the
> master table, which is used both by R itself and the scripts creating
> the webpages (which run fully automated).

Ok, that makes sense.

But would it be possible to make the results of the monitoring script
available in machine-readable form?  It would be very useful for
suggesting to people which mirror they should use - i.e.
automatically selecting one that is both fast (for them) and updated
regularly.

Hadley

-- 
http://had.co.nz/



[Rd] Standard non-standard evaluation problem with 2.10-0

2009-11-03 Thread Gavin Simpson
Dear List

I am getting an error when checking my analogue package with
R2.10.0-patched. The error comes when running a function within which I
use the standard non-standard evaluation method. I've distilled the
error and functions involved out into the following simple example to
illustrate the error:

## Dummy data to illustrate formula method
d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
## simulate some missings
d[sample(10,3), 1] <- NA

foo <- function(formula, data = NULL,
subset = NULL,
na.action = na.pass, ...) {
mf <- match.call()
mf[[1]] <- as.name("model.frame")
mt <- terms(formula, data = data, simplify = TRUE)
mf[[2]] <- formula(mt, data = data)
mf$na.action <- substitute(na.action)
dots <- list(...)
mf[[names(dots)]] <- NULL
mf <- eval(mf,parent.frame())
mf
}

## apply foo using formula
foo(~ . - B, data = d, na.action = na.pass,
method = "missing", na.value = 0)
Error in mf[[names(dots)]] <- NULL : 
  more elements supplied than there are to replace

If I debug(foo) and do:

Browse[2]> names(dots)
[1] "method"   "na.value"
Browse[2]> names(mf)
[1] ""  "formula"   "data"  "na.action" "method"   
[6] "na.value"
Browse[2]> mf[[names(dots)[1]]]
[1] "missing"
Browse[2]> mf[[names(dots)[2]]]
[1] 0

But
Browse[2]> mf[[names(dots)]]
Error in mf[[names(dots)]] : subscript out of bounds
Browse[2]> str(names(dots))
 chr [1:2] "method" "na.value"

I could have sworn I tested this during the beta test phase for 2.10.0 -
if I did I didn't get any errors at that time - and this code works fine
under R2.9.x branch. The package is now failing checks on CRAN and on my
local install.

Am I doing something patently stupid here? Has something changed in '[['
or 'names' that I'm now running foul of? I can probably work round this
by setting the individual names of 'mf' to NULL in two calls, but I'd
like to get to the bottom of the problem if at all possible.

Session Info:
R version 2.10.0 Patched (2009-11-01 r50276) 
i686-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8   LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C   

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
[7] base 

other attached packages:
[1] analogue_0.6-21 MASS_7.2-48 lattice_0.17-25
[4] vegan_1.15-3   

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0

Thanks in advance,

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[Rd] SOLVED: Re: Standard non-standard evaluation problem with 2.10-0

2009-11-03 Thread Gavin Simpson
Dear list,

Prof Ripley has replied with the solution - I /was/ doing something
patently stupid.

The offending line:

mf[[names(dots)]] <- NULL

should have been

mf[names(dots)] <- NULL

That the offending line worked in R 2.9.x was the result of a bug, which
has been fixed in the current version; it was my mistake in using
'[[' where I meant '['.
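The distinction can be checked in a tiny self-contained sketch (the list names here are invented, not from the package):

```r
# '[' accepts a vector of names, so assigning NULL removes several
# list elements at once -- the behaviour the original code wanted:
x <- list(a = 1, b = 2, c = 3)
x[c("a", "b")] <- NULL
stopifnot(identical(names(x), "c"))

# '[[' selects a single element; a length-2 character subscript is
# instead interpreted as *recursive* indexing, i.e. y[["a"]][["b"]]:
y <- list(a = list(b = 2), c = 3)
stopifnot(identical(y[[c("a", "b")]], 2))
```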

All the best,

Gavin

On Tue, 2009-11-03 at 20:05 +, Gavin Simpson wrote:
> Dear List
> 
> I am getting an error when checking my analogue package with
> R2.10.0-patched. The error comes when running a function within which I
> use the standard non-standard evaluation method. I've distilled the
> error and functions involved out into the following simple example to
> illustrate the error:
> 
> ## Dummy data to illustrate formula method
> d <- data.frame(A = runif(10), B = runif(10), C = runif(10))
> ## simulate some missings
> d[sample(10,3), 1] <- NA
> 
> foo <- function(formula, data = NULL,
> subset = NULL,
> na.action = na.pass, ...) {
> mf <- match.call()
> mf[[1]] <- as.name("model.frame")
> mt <- terms(formula, data = data, simplify = TRUE)
> mf[[2]] <- formula(mt, data = data)
> mf$na.action <- substitute(na.action)
> dots <- list(...)
> mf[[names(dots)]] <- NULL
> mf <- eval(mf,parent.frame())
> mf
> }
> 
> ## apply foo using formula
> foo(~ . - B, data = d, na.action = na.pass,
> method = "missing", na.value = 0)
> Error in mf[[names(dots)]] <- NULL : 
>   more elements supplied than there are to replace
> 
> If I debug(foo) and do:
> 
> Browse[2]> names(dots)
> [1] "method"   "na.value"
> Browse[2]> names(mf)
> [1] ""  "formula"   "data"  "na.action" "method"   
> [6] "na.value"
> Browse[2]> mf[[names(dots)[1]]]
> [1] "missing"
> Browse[2]> mf[[names(dots)[2]]]
> [1] 0
> 
> But
> Browse[2]> mf[[names(dots)]]
> Error in mf[[names(dots)]] : subscript out of bounds
> Browse[2]> str(names(dots))
>  chr [1:2] "method" "na.value"
> 
> I could have sworn I tested this during the beta test phase for 2.10.0 -
> if I did I didn't get any errors at that time - and this code works fine
> under R2.9.x branch. The package is now failing checks on CRAN and on my
> local install.
> 
> Am I doing something patently stupid here? Has something changed in '[['
> or 'names' that I'm now running foul of? I can probably work round this
> by setting the individual names of 'mf' to NULL in two calls, but I'd
> like to get to the bottom of the problem if at all possible.
> 
> Session Info:
> R version 2.10.0 Patched (2009-11-01 r50276) 
> i686-pc-linux-gnu 
> 
> locale:
>  [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C  
>  [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8
>  [5] LC_MONETARY=C  LC_MESSAGES=en_GB.UTF-8   
>  [7] LC_PAPER=en_GB.UTF-8   LC_NAME=C 
>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C   
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods  
> [7] base 
> 
> other attached packages:
> [1] analogue_0.6-21 MASS_7.2-48 lattice_0.17-25
> [4] vegan_1.15-3   
> 
> loaded via a namespace (and not attached):
> [1] grid_2.10.0  tools_2.10.0
> 
> Thanks in advance,
> 
> G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [Rd] Removing cran mirrors

2009-11-03 Thread Hadley Wickham
>>  * I don't recall receiving any automated notices about problems, and
>> I've just tried searching for combinations of R 2.10, warning and
>> ggplot2 as well as several other attempts and haven't been able to
>> find anything.  I normally try and keep on top of issues like this.
>
> Great, thank you very much for your efforts. Maybe it has been eaten by some
> spam filter (because of auto-generation?). So good chance to start working
> on the issues now. ;-)

Found the problem - it's because the subject is the same regardless of
whether or not the package builds without warnings.  I just saw the
emails and ignored them because I thought everything was ok.  Perhaps
the subject line could include WARNING if something went wrong?

Hadley


-- 
http://had.co.nz/



[Rd] Registered S3 methods not found: Documentation bug or anomaly in function update or ... ?

2009-11-03 Thread Ulrike Grömping

Dear expeRts,

I recently asked for help on an issue with S3 methods for lm. The issue 
was (in DoE.base 0.9-4)
that function update from package stats would return an error whenever 
DoE.base was loaded,

complaining that lm.default was not found
(e.g.
require(DoE.base)
swiss.lm <- lm(Fertility~Education+Examination, swiss)
upd.swiss.lm <- update(swiss.lm, .~.-Examination)
).

In version 0.9-4 of DoE.base, I had followed the recommendations of 
Section 1.6.2 of "Writing R
extensions", exporting the generic function lm and registering the 
methods (lm.design and lm.default)

with S3method but not separately exporting them in the namespace file.
Not having received help quickly, I decided to try explicitly exporting the 
method functions

lm.design and lm.default. This did in fact remove the
issue with not finding lm.default when using function update, and I have 
uploaded this fixed version

as 0.9-5.

Is it generally advisable to also export the method functions (i.e. 
should Section 1.6.2 of "Writing R extensions" be revised)? Or is there 
an anomaly in function update? Or ...?
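For reference, the two NAMESPACE variants being compared look roughly like this (a sketch based on the description above and on "Writing R Extensions"; the exact directive layout is an assumption):

```r
## Variant of DoE.base 0.9-4, following Section 1.6.2:
## export the generic, register but do not export the methods.
export(lm)
S3method(lm, design)
S3method(lm, default)

## Additional line in the 0.9-5 workaround:
## export the method functions themselves as well.
export(lm.design, lm.default)
```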

Explanations are appreciated.

Thanks and regards, Ulrike

--

***
* Ulrike Groemping *
* BHT Berlin - University of Applied Sciences *
***
* +49 (30) 39404863 (Home Office) *
* +49 (30) 4504 5127 (BHT) *
***
* http://prof.tfh-berlin.de/groemping *
* groemp...@bht-berlin.de *



[Rd] Wishlist: Downgrade missing Rd links to NOTEs for non-installed Suggested packages

2009-11-03 Thread Henrik Bengtsson
Hi,

I wish to suggest that Rd cross references to help sections in
"Suggested" packages that are not installed should be reported as
NOTE:s and not WARNING:s by R CMD check.  This should only apply to
packages under Suggests: in DESCRIPTION.


RATIONALE:
1. One reason for putting a package under Suggests: is that it does
not exist on any public repositories, but it is still possible to install
and use it.  The functions provided by such a package are not needed by
most people, but may be used by a few who take the effort to get
hold of and install that package.  [I don't want to discuss why such
packages are not on public repositories; there are heaps of reasons
which are out of my control.]

2. If the environment variable _R_CHECK_FORCE_SUGGESTS_ is set to false,
then R CMD check will silently accept that packages under Suggests: are
not available/installed.  This is the policy of CRAN.

3. However, R CMD check will give a WARNING for Rd cross reference to
such packages, e.g.

  * checking Rd cross-references ... WARNING
Unknown package(s) ‘JohnDoePkg’ in Rd xrefs

4. CRAN has a strict policy of not allowing any WARNINGs (or ERRORs).
Hence, you are forced to remove such Rd references.

5. The only way to get rid of such WARNINGs is to remove the Rd
reference or replace it with plain text, e.g. replace
\link[JohnDoePkg:aFcn]{aFcn} with 'aFcn()' in \pkg{JohnDoePkg}.  This
will "break" the link for those who do get around to installing the
Suggested package.  I believe this is counterproductive [I'll spare
you the real-life analogues] and discourages good documentation.  I
argue that it makes sense to keep *potentially* missing references,
especially now with the new dynamic help system.
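To make point 5 concrete, the kind of markup at stake looks like this (a hypothetical Rd fragment; the package and function names are the invented ones from above):

```
\seealso{
  % Link that R CMD check warns about when JohnDoePkg is not installed:
  \code{\link[JohnDoePkg:aFcn]{aFcn}} in package \pkg{JohnDoePkg}
  % Plain-text fallback that silences the WARNING but breaks the link:
  % 'aFcn()' in \pkg{JohnDoePkg}
}
```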


Because of the above, I wish to suggest that such Rd-link WARNINGs be
downgraded to NOTEs.  NOTEs are accepted by CRAN.

/Henrik



[Rd] Help with testing (was Re: Request: bring back windows chm help support)

2009-11-03 Thread Prof Brian Ripley
Duncan has echoed my thoughts.  Just to add: Windows users also need 
to monitor the CHANGES file (also available on an RSS feed).


The things that we find hardest to check by automated testing are the 
installation process and GUI elements: we do only minimal testing in 
languages other than English. Windows users routinely using test versions 
in the DBCS languages (Japanese, Korean, Simplified and Traditional 
Chinese) would be particularly beneficial, but so too would users of 
European languages on any platform.


E.g. it needs no special skills to notice that the installer gave you 
text help when you asked for HTML help.  Yet no one reported it until 
after release.


On Tue, 3 Nov 2009, Duncan Murdoch wrote:


On 11/3/2009 9:49 AM, Michael Dewey wrote:

At 10:02 03/11/2009, Prof Brian Ripley wrote:

Comment in line below

Duncan gave the definitive answer in an earlier reply: the active R 
developers are no longer willing to support CHM help.  It is not open for 
discussion, period.


But three comments to ponder (but not discuss).

(a) CHM is unusable for many of us.  A year or two ago Microsoft disabled 
it on non-local drives via a Windows patch because of security risks -- 
overnight it simply failed to work (no error, no warning, just no 
response) in our computer labs.  And this year CERT issued a serious 
advisory on the CHM compiler that Microsoft has not fixed (and apparently 
is not going to) -- so many of us are banned from having it on a networked 
machine by company policy.


(b) CHM support was in the sources at the beginning of the 2.10.0 alpha 
testing.  Not one user asked for it at that point, let alone compiled it 
up and tested it.  Since no one asked for it (not what we had 
anticipated), the sources were cleaned up.


The main consultation over R development is the making available of 
development versions for users to test out.


(c) We did ask for support for cross-compilation before removing it 
(https://stat.ethz.ch/pipermail/r-devel/2009-January/051864.html): no one 
responded but two shameless users months later whinged about its removal 
on this list.  That has left a sour taste and zero enthusiasm for 
supporting things that no one is prepared even to write in about when 
asked.  Ask not what the R developers can do for you, but what you can do 
for R development (and faithful alpha/beta testing would be a start).


What would be involved in testing such versions? Do you want people who are 
just regular users with limited computing skills like me, or users living 
on the cutting edge of computational statistics? 


I don't know what Brian would say, but I would like to see both of the above 
groups testing, and familiar with the changes that are coming. All you need 
to do is to download and install a test version and see if anything goes 
wrong on your system.



At what
stage in the process do pre-compiled versions (for Windows) come in? Is 
there somewhere I should have looked this up?


There are announcements in the r-announce group when alpha or beta versions 
are about to be released, but you can download the r-devel builds any time to 
see what sort of things are going on, or subscribe to the RSS feed of NEWS 
changes to it.


The main problem with watching R-devel is that often it contains incomplete 
code, and some decisions aren't finalized until the end of the alpha testing 
period.  So please don't report things as bugs or expect everything to be in 
its final state, but do point out things that are causing problems.


Duncan Murdoch


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [Rd] parse_Rd and/or lazyload problem

2009-11-03 Thread Mark.Bravington
> Sorry.  What I thought you said was that you had spent several hours
> on it and didn't want to spend more time on it.  I've told you I
> don't want to work on it either.  
> 
> If there is no way to trigger this bug without using internals, then
> it has not been demonstrated to be a bug in R.  It might be one, or
> it might be a bug in your code.  Often I'll work on things that are
> demonstrated bugs, but I won't commit several hours to debugging your
> code.
> 
> Duncan Murdoch
> 

I sympathize with not wanting to spend hours on other people's code-- and I 
appreciate that you have spent a lot of time off-list trying to help me with 
'parse_Rd' recently.

But in this case: 

(i) there were only 3 lines of code in the first example! If I've done 
something wrong in those 3 lines, it shouldn't take several hours to diagnose...

(ii) the real problem may well not be in 'parse_Rd' but in 'lazyLoad' etc, as 
the subject line says. Presumably you picked up the original thread because 
you're the 'parse_Rd' author. If you're sure it's not 'parse_Rd', or if you 
don't want to look at the code for other reasons, perhaps you could alert the 
author of the lazyloading routines (Luke Tierney?) to see if he's willing to 
look into it.

(iii) I deliberately haven't submitted a formal bug report, because my 
reproducible examples need to call 'makeLazyLoadDB'. (Though Henrik B is able 
to trigger the same problem without it.) As you say, by R's definition of a bug 
(which  certainly isn't the same as mine) I cannot demonstrate this is a "bug". 
So the R-bug lens may not be the correct filter for you to apply here.

Further to the problem itself: Henrik Bengtsson's report seems symptomatic of 
the same thing. I've generally hit the bug (damn!) only on the second or 
subsequent time in a session that I've lazyloaded, which is one reason it's 
hard to make reproducible. If you want a reproducible example to help track the 
bug down, then my original 3-liner would be easier to work with. However, while 
that one does reliably trigger an error on my laptop with 2GB R-usable memory, 
it doesn't on my 4GB-usable desktop. For that machine, a reproducible sequence 
with the only internal function being 'makeLazyLoadDB' is: 

file.copy( 'd:/temp/Rdiff.Rd', 'd:/temp/scrunge.Rd') # Rdiff.Rd from the 'tools' package source

eglist <- list( scrunge=parse_Rd(  'd:/temp/scrunge.Rd'))
tools:::makeLazyLoadDB( eglist, 'd:/temp/ll')
e <- new.env()
lazyLoad( 'd:/temp/ll', e)
as.list( e) # force; OK

eglist1 <- list( scrunge=parse_Rd(  'd:/temp/Rdiff.Rd'))
tools:::makeLazyLoadDB( eglist1, 'd:/temp/ll')
e <- new.env()
lazyLoad( 'd:/temp/ll', e)
as.list( e) # Splat

It doesn't make any difference which file I process first; the error comes the 
second time round.


Mark


-- 
Mark Bravington
CSIRO Mathematical & Information Sciences
Marine Laboratory
Castray Esplanade
Hobart 7001
TAS

ph (+61) 3 6232 5118
fax (+61) 3 6232 5012
mob (+61) 438 315 623

Duncan Murdoch wrote:
> On 01/11/2009 3:12 PM, mark.braving...@csiro.au wrote:
>>> Okay, then we both agree we should drop it.
>>> Duncan Murdoch
>> 
>> 
>> No we don't. I can't provide a functioning mvbutils, or debug, until
>> this is resolved. 
>> 
>> I am trying to be a good citizen and prepare reproducible bug
>> reports-- e.g. the 3 line example. It would be quicker for me to
>> write some ugly hack that modifies base R and gets round the problem
>> *for me*, but that doesn't seem the best outcome for R. A culture
>> which discourages careful bug reporting is unhealthy culture.
> 
> Sorry.  What I thought you said was that you had spent several hours
> on it and didn't want to spend more time on it.  I've told you I
> don't want to work on it either.  
> 
> If there is no way to trigger this bug without using internals, then
> it has not been demonstrated to be a bug in R.  It might be one, or
> it might be a bug in your code.  Often I'll work on things that are
> demonstrated bugs, but I won't commit several hours to debugging your
> code.
> 
> Duncan Murdoch
> 
>> Mark Bravington
>> 
>> 
>> 
>> From: Duncan Murdoch [murd...@stats.uwo.ca]
>> Sent: 02 November 2009 01:08
>> To: Bravington, Mark (CMIS, Hobart)
>> Cc: r-devel@r-project.org
>> Subject: Re: [Rd] parse_Rd and/or lazyload problem
>> 
>> On 31/10/2009 10:18 PM, mark.braving...@csiro.au wrote:
 Does this happen in R-patched?  I've seen similar errors in 2.10.0,
 but not in a current build.
>>> Yes, still there in R-patched.
>>> 
>>> (Still haven't got to your code, this was in
 mine.  I'm reluctant to spend time on code that is messing with
 internals, because you might be using things in a way not intended
 by the author.  Now, if you can show me some code that demonstrates
 the problem without using internals directly, I'll follow up.)
>>> I did try, but it's not completely possible, because
>>> 'makeLazyLoadDB' is internal and there is no public alternat

[Rd] memory misuse in subscript code when rep() is called in odd way

2009-11-03 Thread William Dunlap
The following odd call to rep()
gives somewhat random results:

> rep(1:4, 1:8, each=2)
 [1]  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4
[26]  4  4  4  4  4  4  4  4  4  4  4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> rep(1:4, 1:8, each=2)
Error: only 0's may be mixed with negative subscripts
> rep(1:4, 1:8, each=2)
 [1]  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4
[26]  4  4  4  4  4  4  4  4  4  4  4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> rep(1:4, 1:8, each=2)
 [1]  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4
[26]  4  4  4  4  4  4  4  4  4  4  4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> rep(1:4, 1:8, each=2)
 [1]  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4
[26]  4  4  4  4  4  4  4  4  4  4  4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> rep(1:4, 1:8, each=2)
 [1]  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3  3  3  3  3  4  4  4  4
[26]  4  4  4  4  4  4  4  4  4  4  4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  2 NA NA  2 NA NA  2
>  version
   _
platform   i686-pc-linux-gnu
arch   i686
os linux-gnu
system i686, linux-gnu
status Under development (unstable)
major  2
minor  11.0
year   2009
month  10
day20
svn rev50178
language   R
version.string R version 2.11.0 Under development (unstable) (2009-10-20 r50178)

valgrind says that the C code is using uninitialized data:
> rep(1:4, 1:8, each=2)
==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C557D: integerSubscript (subscript.c:408)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)
==26459==by 0x80C5FFD: Rf_makeSubscript (subscript.c:613)
==26459==by 0x80C7368: do_subset_dflt (subset.c:158)
==26459==by 0x80B4283: do_rep (Rinlinedfuns.h:161)
==26459==by 0x816491B: Rf_eval (eval.c:464)
==26459==by 0x805A726: Rf_ReplIteration (main.c:262)
==26459==by 0x805A95E: R_ReplConsole (main.c:311)
==26459==by 0x805AFBC: run_Rmainloop (main.c:964)
==26459==by 0x8058E2B: main (Rmain.c:33)
==26459==
==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C5567: integerSubscript (subscript.c:409)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)
==26459==by 0x80C5FFD: Rf_makeSubscript (subscript.c:613)
==26459==by 0x80C7368: do_subset_dflt (subset.c:158)
==26459==by 0x80B4283: do_rep (Rinlinedfuns.h:161)
==26459==by 0x816491B: Rf_eval (eval.c:464)
==26459==by 0x805A726: Rf_ReplIteration (main.c:262)
==26459==by 0x805A95E: R_ReplConsole (main.c:311)
==26459==by 0x805AFBC: run_Rmainloop (main.c:964)
==26459==by 0x8058E2B: main (Rmain.c:33)
==26459==
==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C556E: integerSubscript (subscript.c:411)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)
==26459==by 0x80C5FFD: Rf_makeSubscript (subscript.c:613)
==26459==by 0x80C7368: do_subset_dflt (subset.c:158)
==26459==by 0x80B4283: do_rep (Rinlinedfuns.h:161)
==26459==by 0x816491B: Rf_eval (eval.c:464)
==26459==by 0x805A726: Rf_ReplIteration (main.c:262)
==26459==by 0x805A95E: R_ReplConsole (main.c:311)
==26459==by 0x805AFBC: run_Rmainloop (main.c:964)
==26459==by 0x8058E2B: main (Rmain.c:33)
==26459==
==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C558F: integerSubscript (subscript.c:415)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)
==26459==by 0x80C5FFD: Rf_makeSubscript (subscript.c:613)
==26459==by 0x80C7368: do_subset_dflt (subset.c:158)
==26459==by 0x80B4283: do_rep (Rinlinedfuns.h:161)
==26459==by 0x816491B: Rf_eval (eval.c:464)
==26459==by 0x805A726: Rf_ReplIteration (main.c:262)
==26459==by 0x805A95E: R_ReplConsole (main.c:311)
==26459==by 0x805AFBC: run_Rmainloop (main.c:964)
==26459==by 0x8058E2B: main (Rmain.c:33)
==26459==
==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C55C1: integerSubscript (subscript.c:387)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)
==26459==by 0x80C5FFD: Rf_makeSubscript (subscript.c:613)
==26459==by 0x80C7368: do_subset_dflt (subset.c:158)
==26459==by 0x80B4283: do_rep (Rinlinedfuns.h:161)
==26459==by 0x816491B: Rf_eval (eval.c:464)
==26459==by 0x805A726: Rf_ReplIteration (main.c:262)
==26459==by 0x805A95E: R_ReplConsole (main.c:311)
==26459==by 0x805AFBC: run_Rmainloop (main.c:964)
==26459==by 0x8058E2B: main (Rmain.c:33)
==26459==

Re: [Rd] Removing cran mirrors

2009-11-03 Thread Uwe Ligges



Hadley Wickham wrote:

Note that with such a policy, half of the CRAN packages wouldn't work anymore,
because then we'd have "cut off" several packages that are dependencies
of others, etc.
Examples of packages we probably should "cut off" now (since they have already
cost too much time looking at them) are:

  clusterfly, ggplot2, lvplot, plyr

all of them giving WARNINGs under R-2.10.0, and I guess you know the
maintainer who has been asked by automatic notification to fix the issues.


Hopefully this doesn't sound too self-serving, but:

 * Keeping a mirror up-to-date is _much_ easier than keeping a package
up-to-date (i.e. you just have to run rsync regularly).


Sure, I know.



 * I don't recall receiving any automated notices about problems, and
I've just tried searching for combinations of R 2.10, warning and
ggplot2 as well as several other attempts and haven't been able to
find anything.  I normally try and keep on top of issues like this.


Great, thank you very much for your efforts. Maybe it has been eaten by 
some spam filter (because of auto-generation?). So this is a good chance 
to start working on the issues now. ;-)


Best wishes,
Uwe






Hadley



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Removing cran mirrors

2009-11-03 Thread Uwe Ligges



Hadley Wickham wrote:

 * I don't recall receiving any automated notices about problems, and
I've just tried searching for combinations of R 2.10, warning and
ggplot2 as well as several other attempts and haven't been able to
find anything.  I normally try and keep on top of issues like this.

Great, thank you very much for your efforts. Maybe it has been eaten by some
spam filter (because of auto-generation?). So this is a good chance to start
working on the issues now. ;-)


Found the problem - it's because the subject is the same regardless of
whether or not the package builds without warnings.  I just saw the
emails and ignored them because I thought everything was ok.  Perhaps
the subject line could include WARNING if something went wrong?



OK, good point, will change before the next release.

Best wishes,
Uwe


Hadley




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] memory misuse in subscript code when rep() is called in odd way

2009-11-03 Thread Seth Falcon

Hi,

On 11/3/09 2:28 PM, William Dunlap wrote:

The following odd call to rep()
gives somewhat random results:


rep(1:4, 1:8, each=2)


I've committed a fix for this to R-devel.

I admit that I had to reread the rep man page, as I first thought this 
was not a valid call to rep since times (1:8) is longer than x (1:4), 
but a closer reading of the man page says:


  > If times is a vector of the same length as x (after replication
  > by each), the result consists of x[1] repeated times[1] times,
  > x[2] repeated times[2] times and so on.

So the expected result is the same as rep(rep(1:4, each=2), 1:8).
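To make that concrete, here is a small sketch of the intended semantics; the values follow directly from the man-page rule quoted above (the variable names are just illustrative):

```r
## 'each' is applied first, then 'times' recycles element-wise over the
## expanded vector, so times (1:8) must match length(rep(1:4, each = 2)).
x <- rep(1:4, each = 2)          # 1 1 2 2 3 3 4 4  (length 8)
expected <- rep(x, times = 1:8)  # what rep(1:4, 1:8, each = 2) should give
length(expected)                 # sum(1:8) == 36
table(expected)                  # 1 occurs 3x, 2 occurs 7x, 3 occurs 11x, 4 occurs 15x
```

Note that the first 36 elements of the (non-erroring) buggy outputs above do match this, before trailing off into NAs.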


valgrind says that the C code is using uninitialized data:

rep(1:4, 1:8, each=2)

==26459== Conditional jump or move depends on uninitialised value(s)
==26459==at 0x80C557D: integerSubscript (subscript.c:408)
==26459==by 0x80C5EDC: Rf_vectorSubscript (subscript.c:658)


A little investigation seems to suggest that the problem is originating 
earlier.  Debugging in seq.c:do_rep I see the following:


> rep(1:4, 1:8, each=2)

Breakpoint 1, do_rep (call=0x102de0068, op=<value temporarily unavailable, 
due to optimizations>, args=<value temporarily unavailable, due to 
optimizations>, rho=0x1018829f0) at 
/Users/seth/src/R-devel-all/src/main/seq.c:434
434 ans = do_subset_dflt(R_NilValue, R_NilValue, list2(x, ind), rho);

(gdb) p Rf_PrintValue(ind)
 [1]  1  1  1  2  2  2
 [7]  2  2  2  2  3  3
[13]  3  3  3  3  3  3
[19]  3  3  3  4  4  4
[25]  4  4  4  4  4  4
[31]  4  4  4  4  4  4
[37]   44129344  1   44129560  1   44129776  1
[43]   44129992  1   44099592  1   44099808  1
[49]   44100024  1   44100456  127241443801089
[55] -536870733  0   54857992  1   22275728  1
[61]2724144  1 34  1   44100744  1
[67]   44100960  1   44101176  1   43652616  1
$2 = void
(gdb) c
Continuing.
Error: only 0's may be mixed with negative subscripts

The patch I applied adjusts how the index vector length is computed when 
times has length more than one.
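Judging from the gdb trace, do_rep expands x by 'each', builds an integer index vector, and then subscripts with it via do_subset_dflt. An R-level sketch of that scheme (names here are illustrative, not the actual C identifiers) shows why the index length must be sum(times):

```r
## R-level sketch of the internal computation: the index vector 'ind'
## selects positions of the each-expanded x, so its length must equal
## sum(times); the bug left part of it uninitialised.
x     <- rep(1:4, each = 2)
times <- 1:8
ind   <- rep(seq_along(x), times = times)  # length sum(times) == 36
ans   <- x[ind]
identical(ans, rep(rep(1:4, each = 2), 1:8))  # TRUE
```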


+ seth

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] parse_Rd and/or lazyload problem

2009-11-03 Thread Seth Falcon

Hi,

On 11/3/09 6:51 PM, mark.braving...@csiro.au wrote:


file.copy( 'd:/temp/Rdiff.Rd', 'd:/temp/scrunge.Rd') # Rdiff.Rd from 'tools' 
package source

eglist<- list( scrunge=parse_Rd(  'd:/temp/scrunge.Rd'))
tools:::makeLazyLoadDB( eglist, 'd:/temp/ll')
e<- new.env()
lazyLoad( 'd:/temp/ll', e)
as.list( e) # force; OK

eglist1<- list( scrunge=parse_Rd(  'd:/temp/Rdiff.Rd'))
tools:::makeLazyLoadDB( eglist1, 'd:/temp/ll')
e<- new.env()
lazyLoad( 'd:/temp/ll', e)
as.list( e) # Splat

It doesn't make any difference which file I process first; the error comes the 
second time round.


If I adjust this example in terms of paths and run on OS X, I get the 
following error on the second run:


> as.list(e) # Splat
Error in as.list.environment(e) : internal error -3 in R_decompress1

I haven't looked further yet.
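For comparison, the same round trip with plain (non-Rd) objects can be sketched as below — using the internal, unsupported makeLazyLoadDB exactly as in the repro, with an illustrative temp path; at least with plain vectors this does not appear to trigger the decompression error:

```r
## Minimal lazy-load round trip with ordinary objects instead of
## parse_Rd() results (path is illustrative).
db <- file.path(tempdir(), "lldemo")
tools:::makeLazyLoadDB(list(a = 1:10, b = letters), db)
e <- new.env()
lazyLoad(db, e)
as.list(e)  # forcing the promises recovers a and b intact
```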

+ seth

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel