[Rd] namespace crash on S3method("as.ff",function)

2007-11-05 Thread Jens Oehlschlägel
Dear all,

I have defined a generic as.ff(x, ...) and a method as.ff.function(x, ...) 
which converts a standard R function x into a chunked version operating on 
large ff objects. Everything works fine, but when registering 

S3method("as.ff",function)

in NAMESPACE, the installation fails with some kind of parsing error:

  adding build stamp to DESCRIPTION
  installing NAMESPACE file and metadata
Error in parse(nsFile) : unexpected ')' at
348:
349: S3method("as.ff",function)
Calls:  -> parseNamespaceFile -> parse
Execution halted
make[2]: *** [nmspace] Error 1
make[1]: *** [all] Error 2
make: *** [pkg-ff] Error 2
*** Installation of ff failed ***

Is this a bug? Any ideas?

Best regards


Jens


P.S. with as.ff() we can do things like
ffx <- as.ff(x) # as.ff.default() turns a standard R object into an ff object 
stored on disk
as.ff(log)(ffx) # as.ff.function() turns 'log' into a function that we can call 
on ffx: taking the log of an almost arbitrarily large object
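
A minimal sketch of the dispatch idea in the P.S., not the ff package's actual code: as_chunked() and chunk_size are hypothetical names, an ordinary numeric vector stands in for an ff object, and the wrapped function is assumed to work elementwise.

```r
# Hypothetical generic; this is NOT the ff package's as.ff().
as_chunked <- function(x, ...) UseMethod("as_chunked")

# Method chosen for objects whose implicit class is "function".
as_chunked.function <- function(x, chunk_size = 3L, ...) {
    function(v) {
        if (length(v) == 0L) return(x(v))
        # Assign consecutive elements of v to chunks of size chunk_size.
        chunk_id <- ceiling(seq_along(v) / chunk_size)
        # Apply the wrapped function chunk by chunk, then reassemble.
        pieces <- lapply(split(v, chunk_id), x)
        unlist(pieces, use.names = FALSE)
    }
}

v <- c(1, 10, 100, 1000, 10000)
chunked_log10 <- as_chunked(log10)
result <- chunked_log10(v)
print(result)
```

Registering such a method in NAMESPACE is what needs the class name "function", which is where the parse error above comes from.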


> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.0
year           2007
month          10
day            03
svn rev        43063
language       R
version.string R version 2.6.0 (2007-10-03)


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] namespace crash on S3method("as.ff",function)

2007-11-05 Thread Duncan Murdoch
On 11/5/2007 7:41 AM, Jens Oehlschlägel wrote:
> Dear all,
> 
> I have defined a generic as.ff(x, ...) and a method as.ff.function(x, ...) 
> which converts a standard R function x into a chunked version operating on 
> large ff objects. Everything works fine, but when registering 
> 
> S3method("as.ff",function)
> 
> in NAMESPACE, the installation fails with some kind of parsing error:
> 
>   adding build stamp to DESCRIPTION
>   installing NAMESPACE file and metadata
> Error in parse(nsFile) : unexpected ')' at
> 348:
> 349: S3method("as.ff",function)
> Calls:  -> parseNamespaceFile -> parse
> Execution halted
> make[2]: *** [nmspace] Error 1
> make[1]: *** [all] Error 2
> make: *** [pkg-ff] Error 2
> *** Installation of ff failed ***
> 
> Is this a bug? Any ideas?

"function" is a reserved keyword for the parser.  Even though you don't 
execute a NAMESPACE file, it's parsed by the standard parser, and that's 
causing the problem.  (You couldn't have a function call that looked 
like that without a parse error, either.)

I don't have time to explore this today, but what I'd do is try

S3method("as.ff", `function`)

or

S3method("as.ff","function")

first, and if those don't fix it and you don't get a better suggestion, 
then report this as a design bug.
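
The parser's behaviour can be checked without installing anything. A small sketch, assuming only base R: parse() is run on the three spellings as text, and nothing is evaluated. It shows whether each line parses; it does not test whether parseNamespaceFile() accepts each form.

```r
# NAMESPACE-style directives, treated purely as text to be parsed.
directives <- c(unquoted   = 'S3method("as.ff", function)',
                backticked = 'S3method("as.ff", `function`)',
                quoted     = 'S3method("as.ff", "function")')

# TRUE where the standard parser accepts the line, FALSE where it fails.
parses <- vapply(directives, function(txt) {
    !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}, logical(1))

print(parses)
```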

Duncan Murdoch

> Best regards
> 
> 
> Jens
> 
> 
> P.S. with as.ff() we can do things like
> ffx <- as.ff(x) # as.ff.default() turns a standard R object into an ff object 
> stored on disk
> as.ff(log)(ffx) # as.ff.function() turns 'log' into a function that we can 
> call on ffx: taking the log of an almost arbitrarily large object
> 
> 
>> version
>                _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          6.0
> year           2007
> month          10
> day            03
> svn rev        43063
> language       R
> version.string R version 2.6.0 (2007-10-03)
>



Re: [Rd] namespace crash on S3method("as.ff",function)

2007-11-05 Thread Prof Brian Ripley

On Mon, 5 Nov 2007, "Jens Oehlschlägel" wrote:


Dear all,

I have defined a generic as.ff(x, ...) and a method as.ff.function(x, ...) 
which converts a standard R function x into a chunked version operating on 
large ff objects. Everything works fine, but when registering

S3method("as.ff",function)

in NAMESPACE, the installation fails with some kind of parsing error:

 adding build stamp to DESCRIPTION
 installing NAMESPACE file and metadata
Error in parse(nsFile) : unexpected ')' at
348:
349: S3method("as.ff",function)
Calls:  -> parseNamespaceFile -> parse
Execution halted
make[2]: *** [nmspace] Error 1
make[1]: *** [all] Error 2
make: *** [pkg-ff] Error 2
*** Installation of ff failed ***

Is this a bug? Any ideas?


You need to quote "function".  It falls under

  (Note that variable names may be quoted, and non-standard names such as
  @code{[<-.fractions} must be.)

since reserved words cannot be standard names.  But it could stand being 
spelt out in R-exts.


There are examples in package 'utils'.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[Rd] Should numeric()/character() etc initialize with NA instead of 0 or ""?

2007-11-05 Thread Fabian Scheipl
Wouldn't it make programming more error-resistant if vectors were
initialized with missing data, instead of zeroes or ""?

That way, if you assign values to a vector elementwise and miss some
elements (because their indices were not selected or because the
assignment didn't work out; see below for code examples), this would be
immediately obvious from the values of the vector elements themselves,
and programming errors would be far harder to overlook.

e.g.

x <- numeric(n)
for( i in seq(along = x) )
{
  try(x[i] <- function.which.might.crash( args[i] ))
}

or

x <- numeric(n)
x[condition1] <- foo(args1)
x[condition2] <- foo(args2)
...
x[conditionN] <- foo(argsN)

will produce x without any NAs even if function.which.might.crash()
actually did crash during the loop, or if there are indices for which
none of conditions 1 to N were true. You then cannot distinguish between
zeroes that are real results and zeroes that remained unchanged since
initialization of the vector.

In a sense, initializing with NAs would also be more consistent with
vector(n, mode = "list"), which produces a list of n NULL-objects.
(numeric(10) is just a wrapper for vector(10, mode="numeric"))
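
For reference, a short sketch of the initial values being compared here, using base R only (n = 3 is an arbitrary choice):

```r
n <- 3L

empty_list <- vector("list", n)   # n NULL elements
zeros      <- numeric(n)          # n zeroes
blanks     <- character(n)        # n empty strings

# numeric(n) and vector("numeric", n) give the same result.
same <- identical(numeric(n), vector("numeric", n))

str(empty_list)
print(zeros)
print(blanks)
print(same)
```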

Let me know what you think.

Regards,
Fabian

[[alternative HTML version deleted]]



Re: [Rd] Should numeric()/character() etc initialize with NA instead of 0 or ""?

2007-11-05 Thread Prof Brian Ripley
On Mon, 5 Nov 2007, Fabian Scheipl wrote:

> Wouldn't it make programming more error-resistant if vectors were
> initialized with missing data, instead of zeroes or ""?

Lots of code relies on this.  It's common programming practice (and not 
just in R/S).

> That way, if you assign values to a vector elementwise and you miss some
> elements
> (because their indices were not selected or because the assignment didn't
> work out, see below for code examples)
> this would be immediately obvious from the value of the vector elements
> themselves
> and programming errors would be far less easy to overlook.

But using x <- rep(NA_real_, n) does this for you, and is much clearer to 
the reader. Using x <- numeric(n) is only appropriate if you want '0.0' 
elements.
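
A small sketch of the difference, assuming a hypothetical worker risky_sqrt() that fails for some inputs (it stands in for function.which.might.crash() from the original post):

```r
# Hypothetical worker: errors on negative input.
risky_sqrt <- function(a) {
    if (a < 0) stop("negative input")
    sqrt(a)
}

inputs <- c(4, -1, 0, 9)

x_zero <- numeric(length(inputs))         # every element starts as 0
x_na   <- rep(NA_real_, length(inputs))   # every element starts as NA

for (i in seq_along(inputs)) {
    # A failed call leaves the element at its initial value.
    try(x_zero[i] <- risky_sqrt(inputs[i]), silent = TRUE)
    try(x_na[i]   <- risky_sqrt(inputs[i]), silent = TRUE)
}

print(x_zero)   # the failed element looks like the genuine 0 next to it
print(x_na)     # the failed element is still NA
```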

> e.g.
>
> x <- numeric(n)
> for( i in seq(along = x) )
> {
>  try(x[i] <- function.which.might.crash( args[i] ))
> }
>
> or
>
> x <- numeric(n)
> x[condition1] <- foo(args1)
> x[condition2] <- foo(args2)
> ...
> x[conditionN] <- foo(argsN)
>
> will produce x without any NAs even if function.which.might.crash() actually
> did crash during the loop or
> if there are indices for which none of conditions 1 to N were true and you
> cannot distinguish between zeroes which
> are real results and zeroes that remained unchanged since initialization of
> the vector.
>
> In a sense, initializing with NAs would also be more consistent with
> vector(n, mode = "list"), which produces a list of n NULL-objects.
> (numeric(10) is just a wrapper for vector(10, mode="numeric"))
>
> Let me know what you think.
>
> Regards,
> Fabian
>
>   [[alternative HTML version deleted]]

You were specifically asked not to do that.


-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[Rd] dot in function name taken as S3 method by package check

2007-11-05 Thread Liaw, Andy
Hello everyone,

I'm trying to update the locfit package so that it passes package check
in R 2.6.0.  However, the check seems to think some of the functions
with dot in the names are S3 methods (thus warns about the format of the
\usage{} part) when they are not.  Can anyone recommend a workaround for
this?  I tried reading R-exts, but couldn't find any hint.  I'd very
much appreciate any help!

Best,
Andy

Andy Liaw, PhD
Biometrics Research      PO Box 2000 RY33-300
Merck Research Labs      Rahway, NJ 07065
andy_liaw(a)merck.com    732-594-0820






Re: [Rd] dot in function name taken as S3 method by package check

2007-11-05 Thread Duncan Murdoch
On 05/11/2007 3:48 PM, Liaw, Andy wrote:
> Hello everyone,
> 
> I'm trying to update the locfit package so that it passes package check
> in R 2.6.0.  However, the check seems to think some of the functions
> with dot in the names are S3 methods (thus warns about the format of the
> \usage{} part) when they are not.  Can anyone recommend a workaround for
> this?  I tried reading R-exts, but couldn't find any hint.  I'd very
> much appreciate any help!

I think you need a NAMESPACE file.  That's where you declare whether 
things are S3 methods or not.  Without a NAMESPACE, a function like 
locfit.raw could act as the locfit method for raw objects, if someone 
ever declared a locfit generic function.  Probably not what was intended.
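
A sketch of the accidental dispatch described above, using hypothetical names (smooth_fit and smooth_fit.raw are illustrations, not locfit functions), run at top level without any NAMESPACE:

```r
# An ordinary helper that merely has a dot in its name.
smooth_fit.raw <- function(x, ...) "helper smooth_fit.raw() was called"

# Later, someone defines a generic whose name matches the prefix.
smooth_fit <- function(x, ...) UseMethod("smooth_fit")

# A raw vector has implicit class "raw", so S3 dispatch picks the helper.
picked <- smooth_fit(as.raw(1:3))
print(picked)
```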

Duncan Murdoch



[Rd] FYI: issue with arpa/inet.h on SunOS 5.9 (old gcc?)

2007-11-05 Thread Don MacQueen
This is just a report of my experience installing R on SunOS 5.9
today, not a request for help
(in case anyone cares; if not, I apologize for the distraction).

I am building R 2.6.0 (patched; svn revision 43319, 2007-11-01) and 
encountered the problem described below.

I believe the problem is an old gcc (version 3.0.4, built some 5 
years ago), because the warnings do not occur when I specify
CC = cc
in the environment before configuring, and building R succeeds.

Hence I'm mailing to r-devel instead of r-bugs, as suggested in the 
warning messages.

I don't have much information about the cc I used (I'm not the 
sysadmin of this or any Solaris machine), other than it resides in 
/opt/SUNWspro, and appears to be part of "Sun Studio 11", whatever 
that is.


The messages from R's configure were:

configure: WARNING: arpa/inet.h: present but cannot be compiled
configure: WARNING: arpa/inet.h: check for missing prerequisite headers?
configure: WARNING: arpa/inet.h: see the Autoconf documentation
configure: WARNING: arpa/inet.h: section "Present But Cannot Be Compiled"
configure: WARNING: arpa/inet.h: proceeding with the preprocessor's result
configure: WARNING: arpa/inet.h: in the future, the compiler will take precedence
configure: WARNING: ## --- ##
configure: WARNING: ## Report this to [EMAIL PROTECTED] ##
configure: WARNING: ## --- ##

And then the same set of warnings for
   netdb.h
   netinet/in.h
   sys/socket.h

At the very end configure reports:

configure: WARNING: could not determine type of socket length


Then, make fails with:

In file included from /usr/include/netinet/in.h:41,
  from /usr/include/netdb.h:98,
  from ../../../R-patched/src/main/platform.c:1586:
/usr/include/sys/stream.h:307: parse error before "projid_t"
make[3]: *** [platform.o] Error 1
make[3]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src/main'
make[2]: *** [R] Error 2
make[2]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src/main'
make[1]: *** [R] Error 1
make[1]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src'
make: *** [R] Error 1

-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[Rd] A suggestion for an amendment to tapply

2007-11-05 Thread Andrew Robinson
Dear R-developers,

when tapply() is invoked on factors that have empty levels, it returns
NA.  This behaviour is in accord with the tapply documentation, and is
reasonable in many cases.  However, when FUN is sum, it would also
seem reasonable to return 0 instead of NA, because "the sum of an
empty set is zero, by definition."

I'd like to raise a discussion of the possibility of an amendment to
tapply.

The attached patch changes the function so that it checks if there are
any empty levels, and if there are, replaces the corresponding NA
values with the result of applying FUN to the empty set.  Eg in the
case of sum, it replaces the NA with 0, whereas with mean, it replaces
the NA with NA, and issues a warning.

This change has the following advantage: tapply and sum work better
together.  Arguably, tapply and any other function that has a non-NA
response to the empty set will also work better together.
Furthermore, tapply shows a warning if FUN would normally show a
warning upon being evaluated on an empty set.  That deviates from
current behaviour, which might be bad, but also provides information
that might be useful to the user, so that would be good.
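
A short sketch of the behaviour under discussion, using the same data as the test script: stock tapply() reports NA for the empty level "2", while sum() applied directly to an empty vector gives 0.

```r
group <- factor(c(1, 1, 3, 3), levels = c("1", "2", "3"))
x <- c(1, 2, 3, 4)

# Level "2" has no observations, so tapply() leaves its cell as NA.
by_level <- tapply(x, group, sum)

# sum() itself has a well-defined value on an empty vector.
empty_sum <- sum(x[0])

print(by_level)
print(empty_sum)
```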

The attached script provides the new function in full, and
demonstrates its application in some simple test cases.

Best wishes,

Andrew
-- 
Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/ 
## The new function

my.tapply <- function (X, INDEX, FUN=NULL, ..., simplify=TRUE)
{
    FUN <- if (!is.null(FUN)) match.fun(FUN)
    if (!is.list(INDEX)) INDEX <- list(INDEX)
    nI <- length(INDEX)
    namelist <- vector("list", nI)
    names(namelist) <- names(INDEX)
    extent <- integer(nI)
    nx <- length(X)
    one <- as.integer(1)
    group <- rep.int(one, nx)#- to contain the splitting vector
    ngroup <- one
    for (i in seq.int(INDEX)) {
        index <- as.factor(INDEX[[i]])
        if (length(index) != nx)
            stop("arguments must have same length")
        namelist[[i]] <- levels(index)#- all of them, yes !
        extent[i] <- nlevels(index)
        group <- group + ngroup * (as.integer(index) - one)
        ngroup <- ngroup * nlevels(index)
    }
    if (is.null(FUN)) return(group)
    ans <- lapply(split(X, group), FUN, ...)
    index <- as.numeric(names(ans))
    if (simplify && all(unlist(lapply(ans, length)) == 1)) {
        ansmat <- array(dim=extent, dimnames=namelist)
        ans <- unlist(ans, recursive = FALSE)
    }
    else  {
        ansmat <- array(vector("list", prod(extent)),
                        dim=extent, dimnames=namelist)
    }
    ## old : ansmat[as.numeric(names(ans))] <- ans
    names(ans) <- NULL
    ansmat[index] <- ans
    if (sum(table(INDEX) < 1) > 0)
        ansmat[table(INDEX) < 1] <- do.call(FUN, list(c(NULL), ...))
    ansmat
}

## Check its utility

group <- factor(c(1,1,3,3), levels=c("1","2","3"))
x <- c(1,2,3,4)

## Ok with mean?

tapply(x, group, mean)
my.tapply(x, group, mean)

## Ok with sum?

tapply(x, group, sum)
my.tapply(x, group, sum)

## Check that other arguments are carried through

x <- c(NA,2,3,10)

tapply(x, group, sum, na.rm=TRUE)
tapply(x, group, mean, na.rm=TRUE)

my.tapply(x, group, sum, na.rm=TRUE)
my.tapply(x, group, mean, na.rm=TRUE)

## Check that listed groups work ok also

group.2 <- factor(c(1,2,3,3), levels=c("1","2","3"))

tapply(x, list(group, group.2), sum, na.rm=TRUE)
tapply(x, list(group, group.2), mean, na.rm=TRUE)

my.tapply(x, list(group, group.2), sum, na.rm=TRUE)
my.tapply(x, list(group, group.2), mean, na.rm=TRUE)



Re: [Rd] A suggestion for an amendment to tapply

2007-11-05 Thread Bill.Venables
Unfortunately I think it would break too much existing code.  tapply()
is an old function and many people have gotten used to the way it works
now.

This is not to suggest there could not be another argument added at the
end to indicate that you want the new behaviour, though.  e.g. 

tapply <- function (X, INDEX, FUN=NULL, ..., simplify=TRUE,
handle.empty.levels = FALSE) 

but this raises the question of what sort of time penalty the
modification might entail.  Probably not much for most situations, I
suppose.  (I know this argument name looks long, but you do need a
fairly specific argument name, or it will start to impinge on the ...
argument.)
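
A rough sketch of what such an opt-in argument could look like, written as a wrapper rather than a change to tapply() itself. The name tapply_opt is hypothetical, and the sketch assumes FUN returns a single value on empty input; it is not a proposed implementation.

```r
# Hypothetical wrapper: same interface as tapply() plus an opt-in flag.
tapply_opt <- function(X, INDEX, FUN = NULL, ..., simplify = TRUE,
                       handle.empty.levels = FALSE) {
    ans <- tapply(X, INDEX, FUN, ..., simplify = simplify)
    if (handle.empty.levels && !is.null(FUN)) {
        # Cells with no observations, in the same layout as the result.
        empty <- table(INDEX) == 0
        # Fill them with FUN evaluated on a zero-length piece of X.
        if (any(empty)) ans[empty] <- match.fun(FUN)(X[0L], ...)
    }
    ans
}

group <- factor(c(1, 1, 3, 3), levels = c("1", "2", "3"))
x <- c(1, 2, 3, 4)

old_way <- tapply_opt(x, group, sum)
new_way <- tapply_opt(x, group, sum, handle.empty.levels = TRUE)
print(old_way)
print(new_way)
```

Because the new argument comes after ..., it has to be spelled out in full in the call; an abbreviated name would be passed on to FUN instead.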

Just some thoughts.

Bill Venables.

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/ 



Re: [Rd] FYI: issue with arpa/inet.h on SunOS 5.9 (old gcc?)

2007-11-05 Thread Prof Brian Ripley
What OS was that compiler built for?  This happens when you have a 
version of gcc built for the wrong version of the OS, as gcc captures 
system headers.  (There's a warning about that in the R-admin manual.)

The 'report to' message is autogenerated by autoconf.

SunStudio 11 is a recent version of Sun's compilers, and much to be 
preferred to gcc 3.0.4 on that platform (and probably to any version of 
gcc there).

On Mon, 5 Nov 2007, Don MacQueen wrote:

> This just information of my experience installing R on SunOS 5.9
> today, not a request for help.
> (in case anyone cares, and if not, I apologize for the distraction)
>
> I am building R 2.6.0 (patched; svn revision 43319, 2007-11-01) and
> encountered the problem described below.
>
> I believe the problem is an old gcc (version 3.0.4, built some 5
> years ago), because the warnings do not occur when I specify
>CC = cc
> in the environment before configuring, and building R succeeds.
>
> Hence I'm mailing to r-devel instead of r-bugs, as suggested in the
> warning messages.
>
> I don't have much information about the cc I used (I'm not the
> sysadmin of this or any Solaris machine), other than it resides in
> /opt/SUNWspro, and appears to be part of "Sun Studio 11", whatever
> that is.
>
>
> The messages from R's configure were:
>
> configure: WARNING: arpa/inet.h: present but cannot be compiled
> configure: WARNING: arpa/inet.h: check for missing prerequisite headers?
> configure: WARNING: arpa/inet.h: see the Autoconf documentation
> configure: WARNING: arpa/inet.h: section "Present But Cannot Be Compiled"
> configure: WARNING: arpa/inet.h: proceeding with the preprocessor's result
> configure: WARNING: arpa/inet.h: in the future, the compiler will
> take precedence
> configure: WARNING: ## --- ##
> configure: WARNING: ## Report this to [EMAIL PROTECTED] ##
> configure: WARNING: ## --- ##
>
> And then the same set of warnings for
>   netdb.h
>   netinet/in.h
>   sys/socket.h
>
> At the very end configure reports:
>
> configure: WARNING: could not determine type of socket length
>
>
> Then, make fails with:
>
> In file included from /usr/include/netinet/in.h:41,
>  from /usr/include/netdb.h:98,
>  from ../../../R-patched/src/main/platform.c:1586:
> /usr/include/sys/stream.h:307: parse error before "projid_t"
> make[3]: *** [platform.o] Error 1
> make[3]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src/main'
> make[2]: *** [R] Error 2
> make[2]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src/main'
> make[1]: *** [R] Error 1
> make[1]: Leaving directory `/apps/kosapps/R/R-2.6.0/build/src'
> make: *** [R] Error 1
>
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] A suggestion for an amendment to tapply

2007-11-05 Thread Prof Brian Ripley
On Tue, 6 Nov 2007, [EMAIL PROTECTED] wrote:

> Unfortunately I think it would break too much existing code.  tapply()
> is an old function and many people have gotten used to the way it works
> now.

It is also not necessarily desirable: FUN(numeric(0)) might be an error.
For example:

> Z <- data.frame(x=rnorm(10), f=rep(c("a", "b"), each=5))[1:5, ]
> tapply(Z$x, Z$f, sd)

but sd(numeric(0)) is an error.  (Similar things involving var are 'in the 
wild' and so would be broken.)
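
The same point in a self-contained sketch, assuming a hypothetical summary function strict_spread() that refuses empty input (used here in place of sd so the outcome does not depend on how sd treats a zero-length vector):

```r
# Hypothetical summary that errors when given no observations.
strict_spread <- function(v) {
    if (length(v) == 0L) stop("no observations")
    max(v) - min(v)
}

f <- factor(rep("a", 5), levels = c("a", "b"))   # level "b" is empty
x <- c(2, 4, 6, 8, 10)

# Stock tapply() only calls FUN on the occupied level, so this succeeds.
current <- tapply(x, f, strict_spread)

# Evaluating FUN on the empty set, as the patch would, raises the error.
on_empty <- tryCatch(strict_spread(x[0L]),
                     error = function(e) conditionMessage(e))

print(current)
print(on_empty)
```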

> This is not to suggest there could not be another argument added at the
> end to indicate that you want the new behaviour, though.  e.g.
>
> tapply <- function (X, INDEX, FUN=NULL, ..., simplify=TRUE,
> handle.empty.levels = FALSE)
>
> but this raises the question of what sort of time penalty the
> modification might entail.  Probably not much for most situations, I
> suppose.  (I know this argument name looks long, but you do need a
> fairly specific argument name, or it will start to impinge on the ...
> argument.)
>
> Just some thoughts.
>
> Bill Venables.
>
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile: +61 4 8819 4402
> Home Phone: +61 7 3286 7700
> mailto:[EMAIL PROTECTED]
> http://www.cmis.csiro.au/bill.venables/
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Robinson
> Sent: Tuesday, 6 November 2007 3:10 PM
> To: R-Devel
> Subject: [Rd] A suggestion for an amendment to tapply
>
> Dear R-developers,
>
> when tapply() is invoked on factors that have empty levels, it returns
> NA.  This behaviour is in accord with the tapply documentation, and is
> reasonable in many cases.  However, when FUN is sum, it would also
> seem reasonable to return 0 instead of NA, because "the sum of an
> empty set is zero, by definition."
>
> I'd like to raise a discussion of the possibility of an amendment to
> tapply.
>
> The attached patch changes the function so that it checks if there are
> any empty levels, and if there are, replaces the corresponding NA
> values with the result of applying FUN to the empty set.  Eg in the
> case of sum, it replaces the NA with 0, whereas with mean, it replaces
> the NA with NA, and issues a warning.
>
> This change has the following advantage: tapply and sum work better
> together.  Arguably, tapply and any other function that has a non-NA
> response to the empty set will also work better together.
> Furthermore, tapply shows a warning if FUN would normally show a
> warning upon being evaluated on an empty set.  That deviates from
> current behaviour, which might be bad, but also provides information
> that might be useful to the user, so that would be good.
>
> The attached script provides the new function in full, and
> demonstrates its application in some simple test cases.
>
> Best wishes,
>
> Andrew
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
