Re: [Rd] R 2.5.1 - ?factor examples, details

2007-07-03 Thread Peter Dalgaard
François Pinard wrote:
> Hi, R people.
>
> In ?factor, in the "Examples:" section, we see:
>
>   ## suppose you want "NA" as a level, and to allowing missing values.
>   (x <- factor(c(1, 2, "NA"), exclude = ""))
>   is.na(x)[2] <- TRUE
>   x  # [1] 1    <NA> NA, <NA> used because NA is a level.
>   is.na(x)
>   # [1] FALSE  TRUE FALSE
>
> I'm a bit confused by this example, as I do not understand the point 
> being made.  Using 'exclude = ""' or not does not change the outcome.
> What is being demonstrated by this clause, here?  Isn't "NA" a mere 
> string, not really related to a missing value?
>
> It might also be some kind of linguistic problem, and I'm not a native 
> English speaker.  The "and to allowing" construct sounds strange to me.  
> I would expect either "and to allow" or "and allowing", but maybe I'm 
> plainly missing the meaning of the statement.
>
> Could this be clarified somehow?
>
>   
I think this is a relic. In the olden days, there was no such thing as a 
missing character value, and factor() would behave like

 >  (x <- factor(c(1, 2, "NA"), exclude = "NA"))
[1] 1    2    <NA>
Levels: 1 2

...which was a pain when dealing with abbreviations for "noradrenalin", 
"North America", "New Alliance", "Neil Armstrong", etc. So character NA 
was added to R, and the example became irrelevant without anyone noticing.
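
For readers on a current R, the distinction described above can be checked directly. This is a minimal sketch, assuming a recent R version in which the string "NA" is an ordinary value and only a true NA counts as missing; the variable names are illustrative only.

```r
# The string "NA" is an ordinary value; it becomes a regular level.
x <- factor(c(1, 2, "NA"))
levels(x)        # "1" "2" "NA"
any(is.na(x))    # FALSE: no element is missing

# A true missing value is dropped from the levels by default ...
y <- factor(c("1", "2", NA))
levels(y)        # "1" "2"
is.na(y)         # FALSE FALSE TRUE

# ... unless exclude = NULL asks for NA to be kept as a level.
z <- factor(c("1", "2", NA), exclude = NULL)
nlevels(z)       # 3
```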

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] File lock mechanisms in R

2007-07-03 Thread Prof Brian Ripley
On Mon, 2 Jul 2007, Henrik Bengtsson wrote:

> Hi,
>
> is there a (cross-platform) file-locking mechanism available in R (or
> via some package)?

I don't believe there really is a cross-platform file-locking mechanism 
available to any language.  File-locking is an OS feature, and the 
semantics differ.  For those unfamiliar with this, the Wikipedia article 
is a good start (but ignores the POSIX lockf interface).

> I am looking for a way to have one R session lock a file for
> read/write access, while being updated/modified by another R session.
> This will provide me with a poor man's parallelization method.  It is
> ok to have so-called advisory locking (as in Unix), which is
> non-mandatory to follow.  If not available, I'll use lock files, but
> there are some potential problems in creating such in an atomic way.

Depends what you mean by 'atomic'.  In R, the only way to have 
non-interruptible operations is via .Call or similar: the evaluator is 
interruptible at all times so it seems that this issue applies equally to 
all file locking from R.
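
The lock-file idea mentioned in the question can be sketched with a lock directory. This is only an illustration under stated assumptions: acquire_lock() and release_lock() are hypothetical helpers, not part of R, and the sketch assumes that creating a directory either succeeds or fails as a whole on the file system in use, which is an OS property rather than something R guarantees. The caveat above about the evaluator being interruptible still applies.

```r
# Hypothetical helpers (not part of R): use a directory as a lock.
# Assumption: directory creation is all-or-nothing on this file system.
acquire_lock <- function(lockdir) {
  # dir.create() returns TRUE only if this call created the directory
  isTRUE(dir.create(lockdir, showWarnings = FALSE))
}

release_lock <- function(lockdir) {
  unlink(lockdir, recursive = TRUE)
  invisible(!file.exists(lockdir))
}

lockdir <- file.path(tempdir(), "demo.lock")
got_first  <- acquire_lock(lockdir)   # lock is free
got_second <- acquire_lock(lockdir)   # already held
release_lock(lockdir)
got_third  <- acquire_lock(lockdir)   # free again
release_lock(lockdir)
```

Such a lock is purely advisory: it only works if every cooperating session checks it before touching the shared file.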

> Ideally I wish to have this working on all platforms.
>
> Cheers
>
> Henrik
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Martin Maechler
> "DM" == Duncan Murdoch <[EMAIL PROTECTED]>
> on Mon, 02 Jul 2007 21:56:23 -0400 writes:

DM> On 28/06/2007 5:05 PM, Peter Ruckdeschel wrote:
>> Hi,
>> 
>> I noticed a minor flaw in integrate() from package stats:
>> 
>> Taking up arguments lower and upper from integrate(),
>> 
>> if (lower ==  Inf) && (upper ==  Inf)
>> 
>> or
>> 
>> if (lower == -Inf) && (upper == -Inf)
>> 
>> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
>> 
>> Rather, it should return 0.

DM> Wouldn't it be better to return NA or NaN, for the same reason Inf/Inf 
DM> doesn't return 1?

DM> Duncan Murdoch

Yes indeed, I think it should return NaN.
Martin

>> 
>> Quick fix:
>> 
>> ### old code ###
>> ### [snip]
>> else {
>>     if (is.na(lower) || is.na(upper))
>>         stop("a limit is missing")
>>     if (is.finite(lower)) {
>>         inf <- 1
>>         bound <- lower
>>     }
>>     else if (is.finite(upper)) {
>>         inf <- -1
>>         bound <- upper
>>     }
>>     else {
>>         inf <- 2
>>         bound <- 0
>>     }
>>     wk <- .External("call_dqagi", ff, rho = environment(),
>>             as.double(bound), as.integer(inf), as.double(abs.tol),
>>             as.double(rel.tol), limit = limit, PACKAGE = "base")
>> }
>> ### [snip]
>> 
>> ### new code  to replace the old one ###
>> 
>> ### [snip]
>> else {
>>     if (is.na(lower) || is.na(upper))
>>         stop("a limit is missing")
>> 
>>     if (lower == upper){
>> 
>>         wk <- list("value" = 0, "abs.error" = 0,
>>                    "subdivisions" = subdivisions,
>>                    "ierr" = 0 )
>> 
>>     } else {
>>         if (is.finite(lower)) {
>>             inf <- 1
>>             bound <- lower
>>         }
>>         else if (is.finite(upper)) {
>>             inf <- -1
>>             bound <- upper
>>         }
>>         else {
>>             inf <- 2
>>             bound <- 0
>>         }
>>         wk <- .External("call_dqagi", ff, rho = environment(),
>>                 as.double(bound), as.integer(inf),
>>                 as.double(abs.tol), as.double(rel.tol),
>>                 limit = limit, PACKAGE = "base")
>> 
>>     }
>> }
>> ### [snip]
>> 
>> Best, Peter
>> 


[Rd] How to get the names of the classes exported by a specific package.

2007-07-03 Thread ernesto
Hi,

I'm writing some functions to generate Rd files for an S4 package. I want 
to have 2 character vectors with the names of the S4 classes and the 
methods exported by a package. To get the info about methods I'm using 
"getGenerics(where="package:FLCore")"; however, I cannot find a similar 
process to get the S4 classes. Are there functions to access this 
information?

Best and thanks

EJ



Re: [Rd] How to get the names of the classes exported by a specific package.

2007-07-03 Thread Martin Morgan
Hi Ernesto,

As a hack,

> library(FLCore)
Loading required package: lattice
FLCore 1.4-4 - "Golden Jackal" 
> these <- ls("package:FLCore", all.names=TRUE)
> res <- metaNameUndo(these, prefix="C")
> as.character(res)
 [1] "FLBiol"    "FLBiols"   "FLCatch"   "FLFleet"   "FLFleets"  "FLIndex"  
 [7] "FLIndices" "FLQuant"   "FLQuants"  "FLSR"      "FLStock"   "FLStocks" 

Martin
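
A less indirect route may be getClasses() from the methods package, which lists the S4 classes defined in a given environment or search-path position. The sketch below uses a made-up class, DemoStock, defined in the global environment so that it runs without FLCore. For the package itself one would presumably pass where = "package:FLCore", which has not been tried here.

```r
library(methods)

# Hypothetical class, defined only so the example is self-contained.
setClass("DemoStock", representation(catch = "numeric"))

# List the S4 classes whose definitions live in the global environment.
classes <- getClasses(where = globalenv())
print(classes)
```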

ernesto <[EMAIL PROTECTED]> writes:

> Hi,
>
> I'm writing some functions to generate Rd files for a S4 package. I want 
> to have 2 character vectors with the names of the S4 classes and the 
> methods exported by a package. To get the info about methods I'm using 
> "getGenerics(where="package:FLCore")" however I can not find a similar 
> process to get the S4 classes. Are there functions to access this 
> information ?
>
> Best and thanks
>
> EJ
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



[Rd] Forthcoming change in the API of the Matrix package

2007-07-03 Thread Douglas Bates
Martin and I will soon release a new version of the Matrix package
with a modified API.  This will affect the authors of any packages
that use calls to the C function R_GetCCallable to directly access C
functions in the DLL or shared object in the libs directory of
the Matrix package.  (If you didn't understand that last sentence,
relax - it means that you can ignore this message.)

We strongly suspect that I am the only such author (this mechanism is
used in the lme4 package) and, because I was the one who made the API
change, I do indeed know about it.  However, if others do use this
mechanism for, say, accessing functions in the CHOLMOD sparse matrix C
library, you should be aware of this.

The current version of the Matrix package is 0.99875-3.  This version
exports the C functions according to the old API.  The next version
will be 0.999375-0 using the new API.  I will soon upload version
0.99875-3 of the lme4 package that depends on

Matrix(<= 0.99875-3)

Version 0.999375-0 of the lme4 package will depend on

Matrix(>= 0.999375-0)

The changes in the API are in the functions as_cholmod_sparse,
as_cholmod_dense and as_cholmod_factor.  After the change the first
argument will be a pointer to a struct of the appropriate return type
(i.e. the first argument in as_cholmod_sparse is a cholmod_sparse *
and the second argument is an SEXP).  This allows the calling function
to handle both the allocation and the freeing of the storage for the
struct.

Also the new API provides several macros and typedefs for such
pointers to structs.

The development version of the Matrix package is available at

https://svn.R-project.org/R-packages/branches/Matrix-APIchange/

The corresponding  version of the lme4 package is at

https://svn.R-project.org/R-packages/branches/gappy-lmer/



Re: [Rd] termplot - changes in defaults

2007-07-03 Thread John Maindonald
While termplot is under discussion, here's another proposal. I'd like to
change the default for partial.resid to TRUE, and for smooth to
panel.smooth.  I'd be surprised if those changes were to break existing
code.
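
For concreteness, the proposed defaults correspond to the call below. This is a small sketch on a built-in data set, not taken from the thread; the model and the two-panel layout are illustrative choices. In R versions that have the ylim argument discussed further down, it could presumably be supplied in the same call.

```r
# Illustrative model on a built-in data set (not from the thread).
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)

pdf(NULL)                       # draw to a null device; no file is written
op <- par(mfrow = c(1, 2))      # one panel per model term
# The two settings proposed as new defaults, given explicitly:
termplot(fit, partial.resid = TRUE, smooth = panel.smooth)
par(op)
invisible(dev.off())

n_terms <- length(labels(terms(fit)))   # number of panels drawn
```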

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On Mon, 2 Jul 2007, [EMAIL PROTECTED] wrote:

> Precisely.  Thanks Brian.
>
> I did do something like this but not nearly so elegantly.
>
> I suggest this become the standard version in the next release.  I can't

Yes, that was the intention (to go into R-devel).
(It was also my intention to attach as plain text, but my Windows mailer
seems to have defeated that.)

> see that it can break any existing code.  It's a pity now we can't make
> ylim = "common" the default.

I suspect we could if I allow a way to get the previous behaviour
(ylim="free", I think).

Brian

> Regards,
> Bill V.
>
>
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile: +61 4 8819 4402
> Home Phone: +61 7 3286 7700
> mailto:[EMAIL PROTECTED]
> http://www.cmis.csiro.au/bill.venables/
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, 2 July 2007 7:55 PM
> To: Venables, Bill (CMIS, Cleveland)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Rd] termplot with uniform y-limits
>
> Is the attached the sort of thing you are looking for?
> It allows ylim to be specified, including as "common".
>
> On Mon, 2 Jul 2007, [EMAIL PROTECTED] wrote:
>
>> Does anyone have, or has anyone ever considered making, a version of
>> 'termplot' that allows the user to specify that all plots should have
>> the same y-limits?
>>
>> This seems a natural thing to ask for, as the plots share a y-scale.  If
>> you don't have the same y-axes you can easily misread the comparative
>> contributions of the different components.
>>
>> Notes: the current version of termplot does not allow the user to
>> specify ylim.  I checked.
>>
>>   the plot tools that come with mgcv do this by default.  Thanks
>> Simon.
>>
>>
>> Bill Venables
>> CSIRO Laboratories
>> PO Box 120, Cleveland, 4163
>> AUSTRALIA
>> Office Phone (email preferred): +61 7 3826 7251
>> Fax (if absolutely necessary):  +61 7 3826 7304
>> Mobile: +61 4 8819 4402
>> Home Phone: +61 7 3286 7700
>> mailto:[EMAIL PROTECTED]
>> http://www.cmis.csiro.au/bill.venables/
>>
>>
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[Rd] [OT] help in setting up a doxygen configuration file

2007-07-03 Thread Douglas Bates
I would appreciate some pointers on how to set up a doxygen
configuration file for C source code.  In particular I would like to
be able to generate a call graph.  I tend to write a lot of short
utility functions and, by the time the final design reveals itself, it
is quite possible that some of these utilities are called in only one
place.  That's not a bad thing to have happen but I would like to know
about it when it does occur.

Doxygen seems to emphasize C++ classes and I can't manage to get it to
do much with my C functions.  The package sources are available at

https://svn.R-project.org/R-packages/branches/gappy-lmer/

The doxygen configuration file is gappy-lmer/inst/doc/Doxyfile

I have written all the Javadoc-style comments in the source files such
as gappy-lmer/src/lmer.c but I can't seem to get doxygen to notice
them.



[Rd] 'inline' package update

2007-07-03 Thread Oleg Sklyar
Dear all,

the 'inline' package was updated to version 0.2.2 with the following
changes:

  - functions declared using 'cfunction' can now be saved, the code
is recompiled when the object is loaded (not yet implemented
for setCMethod)
  - full path to the R binary is used for compilation allowing for
the use of the correct R version when several are installed

The update has been submitted to CRAN and should appear shortly.
Meanwhile, the package is available from

http://www.ebi.ac.uk/~osklyar/inline

Best,
Oleg

-- 
Dr Oleg Sklyar * EBI/EMBL, Cambridge CB10 1SD, England * +44-1223-494466



Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Peter Ruckdeschel
Thanks Martin and Duncan for your
comments,

Martin Maechler wrote:
>> "DM" == Duncan Murdoch <[EMAIL PROTECTED]>
>> on Mon, 02 Jul 2007 21:56:23 -0400 writes:
> 
> DM> On 28/06/2007 5:05 PM, Peter Ruckdeschel wrote:
> >> Hi,
> >> 
> >> I noticed a minor flaw in integrate() from package stats:
> >> 
> >> Taking up arguments lower and upper from integrate(),
> >> 
> >> if (lower ==  Inf) && (upper ==  Inf)
> >> 
> >> or
> >> 
> >> if (lower == -Inf) && (upper == -Inf)
> >> 
> >> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
> >> 
> >> Rather, it should return 0.
> 
> DM> Wouldn't it be better to return NA or NaN, for the same reason Inf/Inf
> DM> doesn't return 1?
> 
> DM> Duncan Murdoch
> 
> Yes indeed, I think it should return NaN.

not quite convinced --- or more precisely:

[ Let's assume lower = upper = Inf here,
  case lower = upper = -Inf is analogous ]

I'd say it depends on whether the (Lebesgue-) integral

   integral(f, lower = , upper = Inf)

is well defined. Then, by dominated convergence, the integral should
default to 0.

But I admit that then a test

 is.finite(integrate(f, lower = , upper = Inf)$value)

would be adequate, too, which makes evaluation a little more expensive :-(

If, otoh

   integrate(f, lower = , upper = Inf)

throws an error, I agree, there should be a NaN ...
Best, Peter



Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Martin Maechler
> "PetRd" == Peter Ruckdeschel <[EMAIL PROTECTED]>
> on Tue, 03 Jul 2007 17:26:43 +0200 writes:

PetRd> Thanks Martin and Duncan for your
PetRd> comments,

PetRd> Martin Maechler wrote:
>>> "DM" == Duncan Murdoch <[EMAIL PROTECTED]>
>>> on Mon, 02 Jul 2007 21:56:23 -0400 writes:
>> 
DM> On 28/06/2007 5:05 PM, Peter Ruckdeschel wrote:
>> >> Hi,
>> >> 
>> >> I noticed a minor flaw in integrate() from package stats:
>> >> 
>> >> Taking up arguments lower and upper from integrate(),
>> >> 
>> >> if (lower ==  Inf) && (upper ==  Inf)
>> >> 
>> >> or
>> >> 
>> >> if (lower == -Inf) && (upper == -Inf)
>> >> 
>> >> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
>> >> 
>> >> Rather, it should return 0.
>> 
DM> Wouldn't it be better to return NA or NaN, for the same reason Inf/Inf 
DM> doesn't return 1?
>> 
DM> Duncan Murdoch
>> 
>> Yes indeed, I think it should return NaN.

PetRd> not quite convinced --- or more precisely:

PetRd> [ Let's assume lower = upper = Inf here,
PetRd> case lower = upper = -Inf is analogue ]

PetRd> I'd say it depends on whether the (Lebesgue-) integral

PetRd> integral(f, lower = , upper = Inf)

PetRd> is well defined. Then, by dominated convergence, the integral should
PetRd> default to 0.

PetRd> But I admit that then a test

PetRd> is.finite(integrate(f, lower = , upper = Inf)$value)

PetRd> would be adequate, too, which makes evaluation a little more expensive :-(

No, that's not the point of Duncan's that I agreed with.
The argument is different:

consider   Int(f, x, x^2)
           Int(f, x, 2*x)
           Int(f, x, exp(x))
etc.
These could conceivably give very different values,
with different limits for  x --> Inf

Hence,     Int(f, Inf, Inf)

is mathematically undefined, hence NaN
Martin
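
The argument can be made concrete with an integrand for which the answer depends on how the two limits go to infinity. This is a sketch with f(t) = 1/t, chosen here for illustration and not taken from the thread: the integral from x to 2*x is log(2) for every x, while the integral from x to x^2 is log(x) and keeps growing.

```r
f <- function(t) 1 / t           # illustrative integrand, not from the thread
xs <- c(10, 100, 1000)

# Upper limit 2*x: the integral is log(2) for every x.
to_2x <- sapply(xs, function(x) integrate(f, x, 2 * x)$value)

# Upper limit x^2: the integral is log(x), which grows without bound.
to_x2 <- sapply(xs, function(x) integrate(f, x, x^2)$value)

print(rbind(x = xs, to_2x = to_2x, to_x2 = to_x2))
```

Both families of limits tend to (Inf, Inf), yet the values do not agree, which is the sense in which Int(f, Inf, Inf) has no single value.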

PetRd> If, otoh

PetRd> integrate(f, lower = , upper = Inf)

PetRd> throws an error, I agree, there should be a NaN ...
PetRd> Best, Peter



[Rd] update() problem (was: saving objects with embedded environments)

2007-07-03 Thread Duncan Murdoch
I don't want this thread to be lost, so I've changed the subject heading.

I think there's a problem here, but the problem is that update() is 
failing, not that the environment is being unnecessarily saved.  The 
problem here is that update() has no method specific to lm objects, so 
it doesn't know about the saved environment.  I don't know what other 
kinds of objects update.default can handle properly, but if they don't 
all have terms, there should probably be a specific update.lm method 
that knows to look there.

Now it's not completely obvious what update() should do in general, e.g. 
when adding terms.  If a user asks to add a term Foo to a model, they 
may mean the Foo that is in their workspace, not the Foo that happens to 
be in the data frame but not previously included in the model.  However, 
I think in the case of an lm object produced in a function, the updating 
should be done in the environment of the function.

Duncan Murdoch
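
As an illustration of "updating in the environment of the function", the sketch below builds the updated call without evaluating it and then evaluates it in the environment stored with the terms. The helper name fit_in_function and the data are made up for the example, and this is only one way such an update.lm method might behave, not an existing method.

```r
fit_in_function <- function(dat) lm(y ~ x, data = dat)   # hypothetical helper

B <- data.frame(y = 1:20, x = rep(c(0, 1), 10))
m <- fit_in_function(B)
rm(B)

# The terms remember the environment of the function call ...
env <- attr(m$terms, ".Environment")
exists("dat", envir = env, inherits = FALSE)    # TRUE

# ... so the updated call can be built first and evaluated there.
new_call <- update(m, . ~ 1, evaluate = FALSE)
m2 <- eval(new_call, envir = env)
coef(m2)                                        # intercept only
```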

On 7/2/2007 9:27 AM, McGehee, Robert wrote:
> Thanks for this. So at the risk of treading out too deep into unfamiliar
> water, one concern is that if I run 'lm' within a function and then the
> function exits, am I still (perhaps unnecessarily) keeping a copy of the
> function environment and the associated data? It does not seem that
> 'update' even works after I exit the function, so it's not clear to me
> what help saving the environment and a copy of all of its data is (see
> example below).
> 
>> B <- data.frame(y=1:100, x=rnorm(100))
>> FUN <- function(B) lm(y ~ x, data=B)
>> m <- FUN(B)
>> rm(B)
> ## update doesn't find object 'B' in function environment (so why store
> ## the environment?)
>> update(m, y ~ 1)  
> Error in inherits(x, "data.frame") : object "B" not found
> 
> ## However, there is a copy of object 'B' saved anyway, even
> ## after removing it from the global environment and exiting the
> ## function
>> dim(get("B", envir=attr(m$terms, ".Environment")))
> [1] 100   2
> 
> For my purposes all works well now. I brought this up only as one can
> quickly run out of memory if data is unnecessarily kept around
> with large models. Anecdotally, before isolating this issue I would
> crash my snow/MPI session (followed by R) when trying to transfer these
> 'lm' and 'lm'-like objects with embedded environments. If I did not
> distribute the processing, then I found that I would rather quickly use
> up all 24GB of my computer's memory and swap space after repeated calls.
> 
> Thanks,
> Robert
> 
> -Original Message-
> From: Roger Peng [mailto:[EMAIL PROTECTED] 
> Sent: Friday, June 29, 2007 7:44 PM
> To: McGehee, Robert
> Cc: R-devel
> Subject: Re: [Rd] saving objects with embedded environments
> 
> I believe this is intentional.  See ?serialize.  When lm() is called
> in a function, the environment is saved in case the resulting fitted
> model object needs to be updated, for example, with update().
> 
> If you don't want the linear model object, you might try just saving
> the relevant objects to a separate list rather than try to delete
> everything that is irrelevant from the 'lm' object.
> 
> -roger
> 
> On 6/28/07, McGehee, Robert <[EMAIL PROTECTED]> wrote:
>> Hello,
>> I have been running linear regressions on large data sets. As 'lm' saves
>> a great deal of extraneous (for me) data including the residuals,
>> fitted.values, model frame, etc., I generally set these to NULL within
>> the object before saving off the model to a file.
>>
>> In the below example, however, I have found that depending on whether or
>> not I run 'lm' within another function, the entire function
>> environment is saved off with the file. So, even while object.size and
>> all.equal report that both 'lm's are equal and of small size, one saves
>> as a 24MB file and the other as 646 bytes. This seems to be because in
>> the first example the function environment is saved in attr(x1$terms,
>> ".Environment") and takes up all 24MB of space.
>>
>> Anyway, I think this is a bug, or if nothing else very undesirable (that
>> an object reported to be 0.5kb takes up 24MB). There also seems to be
>> some inconsistency on how environments are saved depending on if it is
>> the global environment or not, though I'm not familiar enough with
>> environments to know if this was intentional. Comments are appreciated.
>>
>> Thanks,
>> Robert
>>
>> ##
>> testEq <- function(B) {
>>     x <- lm(y ~ x1+x2+x3, data=B, model=FALSE)
>>     x$residuals <- x$effects <- x$fitted.values <- x$qr$qr <- NULL
>>     x
>> }
>>
>> N <- 90
>> B <- data.frame(y=rnorm(N)+1:N, x1=rnorm(N)+1:N, x2=rnorm(N)+1:N,
>> x3=rnorm(N)+1:N)
>> x1 <- testEq(B)
>> x2 <- lm(y ~ x1+x2+x3, data=B, model=FALSE)
>> x2$residuals <- x2$effects <- x2$fitted.values <- x2$qr$qr <- NULL
>>
>> all.equal(x1, x2) ## TRUE
>> object.size(x1)  ## 5112
>> object.size(x2)  ## 5112
>> save(x1, file="x1.RData")
>> save(x2, file="x2.RData")
>> file.

Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Duncan Murdoch
On 7/3/2007 11:55 AM, Martin Maechler wrote:
>> "PetRd" == Peter Ruckdeschel <[EMAIL PROTECTED]>
>> on Tue, 03 Jul 2007 17:26:43 +0200 writes:
> 
> PetRd> Thanks Martin and Duncan for your
> PetRd> comments,
> 
> PetRd> Martin Maechler wrote:
> >>> "DM" == Duncan Murdoch <[EMAIL PROTECTED]>
> >>> on Mon, 02 Jul 2007 21:56:23 -0400 writes:
> >> 
> DM> On 28/06/2007 5:05 PM, Peter Ruckdeschel wrote:
> >> >> Hi,
> >> >> 
> >> >> I noticed a minor flaw in integrate() from package stats:
> >> >> 
> >> >> Taking up arguments lower and upper from integrate(),
> >> >> 
> >> >> if (lower ==  Inf) && (upper ==  Inf)
> >> >> 
> >> >> or
> >> >> 
> >> >> if (lower == -Inf) && (upper == -Inf)
> >> >> 
> >> >> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
> >> >> 
> >> >> Rather, it should return 0.
> >> 
> DM> Wouldn't it be better to return NA or NaN, for the same reason Inf/Inf
> DM> doesn't return 1?
> >> 
> DM> Duncan Murdoch
> >> 
> >> Yes indeed, I think it should return NaN.
> 
> PetRd> not quite convinced --- or more precisely:
> 
> PetRd> [ Let's assume lower = upper = Inf here,
> PetRd> case lower = upper = -Inf is analogue ]
> 
> PetRd> I'd say it depends on whether the (Lebesgue-) integral
> 
> PetRd> integral(f, lower = , upper = Inf)
> 
> PetRd> is well defined. Then, by dominated convergence, the integral should
> PetRd> default to 0.
> 
> PetRd> But I admit that then a test
> 
> PetRd> is.finite(integrate(f, lower = , upper = Inf)$value)
> 
> PetRd> would be adequate, too, which makes evaluation a little more expensive :-(
> 
> No, that's not the Duncan's point I agreed on.
> The argument is different:
> 
> consider   Int(f, x, x^2)
>  Int(f, x, 2*x)
>  Int(f, x, exp(x))
> etc, 
> These could conceivably give very different values,
> with different limits for  x --> Inf
> 
> Hence,   Int(f, Inf, Inf)
> 
> is mathematically undefined, hence NaN

In the case Peter was talking about, those limits would all be zero. 
But I don't think we could hope for the integrate() function in R to 
recognize integrability.  For example,

 > integrate(function(x) 1/x, 1e8, Inf)
1.396208e-05 with absolute error < 2.6e-05

where the correct answer is Inf, since the integral is divergent.

So I'd be fairly strongly opposed to returning 0.  Whether we return NaN 
or NA is harder:  I suspect the reason Inf-Inf or Inf/Inf is NaN is 
because this is handled on most platforms by the floating point hardware 
or the C run-time, rather than because we've made a deliberate decision 
for that.  Do we have other cases where NaN is used to mean "unable to 
determine the answer"?

Duncan Murdoch
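
For reference, the way R distinguishes the two candidates can be checked at the prompt. A small sketch using base R only; the variable names are illustrative.

```r
diff_inf  <- Inf - Inf         # NaN, from floating point arithmetic
ratio_inf <- Inf / Inf         # NaN

nan_is_na  <- is.na(NaN)       # TRUE: is.na() also reports NaN
na_is_nan  <- is.nan(NA_real_) # FALSE: NA is not NaN
both_nan   <- is.nan(diff_inf) && is.nan(ratio_inf)   # TRUE
```

So code that tests is.na() on the result would catch either choice, while is.nan() would catch only NaN.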

> Martin
> 
> PetRd> If, otoh
> 
> PetRd> integrate(f, lower = , upper = Inf)
> 
> PetRd> throws an error, I agree, there should be a NaN ...
> PetRd> Best, Peter
> 


Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Prof Brian Ripley
I think throwing an error is a better solution: this is rather unlikely to 
be deliberate and returning NaN might postpone the detection too long.
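
What throwing an error could look like from the user's side is sketched below as a wrapper. The function name checked_integrate and the wording of the message are made up for illustration and are not the change made in R itself.

```r
# Hypothetical wrapper around stats::integrate(); the name and the
# error message are illustrative only.
checked_integrate <- function(f, lower, upper, ...) {
  if (is.na(lower) || is.na(upper))
    stop("a limit is missing")
  if (is.infinite(lower) && lower == upper)
    stop("both limits are the same infinity; the integral is undefined")
  integrate(f, lower, upper, ...)
}

ok  <- checked_integrate(dnorm, -Inf, Inf)$value          # close to 1
err <- tryCatch(checked_integrate(dnorm, Inf, Inf),
                error = function(e) conditionMessage(e))
```

The error surfaces at the call site, rather than as a NaN that propagates through later arithmetic.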


On Tue, 3 Jul 2007, Duncan Murdoch wrote:

> On 7/3/2007 11:55 AM, Martin Maechler wrote:
>>> "PetRd" == Peter Ruckdeschel <[EMAIL PROTECTED]>
>>> on Tue, 03 Jul 2007 17:26:43 +0200 writes:
>>
>> PetRd> Thanks Martin and Duncan for your
>> PetRd> comments,
>>
>> PetRd> Martin Maechler wrote:
>>>>> "DM" == Duncan Murdoch <[EMAIL PROTECTED]>
>>>>> on Mon, 02 Jul 2007 21:56:23 -0400 writes:
>>>>
>> DM> On 28/06/2007 5:05 PM, Peter Ruckdeschel wrote:
>> Hi,
>>
>> I noticed a minor flaw in integrate() from package stats:
>>
>> Taking up arguments lower and upper from integrate(),
>>
>> if (lower ==  Inf) && (upper ==  Inf)
>>
>> or
>>
>> if (lower == -Inf) && (upper == -Inf)
>>
>> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
>>
>> Rather, it should return 0.
>>>>
>> DM> Wouldn't it be better to return NA or NaN, for the same reason Inf/Inf
>> DM> doesn't return 1?
>>>>
>> DM> Duncan Murdoch
>>>>
>>>> Yes indeed, I think it should return NaN.
>>
>> PetRd> not quite convinced --- or more precisely:
>>
>> PetRd> [ Let's assume lower = upper = Inf here,
>> PetRd> case lower = upper = -Inf is analogue ]
>>
>> PetRd> I'd say it depends on whether the (Lebesgue-) integral
>>
>> PetRd> integral(f, lower = , upper = Inf)
>>
>> PetRd> is well defined. Then, by dominated convergence, the integral should
>> PetRd> default to 0.
>>
>> PetRd> But I admit that then a test
>>
>> PetRd> is.finite(integrate(f, lower = , upper = Inf)$value)
>>
>> PetRd> would be adequate, too, which makes evaluation a little more expensive :-(
>>
>> No, that's not the Duncan's point I agreed on.
>> The argument is different:
>>
>> consider   Int(f, x, x^2)
>> Int(f, x, 2*x)
>> Int(f, x, exp(x))
>> etc,
>> These could conceivably give very different values,
>> with different limits for  x --> Inf
>>
>> Hence,  Int(f, Inf, Inf)
>>
>> is mathematically undefined, hence NaN
>
> In the case Peter was talking about, those limits would all be zero.
> But I don't think we could hope for the integrate() function in R to
> recognize integrability.  For example,
>
> > integrate(function(x) 1/x, 1e8, Inf)
> 1.396208e-05 with absolute error < 2.6e-05
>
> where the correct answer is Inf, since the integral is divergent.
>
> So I'd be fairly strongly opposed to returning 0.  Whether we return NaN
> or NA is harder:  I suspect the reason Inf-Inf or Inf/Inf is NaN is
> because this is handled on most platforms by the floating point hardware
> or the C run-time, rather than because we've made a deliberate decision
> for that.  Do we have other cases where NaN is used to mean "unable to
> determine the answer"?
>
> Duncan Murdoch
>
>> Martin
>>
>> PetRd> If, otoh
>>
>> PetRd> integrate(f, lower = , upper = Inf)
>>
>> PetRd> throws an error, I agree, there should be a NaN ...
>> PetRd> Best, Peter
>>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] minor flaw in integrate()

2007-07-03 Thread Gabor Grothendieck
If integrate() is changed, it would be nice to make it an S3 generic at the
same time.  deriv() is already an S3 generic but, strangely, integrate() is
not.  Ryacas provides a deriv method, but because integrate() is not generic
it has to provide a separate Integrate function instead, which is inconsistent.
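
A rough sketch of what such a generic could look like, for illustration
only.  The class "poly" and its method are invented for this example; the
only real function used is stats::integrate(), which the default method
simply calls.

```r
# Hypothetical sketch: mask stats::integrate() with an S3 generic.
integrate <- function(f, ...) UseMethod("integrate")

# Default method: ordinary functions fall through to stats::integrate().
integrate.default <- function(f, ...) stats::integrate(f, ...)

# Illustrative method for a made-up class "poly" that stores polynomial
# coefficients in increasing order of power; it integrates exactly.
integrate.poly <- function(f, lower, upper, ...) {
    coefs <- unclass(f)
    k <- seq_along(coefs)                     # power + 1 for each term
    anti <- function(x) sum(coefs / k * x^k)  # antiderivative, constant 0
    anti(upper) - anti(lower)
}

p <- structure(c(1, 2, 3), class = "poly")    # 1 + 2*x + 3*x^2
integrate(p, 0, 1)                            # exact integral over [0, 1]
integrate(dnorm, -Inf, Inf)                   # numerical, via the default
```

With a generic in place, a package such as Ryacas could register its own
method under the name integrate rather than exporting a differently
capitalised function.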

On 6/28/07, Peter Ruckdeschel <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I noticed a minor flaw in integrate() from package stats:
>
> Taking up arguments lower and upper from integrate(),
>
>   if (lower ==  Inf) && (upper ==  Inf)
>
>   or
>
>   if (lower == -Inf) && (upper == -Inf)
>
> integrate() calculates the value for (lower==-Inf) && (upper==Inf).
>
> Rather, it should return 0.
>
> Quick fix:
>
> ### old code ###
> ### [snip]
>     else {
>         if (is.na(lower) || is.na(upper))
>             stop("a limit is missing")
>         if (is.finite(lower)) {
>             inf <- 1
>             bound <- lower
>         }
>         else if (is.finite(upper)) {
>             inf <- -1
>             bound <- upper
>         }
>         else {
>             inf <- 2
>             bound <- 0
>         }
>         wk <- .External("call_dqagi", ff, rho = environment(),
>             as.double(bound), as.integer(inf), as.double(abs.tol),
>             as.double(rel.tol), limit = limit, PACKAGE = "base")
>     }
> ### [snip]
>
> ### new code  to replace the old one ###
>
> ### [snip]
>     else {
>         if (is.na(lower) || is.na(upper))
>             stop("a limit is missing")
>
>         if (lower == upper){
>
>             wk <- list("value" = 0, "abs.error" = 0,
>                 "subdivisions" = subdivisions,
>                 "ierr" = 0 )
>
>         } else {
>             if (is.finite(lower)) {
>                 inf <- 1
>                 bound <- lower
>             }
>             else if (is.finite(upper)) {
>                 inf <- -1
>                 bound <- upper
>             }
>             else {
>                 inf <- 2
>                 bound <- 0
>             }
>             wk <- .External("call_dqagi", ff, rho = environment(),
>                 as.double(bound), as.integer(inf),
>                 as.double(abs.tol), as.double(rel.tol),
>                 limit = limit, PACKAGE = "base")
>
>         }
>     }
> ### [snip]
>
> Best, Peter
>
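
For comparison with the quick fix quoted above, here is a minimal
user-level sketch of the alternative argued for elsewhere in this thread:
return 0 only when the coincident limits are finite, and NaN when both
limits are the same infinity.  The name integrate_checked is made up for
this example, and this is not what stats::integrate() does.

```r
# Hypothetical wrapper around stats::integrate(); returns only the value.
integrate_checked <- function(f, lower, upper, ...) {
    if (is.na(lower) || is.na(upper))
        stop("a limit is missing")
    if (lower == upper) {
        # Finite coincident limits: the integral is exactly 0.
        # Coincident infinite limits: treated as undefined, hence NaN.
        return(if (is.finite(lower)) 0 else NaN)
    }
    stats::integrate(f, lower, upper, ...)$value
}

integrate_checked(dnorm, 1, 1)        # finite, equal limits
integrate_checked(dnorm, Inf, Inf)    # both limits +Inf
integrate_checked(dnorm, -Inf, Inf)   # ordinary case, passed through
```

Whether the undefined case should be NaN or NA is the open question
raised in the quoted discussion; NaN is used here only as one of the two
candidates.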



[Rd] reinforce library to re-load

2007-07-03 Thread Weiwei Shi
Hi,

I am wondering whether library() has a parameter that forces a package
to be reloaded. That would help when testing a package you have just
modified yourself. Otherwise, my workaround is to restart Rgui.

(From reading ?library, I understand this option is not implemented.)
"...Both functions check and update the list of currently loaded
packages and do not reload a package which is already loaded.
(Furthermore, if the package has a name space and a name space of that
name is already loaded, they work from the existing names space rather
than reloading from the file system.)"

Thanks.

-- 
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III



Re: [Rd] reinforce library to re-load

2007-07-03 Thread Prof Brian Ripley
Please don't post to multiple lists: I am replying only to R-devel.

You should detach your package, and if it has a namespace unload it, 
before attempting to reload it.  Something like

detach("package:foo")
library(foo)

or

unloadNamespace("foo")  # this also detaches the package
library(foo)

If the package has a DLL, this will in general not reload that.  Now in 
quite a few cases you cannot successfully unload a DLL, but 
library.dynam.unload is provided if you want to do this (including in your 
package's .Last.lib or .onUnload hooks).
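
The two steps can be wrapped in a small helper.  This is only a sketch:
reload_pkg is a made-up name, not something provided by R, and as noted
above it does not in general reload a DLL.  It will also fail if another
loaded namespace imports the package, because unloadNamespace() refuses
to unload in that case.

```r
# Hypothetical helper (not part of R): unload a package, then attach it again.
reload_pkg <- function(pkg) {
    if (pkg %in% loadedNamespaces()) {
        # Also detaches "package:<pkg>" if it is on the search path.
        unloadNamespace(pkg)
    }
    library(pkg, character.only = TRUE)
    # Report whether the package is attached again.
    invisible(paste0("package:", pkg) %in% search())
}

# "splines" ships with R and is used here purely as a stand-in for the
# package under development.
reload_pkg("splines")
```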

On Tue, 3 Jul 2007, Weiwei Shi wrote:

> Hi,
>
> I am wondering whether library() has a parameter that forces a package
> to be reloaded. That would help when testing a package you have just
> modified yourself. Otherwise, my workaround is to restart Rgui.
>
> (From reading ?library, I understand this option is not implemented.)
> "...Both functions check and update the list of currently loaded
> packages and do not reload a package which is already loaded.
> (Furthermore, if the package has a name space and a name space of that
> name is already loaded, they work from the existing names space rather
> than reloading from the file system.)"
>
> Thanks.
>
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



[Rd] editing pasted text from parameter list through rcompletion crashes R in Windows (PR#9775)

2007-07-03 Thread mwtoews
Full_Name: Michael Toews
Version: R 2.5.1
OS: WinXP; SP2
Submission from: (NULL) (142.58.206.114)


To reproduce this crash:
1. Start a new R session normally in Windows
2. Type (an example command): "boxplot(" without pressing enter
3. Copy this text: "c(1,2,6,4,7,3)"
4. Bring Rgui.exe back in focus, and hit "Tab" twice to activate the parameter
list (you should see "x=  ...=  range=  width=" etc.)
5. Paste the text, then press the backspace key, *crash*

There are several other variations that crash R similarly, such as pressing
the left-arrow key to edit the pasted text while rcompletion is showing the
parameter list. The present behaviour of rcompletion is to remove the
parameter list from the console once typing has started; however, the list is
not removed if text is pasted, which appears to crash R if the cursor then
moves backwards (buffer problems?).

Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status = 
 major = 2
 minor = 5.1
 year = 2007
 month = 06
 day = 27
 svn rev = 42083
 language = R
 version.string = R version 2.5.1 (2007-06-27)

Windows XP (build 2600) Service Pack 2.0

Locale:
LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils,
package:datasets, package:methods, Autoloads, package:base
