from:"Roger D. Peng"

Re: [Rd] Citation for R

2005-06-13 Thread Roger D. Peng

It's not clear to me why a journal publication is necessary
(although I guess it couldn't hurt).  What do you cite when you
use SAS?  Or Stata?

-roger

Gordon K Smyth wrote:
> This is just a note that R would get a lot more citations if
> the recommended citation was an article in a recognised
> journal or from a recognised publisher.
> 
> I use R in work leading to publications often, and I strongly
> want to give the R core team credit for their work.  However I
> find that I can't persuade my biological collaborators to
> include the current R citation (below) in their reference
> lists, because it is not an article in a recognised journal
> nor from a recognised publisher.  I can cite the 1996 paper by
> Ihaka and Gentleman, and sometimes this what I do, but I'd
> really like to give credit to the other R core members as
> well, for example the CRAN people and those involved in the
> Windows version.
> 
> I know this is more work for the R team, like everything else,
> but an article on the story of R since the creation of the
> core team would be really nice to see.
> 
> 
>> citation()
> 
> 
> To cite R in publications use:
> 
> R Development Core Team (2005). R: A language and environment
> for statistical computing. R Foundation for Statistical
> Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
> http://www.R-project.org.
> 
> 
> Gordon
> 
> __ 
> R-devel@r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] capabilities() and non-catchable messages

2005-06-20 Thread Roger D. Peng

Would using 'capture.output()' work for you in this case?

-roger

Henrik Bengtsson wrote:
> Just for the record (not a request for fix) and an ad hoc workaround if 
> anyone needs it:
> 
> REASON:
> Running an R script as a plugin on a remote Suse Linux 8.2 with R v2.1.0 
> (2005-04-18), I have noticed that capabilities() generates (to standard 
> error)
> 
>Xlib: connection to "base:0.0" refused by server
>Xlib: Client is not authorized to connect to Server
> 
> which cannot be caught by tryCatch();
> 
>tryCatch({
>  print(capabilities());
>}, condition=function(c) {
>  cat("Condition caught:\n");
>  str(c);
>})
> 
> because it is not a 'condition' (error or warning).
> 
> CONTEXT:
> Since source() calls capabilities("iconv") this messages always show up. 
> My R plugin loads custom code using source() and since the standard 
> error from the plugin is checked for messages, the host system 
> interprets this as if something problematic has occured.
> 
> WORKAROUND:
> The workaround that I use now is to redefine capabilities() temporarily 
> (since I do not need "iconv" support):
> 
>   orgCapabilities <- base::capabilities;
>   basePos <- which(search() == "package:base"));
>   assign("capabilities", function(...) FALSE, pos=basePos);
> 
>   source()
> 
>   basePos <- which(search() == "package:base"));
>   assign("capabilities", orgCapabilities, pos=basePos);
>   rm(orgCapabilities)
> 
> Cheers
> 
> Henrik
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] object.size() bug?

2005-08-05 Thread Roger D. Peng

Would it make sense for 'object.size()' to do the same thing for 
external pointers as it does for environments?

-roger

Martin Maechler wrote:
>>>>>>"Paul" == Paul Roebuck <[EMAIL PROTECTED]>
>>>>>>on Thu, 4 Aug 2005 00:29:03 -0500 (CDT) writes:
> 
> 
> Paul> Can someone confirm the following as a problem:
> 
> Yes, I can.  No promiss for a fix in the very near future
> though.
> 
> Martin Maechler, ETH Zurich
> 
> 
>>>Can someone confirm the following as a problem:
>>>
>>>R> setClass("Foo", representation(.handle = "externalptr"))
>>>R> object.size(new("Foo"))
>>>Error in object.size(new("Foo")) : object.size: unknown type 22
>>>R> R.version.string
>>>[1] "R version 2.1.1, 2005-06-20"
>>>
>>>R-2.1.1/src/include/Rinternals.h
>>>#define EXTPTRSXP   22/* external pointer */
>>>
>>>R-2.1.1/src/main/size.c:
>>>objectsize(SEXP s) has no case for external pointers
> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] lattice and for loop

2005-09-02 Thread Roger D. Peng

See 
http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

-roger

Charles Geyer wrote:
> - Forwarded message from Sandy Weisberg <[EMAIL PROTECTED]> -
> 
> OK, here is my R bug:
> 
> library(lattice)
> x <- rnorm(20)
> y <- rnorm(20)
> z <-rep(c(1,2),10)
> xyplot(y~x|z)
> # the above works fine.  Now try this:
> 
> for (j in 1:1) {xyplot(y~x|z)}
> 
> # no graph is produced.
> 

-- 
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Shy Suggestion?

2005-09-20 Thread Roger D. Peng

I think this needs to fail because packages listed in 'Suggests:' may, for 
example, be needed in the examples.  How can 'R CMD check' run the examples and 
verify that they are executable if those packages are not available?  I suppose 
you could put the examples in a \dontrun{}.

-roger

Jari Oksanen wrote:
> The R-exts manual says about 'Suggests' field in package DESCRIPTION:
> 
> "The optional `Suggests' field uses the same syntax as `Depends' and
> lists packages that are not necessarily needed."
> 
> However, this seems to be a suggestion you cannot refuse. If you suggest
> packages:
> 
> (a line from DESCRIPTION):
> Suggests: MASS, ellipse, rgl, mgcv, akima, lattice
> 
> This is what happens:
> 
> $ /tmp/R-alpha/bin/R CMD check vegan
> * checking for working latex ... OK
> * using log directory '/home/jarioksa/devel/R/vegan.Rcheck'
> * using R version 2.2.0, 2005-09-19
> * checking for file 'vegan/DESCRIPTION' ... OK
> * this is package 'vegan' version '1.7-75'
> ... clip ...
> * checking package dependencies ... ERROR
> Packages required but not available:
>   ellipse rgl akima
> 
> In my cultural context suggesting a package means that it is not
> necessarily needed and the check should not fail, although some
> functionality would be unavailable without those packages.  I want the
> package to pass the tests in a clean standard environment without
> forcing anybody to load any extra packages. Is there a possibility to be
> modest and shy in suggestions so that it would be up to the user to get
> those extra packages needed without requiring them in R CMD check?
> 
> I stumbled on this with earlier versions of R, and then my solution was
> to suggest nothing. 
> 
> cheers, jari oksanen

-- 
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Shy Suggestion?

2005-09-20 Thread Roger D. Peng

I think the reason is that the standard for 'R CMD check' is that examples in 
help pages are guaranteed to be executable by the user (as long as the 
requirements are met).  There is no way to guarantee this without having the 
packages installed.  So strictly speaking, the 'Suggested' packages are not 
needed by the *user*, but are needed by the *maintainer*.

Perhaps, you differ with the standard itself, but I personally think it's a 
good 
one.

-roger

Jari Oksanen wrote:
> On Tue, 2005-09-20 at 09:42 -0400, Roger D. Peng wrote:
> 
>>I think this needs to fail because packages listed in 'Suggests:' may, for 
>>example, be needed in the examples.  How can 'R CMD check' run the examples 
>>and 
>>verify that they are executable if those packages are not available?  I 
>>suppose 
>>you could put the examples in a \dontrun{}.
>>
> 
> Yes, that's what I do, and exactly for that reason: if something is not
> necessarily needed (= 'suggestion' in this culture), it should not be
> required in tests. However, if I don't use \dontrun{} for a
> non-recommended package, the check would fail and I would get the needed
> information: so why should the check fail already when checking
> DESCRIPTION?
> 
> cheers, jari oksanen
> 

-- 
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] random output with sub(fixed = TRUE)

2005-12-21 Thread Roger D. Peng

I've noticed what I think is curious behavior in using 'sub(fixed = TRUE)' and 
was wondering if my expectation is incorrect.  Here is one example:

v <- paste(0:10, "asdf", sep = ".")
sub(".asdf", "", v, fixed = TRUE)

The results I get are

 > sub(".asdf", "", v, fixed = TRUE)
  [1] "0"   "1\0st\0\0"   "2\0\001\0\0" "3\0\001\0\0"
  [5] "4\0mes\0""5\0\001\0\0" "6\0\0\0\0\0" "7\0\0\0m\0"
  [9] "8\0\0\0t\0"  "9\0\0\0\0"   "10\0\0\0\0\0"
 >

I expected "0" in the first entry and everything else would be unchanged.  Your 
results may vary since every time I run 'sub()' in this way, I get a slightly 
different answer in entires 2 through 11.

As it turns out, 'gsub(fixed = TRUE)' gives me the answer I *actually* wanted, 
which was to replace the string in every entry.  But I still think the behavior 
of 'sub(fixed = TRUE) is a bit odd.

 > version
  _
platform x86_64-unknown-linux-gnu
arch x86_64
os   linux-gnu
system   x86_64, linux-gnu
status
major2
minor2.1
year 2005
month12
day  20
svn rev  36812
language R
 >

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] random output with sub(fixed = TRUE)

2005-12-21 Thread Roger D. Peng

Well, who am I to break this long-standing ritual? :)

Interestingly, while the printed output looks wrong, I get

 > v <- paste(0:10, "asdf", sep = ".")
 > a <- sub(".asdf", "", v, fixed = TRUE)
 > b <- as.character(0:10)
 > identical(a, b)
[1] TRUE
 >

-roger

Peter Dalgaard wrote:
> "Roger D. Peng" <[EMAIL PROTECTED]> writes:
> 
> 
>>I've noticed what I think is curious behavior in using 'sub(fixed = TRUE)' 
>>and 
>>was wondering if my expectation is incorrect.  Here is one example:
>>
>>v <- paste(0:10, "asdf", sep = ".")
>>sub(".asdf", "", v, fixed = TRUE)
>>
>>The results I get are
>>
>> > sub(".asdf", "", v, fixed = TRUE)
>>  [1] "0"   "1\0st\0\0"   "2\0\001\0\0" "3\0\001\0\0"
>>  [5] "4\0mes\0""5\0\001\0\0" "6\0\0\0\0\0" "7\0\0\0m\0"
>>  [9] "8\0\0\0t\0"  "9\0\0\0\0"   "10\0\0\0\0\0"
>> >
>>
>>I expected "0" in the first entry and everything else would be unchanged.  
>>Your 
>>results may vary since every time I run 'sub()' in this way, I get a slightly 
>>different answer in entires 2 through 11.
>>
>>As it turns out, 'gsub(fixed = TRUE)' gives me the answer I *actually* 
>>wanted, 
>>which was to replace the string in every entry.  But I still think the 
>>behavior 
>>of 'sub(fixed = TRUE) is a bit odd.
>>
>> > version
>>  _
>>platform x86_64-unknown-linux-gnu
>>arch     x86_64
>>os   linux-gnu
>>system   x86_64, linux-gnu
>>status
>>major2
>>minor2.1
>>year 2005
>>month12
>>day  20
>>svn rev  36812
>>language R
>> >
> 
> 
> Argh... 
> 
> year 2005
> month12
> day  21
> 
> and something like this gets discovered. It's a ritual, I tell ya, a ritual!
> 
> If you look at the output and terminate all strings at the embedded
> \0, it looks much more sensible, so it should be fairly easy to spot
> the cause of this bug...
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] checkpointing

2006-01-03 Thread Roger D. Peng

One possibility is to write in some checkpointing into your objective function, 
such as saving the current parameter values via 'save()' or 'dput()'.

-roger

Ross Boylan wrote:
> I would like to checkpoint some of my calculations in R, specifically
> those using optim.  As far as I can tell, R doesn't have this facility,
> and there seems to have been little discussion of it.
> 
> checkpointing is saving enough of the current state so that work can
> resume where things were left off if, to take my own example, the system
> crashes after 8 days of calculation.
> 
> My thought is that this could be added as an option to optim as one of
> the control parameters.
> 
> I thought I'd check here to see if anyone is aware of any work in this
> area or has any thoughts about how to proceed.  In particular, is save a
> reasonable way to save a few variables to disk?  I could also make the
> code available when/if I get it working.

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R on the brain

2006-01-30 Thread Roger D. Peng

Well shared! :)  Maybe better yet,

"Before I begin this talk, I'd like to 'attach("nano")'".

-roger

Ben Bolker wrote:
>I was sitting in the coffee room at work listening to people complain
> about a recent seminar about nanotechnology using the terms 
> nanofluidics, nanofactory, nano-this, and nano-that ... I found myself 
> thinking "well the speaker should just
> have said
>with(nano,
>   ...)
> 
>Un(?)fortunately there's no-one here I can share that thought with.
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Reproducible Research task view

2010-08-20 Thread Roger D. Peng

For what it's worth, I think this is a great idea.

-roger

On Fri, Aug 20, 2010 at 3:26 PM, Max Kuhn  wrote:
> I would like to suggest a Reproducible Research CRAN task view. This
> was discussed briefly in one of the useR! sessions this year.
>
> >From quick browse through CRAN, I counted 19 packages that were
> associated with Sweave or other methods for reproducible research. I
> think that we've gotten to a point where some additional documentation
> that enumerates/compares/contrasts the various packages would be
> helpful.
>
> I'd like to volunteer to create and maintain the task view.
>
> Any thoughts?
>
> Thanks,
>
> Max
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] NULL names for POSIXlt objects

2011-07-15 Thread Roger D. Peng

I noticed in current R-patched (and R-devel) the following behavior:

> names(as.POSIXlt("2003-01-01"))
NULL

which I believe previously listed the names of the different elements
(e.g. 'sec', 'mday', 'year', etc.).

It seems to be related to r54188. I see the code here is is wrapped
with  but I'm not sure it was meant to change the previous
behavior.

-roger

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] "What Calls What" diagram. Flow Chart?

2011-10-17 Thread Roger D. Peng

Paul, the 'cacher' package might have what you need. The 'graphcode'
and 'objectcode' functions might be useful. Granted, this package is
not really designed for visualizing code per se, but you can use the
functionality nonetheless.

-roger

On Sun, Oct 9, 2011 at 3:49 PM, Paul Johnson  wrote:
> I don't know the right computer science words for this question, I'm
> afraid. Apology in advance.
>
> How do you find your way around in somebody else's code?  If the user
> runs a specific command, and wants to know how the data is managed
> until the result is returned, what to do ?
>
> I've tried this manually with tools like mtrace and browser. This is a
> bit frustrating because the browser does not stay in effect until the
> work is done. It quits at the end of the function.  So you have to
> attach the browser to the functions that are called there, and so
> forth.  But that doesn't quite put everything together.
>
> Example.  Recently I was trying to find out where the package lavaan's
> calculations for the function cfa are actually done and it was a
> maddening chase from one function to the next, as data was
> re-organized and options were filled in. lavaan's "cfa" function
> reformats some options, then the work gets done by an eval.
>
> cfa> fit <- cfa(HS.model, data=HolzingerSwineford1939)
> debugging in: cfa(HS.model, data = HolzingerSwineford1939)
> debug: {
>    mc <- match.call()
>    mc$model.type = as.character(mc[[1L]])
>    if (length(mc$model.type) == 3L)
>        mc$model.type <- mc$model.type[3L]
>    mc$int.ov.free = TRUE
>    mc$int.lv.free = FALSE
>    mc$auto.fix.first = !std.lv
>    mc$auto.fix.single = TRUE
>    mc$auto.var = TRUE
>    mc$auto.cov.lv.x = TRUE
>    mc$auto.cov.y = TRUE
>    mc[[1L]] <- as.name("lavaan")
>    eval(mc, parent.frame())
> }
>
> The value of "mc" that gets executed by eval is this
>
> Browse[2]> mc
> lavaan(model.syntax = HS.model, data = HolzingerSwineford1939,
>    model.type = "cfa", int.ov.free = TRUE, int.lv.free = FALSE,
>    auto.fix.first = TRUE, auto.fix.single = TRUE, auto.var = TRUE,
>    auto.cov.lv.x = TRUE, auto.cov.y = TRUE)
>
> So then I need to but a debug on "lavaan" and step through that, see
> what it does.
>
> Is there a way to make a list of the functions that are called "pop
> out", possibly with a flow chart?
>
> Consider lm,  I want to know
>
> lm -> lm.fit ->  .Fortran("dqrls")
>
> I'm not asking for a conceptual UML diagram, so far as I know.
>
> The kind of trace information you get with gdb in C programs and
> shallow steps with "n" would probably help. I would not need to keep
> attaching more functions with debug.
>
> pj
>
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] On RObjectTables

2012-07-24 Thread Roger D. Peng

I was thinking that the RObjectTables appeared before the active
binding stuff was introduced. On quick examination, it seems the
active binding mechanism might be a more stable way to go. That's how
it was done in filehash.

-roger

On Tue, Jul 24, 2012 at 9:15 AM,   wrote:
> In my original reply I wrote
>
>
>>> The
>>> facility in its current form does complicate the internal code and
>>> limit some experiments we might otherwise do, so I would not be
>>> surprised if it was at least substantially changed in the next year or
>>> two.
>
>
> This is obviously _not_ the time to invest effort in documenting or
> expanding this facility. If you want to use it, go ahead. But you will
> have to figure things out from what you have and be prepared for
> changes under your feet.
>
> If active bindings can do what you want that may be a safer route to
> consider.
>
> luke
>
>
> On Tue, 24 Jul 2012, Jeroen Ooms wrote:
>
>> Maybe it is worth considering to document this functionality a bit
>> more, or expose some wrappers in R? It's a bit obscure right now,
>> which seems both dangerous in terms of maintenance and a missed
>> opportunity (especially if people are already building on it).
>>
>>
>>
>>
>>
>> On Tue, Jul 24, 2012 at 2:06 AM, Michael Lawrence
>>  wrote:
>>>
>>>
>>> Luke,
>>>
>>> Please keep me advised on this, because the Qt interfaces heavily rely on
>>> the ObjectTables (btw, it has worked great for my use cases).
>>>
>>> Michael
>>>
>>>
>>> On Fri, Jul 20, 2012 at 7:32 AM,  wrote:
>>>>
>>>>
>>>> I believe everyone who has worked on the relevant files has tried to
>>>> maintain this functionality, but as it seems to get used and tested
>>>> very little I can't be sure it is functional at this point. The
>>>> facility in its current form does complicate the internal code and
>>>> limit some experiments we might otherwise do, so I would not be
>>>> surprised if it was at least substantially changed in the next year or
>>>> two.
>>>>
>>>> Best,
>>>>
>>>> luke
>>>>
>>>>
>>>> On Thu, 19 Jul 2012, Jeroen Ooms wrote:
>>>>
>>>>> I was wondering if anyone knows more about the state of RObjectTables.
>>>>> This
>>>>> largely undocumented functionality was introduced by Duncan around 2002
>>>>> somewhere and enables you create an environment where the contents are
>>>>> dynamically queried by R through a hook function. It is mentioned in R
>>>>> Internals and ?attach. This functionality is quite powerful and allows
>>>>> you
>>>>> to e.g. offload a big database of R objects to disk, yet use them as if
>>>>> they were in your workspace. The recent RProtoBuf package also uses
>>>>> some of
>>>>> this functionality to dynamically lookup proto definitions.
>>>>>
>>>>> I would like to do something similar, but I am not sure if support for
>>>>> this
>>>>> functionality will be or has been discontinued. The RObjectTables
>>>>> package
>>>>> is no longer available on OmegaHat and nothing has not been mentioned
>>>>> on
>>>>> the mailing lists for about 5 years. I found an old version of the
>>>>> package
>>>>> no github which seems to work, but as far as I understand, the package
>>>>> still needs the hooks from within R to work. So if this functionality
>>>>> is
>>>>> actually unsupported and might be removed at some point, I should
>>>>> probably
>>>>> not invest in it.
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> __
>>>>> R-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>
>>>> --
>>>> Luke Tierney
>>>> Chair, Statistics and Actuarial Science
>>>> Ralph E. Wareham Professor of Mathematical Sciences
>>>> University of Iowa  Phone: 319-335-3386
>>>> Department of Statistics andFax:   319-335-3017
>>>>Actuarial Science
>>>> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
>>>> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>>>>
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>>
>>
>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
>Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Roger D. Peng

Extracting rows from data frames is tricky, since each of the columns could be 
of a different class.  For your toy example, it seems a matrix would be a more 
reasonable option.

R-devel has some improvements to row extraction, if I remember correctly.  You 
might want to try your example there.

-roger

Herve Pages wrote:
> Hi,
> 
> 
> I have a big data frame:
> 
>   > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
>   > dat <- as.data.frame(mat)
> 
> and I need to do some computation on each row. Currently I'm doing this:
> 
>   > for (key in row.names(dat)) { row <- dat[key, ]; ... do some computation 
> on row... }
> 
> which could probably considered a very natural (and R'ish) way of doing it
> (but maybe I'm wrong and the real idiom for doing this is something 
> different).
> 
> The problem with this "idiomatic form" is that it is _very_ slow. The loop
> itself + the simple extraction of the rows (no computation on the rows) takes
> 10 hours on a powerful server (quad core Linux with 8G of RAM)!
> 
> Looping over the first 100 rows takes 12 seconds:
> 
>   > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
>  user  system elapsed
>12.637   0.120  12.756
> 
> But if, instead of the above, I do this:
> 
>   > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }
> 
> then it's 20 times faster!!
> 
>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col) col[i]) })
>  user  system elapsed
> 0.576   0.096   0.673
> 
> I hope you will agree that this second form is much less natural.
> 
> So I was wondering why the "idiomatic form" is so slow? Shouldn't the 
> idiomatic
> form be, not only elegant and easy to read, but also efficient?
> 
> 
> Thanks,
> H.
> 
> 
>> sessionInfo()
> R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
> x86_64-unknown-linux-gnu
> 
> locale:
> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> [7] "base"
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] tilde expansion with install.packages

2007-04-05 Thread Roger D. Peng

I've noticed recently that 'update.packages' and 'install.packages' seem to not 
do tilde expansion anymore, i.e. when I run


update.packages("~/R-local/lib")

on R-alpha (r41043) I get the message

/home/rpeng/install/R-alpha/lib64/R/bin/INSTALL: line 304: cd: ~/R-local/lib: No 
such file or directory


and the package is subsequently installed in the current working directory.

The attached patch solves the problem for me.

-roger
--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
Index: src/scripts/INSTALL.in
===
--- src/scripts/INSTALL.in  (revision 41043)
+++ src/scripts/INSTALL.in  (working copy)
@@ -294,6 +294,7 @@
   shift
 done
 
+lib=`tilde_expand "${lib}"`
 if test -z "${lib}"; then
   lib=`echo "cat('~~~', .libPaths()[1], sep = '')" | \
 R_DEFAULT_PACKAGES=NULL "${R_EXE}" --no-save --slave | \
@@ -332,7 +333,6 @@
   exit 1
 fi
 
-lib=`tilde_expand "${lib}"`
 if (test -d "${lib}" && test -w "${lib}") || \
 ${MKDIR_P} "${lib}" 2> /dev/null; then
   lib=`cd "${lib}" && ${GETWD}`
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Imports/exports of S4 methods

2007-05-03 Thread Roger D. Peng

I have a question about what to do in the following situation (please bear with 
the setup):

Package A defines an S4 generic 'foo' and as well as S4 methods for 'foo' and 
has

exportMethods("foo")

in its NAMESPACE file.

Package B defines another method for 'foo' for class "bar" and has

importFrom(A, "foo")
exportMethods("foo")
exportClasses("bar")

in its NAMESPACE file.  Should Package B also have Package A in the 'Depends:' 
field of the DESCRIPTION file or is it correct to import Package A only?

Finally, Package C has a single exported function named 'myfunc' which needs to 
use the method for 'foo' defined in Package B, so its NAMESPACE file has

importFrom(A, "foo")
importMethodsFrom(B, "foo")
importClassesFrom(B, "bar")
export("myfunc")

Is this the correct thing to do?

The error I get under this setup is that 'myfunc' cannot find the method for 
'foo' defined in Package B when 'myfunc' calls 'foo' on an object of class 
"bar".

If you've made it this far I'm already grateful!  Any help with this would be 
appreciated.

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] isOpen on closed connections

2007-11-14 Thread Roger D. Peng

As far as I can tell, 'isOpen' cannot return FALSE in the case when 'rw = ""'. 
If the connection has already been closed by 'close' or some other function, 
then isOpen will produce an error.  The problem is that when isOpen calls 
'getConnection', the connection cannot be found and 'getConnection' produces an 
error.  The check to see if it is open is never actually done.

This came up in some code where I'm trying to clean up connections after 
successfully opening them.  The problem is that if I try to close a connection 
that has already been closed, I get an error (because 'getConnection' cannot 
find it).  But then there's no way for me to find out if a connection has 
already been closed.  Perhaps there's another approach I should be taking?  The 
context is basically,

con <- file("foo", "w")

tryCatch({
## Do stuff that might fail
writeLines(stuff, con)
close(con)

file.copy("foo", "bar")
}, finally = {
close(con)
})

So the problem is that if the block in the 'tryCatch' succeeds, the 'finally' 
will produce an error.

I'm not exactly sure of what I'd want since it seems modifying 'getConnection' 
would not be a great idea as it is used elsewhere.

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] isOpen on closed connections

2007-11-15 Thread Roger D. Peng

Upon further consideration, I realized there is a philosophical element 
here---if a connection is closed and hence does not exist, is it open?

The practical issue for me is that when you do something like

close(con)

the 'con' object is still lying around and is essentially undefined.  For 
example, if I do

close(con)
con <- "hello"

then it seems logical to me that 'isOpen' would return an error.  But it feels 
natural to me that a sequence like

close(con)
isOpen(con)  ## FALSE?

would not lead to an error.  Perhaps my expectations are not reasonable and I'd 
appreciate being corrected.

Given Brian's comment, one solution would be to allowing closing but not 
destroying connections at the R level (maybe via an option?), but that is a 
change in semantics and I'm not sure if this problem really comes up that much.

-roger

Prof Brian Ripley wrote:
> I think the confusion here is over close(): that closes *and destroys* a 
> connection, so it no longer exists.
> 
> isOpen applies to existing connections: you cannot close but not destroy 
> them at R level, but C code can (and does).  You will see it in use in 
> the utils package.
> 
> 
> On Wed, 14 Nov 2007, Seth Falcon wrote:
> 
>> "Roger D. Peng" <[EMAIL PROTECTED]> writes:
>>
>>> As far as I can tell, 'isOpen' cannot return FALSE in the case when 
>>> 'rw = ""'.
>>> If the connection has already been closed by 'close' or some other 
>>> function,
>>> then isOpen will produce an error.  The problem is that when isOpen 
>>> calls
>>> 'getConnection', the connection cannot be found and 'getConnection' 
>>> produces an
>>> error.  The check to see if it is open is never actually done.
>>
>> I see this too with R-devel (r43376) {from Nov 6th}.
>>
>>con = file("example1", "w")
>>isOpen(con)
>>
>>[1] TRUE
>>
>>showConnections()
>>
>>  description class  mode text   isopen   can read can write
>>3 "example1"  "file" "w"  "text" "opened" "no" "yes"
>>
>>close(con)
>>isOpen(con)
>>
>>Error in isOpen(con) : invalid connection
>>
>>## printing also fails
>>con
>>Error in summary.connection(x) : invalid connection
>>
>>> This came up in some code where I'm trying to clean up connections after
>>> successfully opening them.  The problem is that if I try to close a 
>>> connection
>>> that has already been closed, I get an error (because 'getConnection' 
>>> cannot
>>> find it).  But then there's no way for me to find out if a connection 
>>> has
>>> already been closed.  Perhaps there's another approach I should be 
>>> taking?  The
>>> context is basically,
>>>
>>> con <- file("foo", "w")
>>>
>>> tryCatch({
>>> ## Do stuff that might fail
>>> writeLines(stuff, con)
>>> close(con)
>>>
>>> file.copy("foo", "bar")
>>> }, finally = {
>>> close(con)
>>> })
>>
>> This doesn't address isOpen, but why do you have the call to close
>> inside the tryCatch block?  Isn't the idea that finally will always be
>> run and so you can be reasonably sure that close gets called once?
>>
>> If your real world code is more complicated, perhaps you can make use
>> of a work around like:
>>
>> myIsOpen = function(con) tryCatch(isOpen(con), error=function(e) FALSE)
>>
>> You could do similar with myClose and "close" a connection as many
>> times as you'd like :-)
>>
>> + seth
>>
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] New version of X11, png and jpeg

2008-02-26 Thread Roger D. Peng

will produce 
>>>> higher-quality
>>>>  output with more features.
>>>>
>>>>  Pros:
>>>>
>>>>  Antialiasing of text and lines (can be turned off) but no blurring of
>>>>  fills.
>>>>
>>>>  Buffering of the X11 display and fast repainting from a backing image.
>>>>  (The intention is to emulate the timer-based buffering of the windows()
>>>>  device in due course, but not for 2.7.0.)
>>>>
>>>>  Ability to use translucent colours, including backgrounds, and produce
>>>>  partially transparent PNG files.
>>>>
>>>>  Scalable text, including to sizes like 4.5 pt. This allows more accurate
>>>>  sizing on non-standard screen sizes (e.g. my home machine has a 90dpi
>>>>  1650x1024 display whereas standard X11 fonts are set up for 75 or 100
>>>>  dpi).
>>>>
>>>>  Full support for UTF-8, so on systems with suitable fonts you can plot in
>>>>  many languages on a single figure (and this will work even in non-UTF-8
>>>>  locales).  The output should be locale-independent (unlike the current
>>>>  devices where even English text is rendered slightly differently in
>>>>  Latin-1 and UTF-8 locales).
>>>>
>>>>  A utility function savePlot() to make a PNG/JPEG/TIFF copy of the current
>>>>  plot.
>>>>
>>>>  The new png() and jpeg() devices do not require an X server to be 
>>>> running.
>>>>
>>>>  Cons:
>>>>
>>>>  Needs more software installed - cairo, pango and support packages (which
>>>>  on all the systems we have looked at are pulled in by the packages 
>>>> checked
>>>>  for).  You will see something like
>>>>
>>>>Additional capabilities:   PNG, JPEG, iconv, MBCS, NLS, cairo
>>>>^
>>>>  if configure finds the software we are looking for.
>>>>
>>>>  Slower under some circumstances (although on the test systems much faster
>>>>  than packages Cairo and cairoDevice).  This will be particularly true for
>>>>  X11() with a slow connection between the machine running R and the X
>>>>  server.
>>>>
>>>>  The additional software might not work correctly.
>>>>
>>>>
>>>>  The new versions are not currently the default, but can be made so by
>>>>  setting X11.options(type="Cairo"), e.g. as a load hook for package
>>>>  grDevices.  I am using
>>>>
>>>>  setHook(packageEvent("grDevices", "onLoad"),
>>>>  function(...) {
>>>>  grDevices::ps.options(horizontal=FALSE)
>>>>  if(getRversion() >= '2.7.0') grDevices::X11.options(type="Cairo")
>>>>  })
>>>>
>>>>
>>>>  Please try these out and let us know how you get on.  As a check, try the
>>>>  TestChars() examples in ?points - on one Solaris 10 system a few of the
>>>>  symbol font characters were incorrect.  It worked on an FC5 system with
>>>>
>>>>  auk% pkg-config --modversion pango
>>>>  1.12.4
>>>>  auk% pkg-config --modversion cairo
>>>>  1.0.4
>>>>
>>>>  so the versions required are not all recent.
>>>>
>>>>  Although these devices would in principle work on Mac OS X, neither cairo
>>>>  nor pango is readily available.  We are working on other versions for
>>>>  Mac OS (X11 based on cairo/freetype, png/jpeg based on Quartz).
>>>>
>>>>  There are also new svg() and tiff() devices.
>>>>
>>>>  --
>>>>  Brian D. Ripley,  [EMAIL PROTECTED]
>>>>  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>>>>  University of Oxford, Tel:  +44 1865 272861 (self)
>>>>  1 South Parks Road, +44 1865 272866 (PA)
>>>>  Oxford OX1 3TG, UKFax:  +44 1865 272595
>>>>
>>>>  __
>>>>  R-devel@r-project.org mailing list
>>>>  https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>>
>>> --
>>> Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
>>>
>> -- 
>> Brian D. Ripley,  [EMAIL PROTECTED]
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] looking for R_approx in 2.6

2008-02-28 Thread Roger D. Peng

Looks like it was removed in r42551 and from the comment it appears it was not 
part of the R API anyway.

-roger

Peter Kharchenko wrote:
> Hi there. I was wondering what happened to R_approx from 
> R_ext/Applic.h   ... it seems to have dissapeared in 2.6.x, and I can't 
> seem to find it simply listed in some other header file.
> thanks,
> -peter.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] cut.Date and cut.POSIXt problem

2008-03-18 Thread Roger D. Peng

Seems it did work as advertised in R 2.6.0 but is still broken in R-devel.  
Will 
take a look.

-roger

Gabor Grothendieck wrote:
> cut.Date and cut.POSIXt indicate that the breaks argument
> can be an integer followed by a space followed by "year", etc.
> but it seems the integer is ignored.
> 
> For example, I assume that breaks = "3 months" is supposed
> to cut it into quarters but, in fact, it cuts it into months as if
> 3 had not been there.
> 
>> d <- seq(Sys.Date(), length = 12, by = "month")
>> cut(d, "3 months")
>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
> 2009-02-01
>> cut(as.POSIXct(d), "3 months")
>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
> 2009-02-01
>> cut(as.POSIXlt(d), "3 months")
>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
> 2009-02-01
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] cut.Date and cut.POSIXt problem

2008-03-18 Thread Roger D. Peng

Seems changes in r44116 force the interval to be single months (or years) 
instead of whatever the user specified.  I think the attached patches correct this.


Interestingly, 'cut' and 'seq' allow for the 'breaks' specification to be 
something like "3 months" but the documentation for 'hist' does not allow for 
this type of specification.


-roger

Gabor Grothendieck wrote:

cut.Date and cut.POSIXt indicate that the breaks argument
can be an integer followed by a space followed by "year", etc.
but it seems the integer is ignored.

For example, I assume that breaks = "3 months" is supposed
to cut it into quarters but, in fact, it cuts it into months as if
3 had not been there.


d <- seq(Sys.Date(), length = 12, by = "month")
cut(d, "3 months")

 [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
2009-02-01

cut(as.POSIXct(d), "3 months")

 [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
2009-02-01

cut(as.POSIXlt(d), "3 months")

 [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
2009-02-01

______
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
diff --git a/src/library/base/R/dates.R b/src/library/base/R/dates.R
index 9496b1d..e69f35c 100644
--- a/src/library/base/R/dates.R
+++ b/src/library/base/R/dates.R
@@ -324,7 +324,7 @@ cut.Date <-
 end <- as.POSIXlt(max(x, na.rm = TRUE))
 end <- as.POSIXlt(end + (31 * 86400))
 end$mday <- 1
-breaks <- as.Date(seq(start, end, "months"))
+breaks <- as.Date(seq(start, end, breaks))
 } else if(valid == 4) {
 start$mon <- 0
 start$mday <- 1
@@ -332,7 +332,7 @@ cut.Date <-
 end <- as.POSIXlt(end + (366 * 86400))
 end$mon <- 0
 end$mday <- 1
-breaks <- as.Date(seq(start, end, "years"))
+breaks <- as.Date(seq(start, end, breaks))
 } else {
 start <- .Internal(POSIXlt2Date(start))
 if (length(by2) == 2) incr <- incr * as.integer(by2[1])
diff --git a/src/library/base/R/datetime.R b/src/library/base/R/datetime.R
index 95e513f..67cbea2 100644
--- a/src/library/base/R/datetime.R
+++ b/src/library/base/R/datetime.R
@@ -729,7 +729,7 @@ cut.POSIXt <-
 end <- as.POSIXlt(max(x, na.rm = TRUE))
 end <- as.POSIXlt(end + (31 * 86400))
 end$mday <- 1
-breaks <- seq(start, end, "months")
+breaks <- seq(start, end, breaks)
 } else if(valid == 7) {
 start$mon <- 0
 start$mday <- 1
@@ -737,7 +737,7 @@ cut.POSIXt <-
 end <- as.POSIXlt(end + (366 * 86400))
 end$mon <- 0
 end$mday <- 1
-breaks <- seq(start, end, "years")
+breaks <- seq(start, end, breaks)
 } else {
 if (length(by2) == 2) incr <- incr * as.integer(by2[1])
maxx <- max(x, na.rm = TRUE)
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] cut.Date and cut.POSIXt problem

2008-03-26 Thread Roger D. Peng

I have applied these patches to R-devel and in my limited testing they appear 
to 
work as desired.  I have to say that I never ran into the problem these patches 
were meant to solve so I may not be the best person to do the testing.

-roger

Marc Schwartz wrote:
> Hi all,
> 
> Apologies for the delay in my engaging in this thread. I was traveling 
> this week.
> 
> The problem that Gabor raised was caused by the patch that I submitted 
> to fix a problem with the referenced functions when using 'months' and 
> 'years' as the interval. The prior versions were problematic:
> 
>   https://stat.ethz.ch/pipermail/r-devel/2008-January/048004.html
> 
> The patch fixed the error, but since I used hist.Date() as the reference 
> model and did not note the subtle difference in cut.Date() relative to 
> specifying the breaks increment value, this functionality was lost when 
> the same modification was made to the code in cut.Date().
> 
> Roger's patch helps, but does not totally remedy the situation. One also 
> needs to modify the method used for specifying the max value 'end' for 
> the breaks in order to include the max 'x' Date value in the result.
> 
> Hence, I am attaching proposed patches against R-devel for 
> base:::dates.R and base:::datetime.R.
> 
> I am also attaching a patch for tests:::reg-tests-1.R to add a check for 
> this situation to the regression tests that were also added subsequent 
> to that prior set of patches that I had submitted.
> 
> If perhaps Roger and Gabor could so some testing on these patches before 
> they are considered for inclusion into the R-devel tree, it would be 
> helpful to check to see if I have missed something else here.
> 
> Thanks for raising this issue.
> 
> Regards,
> 
> Marc Schwartz
> 
> Roger D. Peng wrote:
>> Seems changes in r44116 force the interval to be single months (or 
>> years) instead of whatever the user specified.  I think the attached 
>> patches correct this.
>>
>> Interestingly, 'cut' and 'seq' allow for the 'breaks' specification to 
>> be something like "3 months" but the documentation for 'hist' does not 
>> allow for this type of specification.
>>
>> -roger
>>
>> Gabor Grothendieck wrote:
>>> cut.Date and cut.POSIXt indicate that the breaks argument
>>> can be an integer followed by a space followed by "year", etc.
>>> but it seems the integer is ignored.
>>>
>>> For example, I assume that breaks = "3 months" is supposed
>>> to cut it into quarters but, in fact, it cuts it into months as if
>>> 3 had not been there.
>>>
>>>> d <- seq(Sys.Date(), length = 12, by = "month")
>>>> cut(d, "3 months")
>>>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
>>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
>>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
>>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
>>> 2009-02-01
>>>> cut(as.POSIXct(d), "3 months")
>>>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
>>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
>>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
>>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
>>> 2009-02-01
>>>> cut(as.POSIXlt(d), "3 months")
>>>  [1] 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
>>> 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01 2009-02-01
>>> Levels: 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01
>>> 2008-08-01 2008-09-01 2008-10-01 2008-11-01 2008-12-01 2009-01-01
>>> 2009-02-01
>>>
> 
> 
> 
> --- datesORIG.R   2008-03-20 14:25:13.0 -0500
> +++ dates.R   2008-03-20 14:38:21.0 -0500
> @@ -322,17 +322,19 @@
>   if(valid == 3) {
>  start$mday <- 1
>  end <- as.POSIXlt(max(x, na.rm = TRUE))
> -end <- as.POSIXlt(end + (31 * 86400))
> +step <- ifelse(length(by2) == 2, as.integer(by2[1]), 1)
> +end <- as.POSIXlt(end + (31 * step * 86400))
>  end$mday <- 1
> -breaks <- as.Date(seq(start, end, "months"))
> +breaks <- as.Date(seq(start, end, breaks))
>  } else if(valid == 4) {
>  start$mon <- 0
>  start$mday <- 1
>

[Rd] suppressing 'download.file' progress window

2008-03-26 Thread Roger D. Peng

In Windows, 'download.file' pops up a separate window indicating the progress 
of 
the download (even when 'quiet = TRUE').  This is useful and informative when 
downloading a single (large) file.  However, when downloading multiple 
(smaller) 
files in sucession, the constant flashing of the progress window can be a 
little 
disorienting (at least for me).

Is there a way to suppress this progress window?  Or is it an operating system 
feature?

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] suppressing 'download.file' progress window

2008-03-26 Thread Roger D. Peng

Thanks!  The patch to src/modules/internet/internet.c in r44937 was just what I 
was looking for.

-roger

Prof Brian Ripley wrote:
> On Wed, 26 Mar 2008, Roger D. Peng wrote:
> 
>> In Windows, 'download.file' pops up a separate window indicating the 
>> progress of the download (even when 'quiet = TRUE').  This is useful 
>> and informative when downloading a single (large) file.  However, when 
>> downloading multiple (smaller) files in sucession, the constant 
>> flashing of the progress window can be a little disorienting (at least 
>> for me).
>>
>> Is there a way to suppress this progress window?  Or is it an 
>> operating system
>> feature?
> 
> The write up was somewhat confusing.  'quiet=TRUE' suppresses the status 
> messages on all platforms, and the (text) progress bar.  Only in R-devel 
> does it suppress the progress bar widget on Windows.
> 
> That said, the main time I download many files is via update.packages() 
> or install.packages(), and there was no provision to set quiet=TRUE from 
> those: I've just added ... to allow that.
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Extractor function for standard deviation.

2008-04-04 Thread Roger D. Peng

I'm not sure the idea of having a generic 'sd' extract the residual standard 
deviation from a linear model quite jives with me.  Personally, I think a 
generic function with name like 'residSD' or something similar would be better.

Either way, I think it would be nice to have some function like this.

-roger

Rolf Turner wrote:
> I have from time to time seen inquiries on r-help in respect of
> how to obtain the estimated standard deviation from the output of
> fitting a linear model.  And have had occasion to want to do this
> myself.
> 
> The way I currently do it is something like summary(fit)$sigma
> (where fit is returned by lm()).
> 
> It strikes me that it might be a good idea to have an extractor
> function, analogous with coef() and resid(), to dig out this value.
> This would be in keeping with the R philosophy of ``don't muck
> about with the internal components of objects returned by functions,
> because the internal structure of such objects might change in future
> releases''.
> 
> I'm not sure what the name of the extractor function should be.
> One idea would be to make sd() generic, and create a function
> sd.lm() which could currently have code
> 
>   sd.lm <- function(x,...) {
>   summary(x)$sigma
>   }
> 
> The sd.default() function would have the code of the current sd().
> 
> Does this idea have any merit?
> 
>   cheers,
> 
>   Rolf Turner
> 
> ##
> Attention:\ This e-mail message is privileged and confid...{{dropped:9}}
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R CMD check should check date in description

2008-04-04 Thread Roger D. Peng

I don't think having 'R CMD check' spit out a warning about the date would be 
all that productive.  I do think it would be nice to have 'R CMD build' add a 
Date: field to the DESCRIPTION file if there isn't already a Date: field.  And 
I 
think such an addition would solve the first problem since maintainers wouldn't 
have to bother maintaining a date field.

-roger

Kurt Hornik wrote:
>>>>>> hadley wickham writes:
> 
>> I'm always forgetting to update the date in DESCRIPTION.  Would it be
>> possible to add a warning to R CMD check if it's old?
> 
> I recently thought about this.  I see several issues.
> 
> * How can we determine if it is "old"?  Relative to the time when the
>   package was uploaded to a repository?
> 
> * Some developers might actually want a different date for a variety of
>   reasons ...
> 
> * What we currently say in R-exts is
> 
>  The optional `Date' field gives the release date of the current
>  version of the package.  It is strongly recommended to use the
>  -mm-dd format conforming to the ISO standard.
> 
>   Many packages do not comply with the latter (but I have some code to
>   sanitize most of these), and "release date" may be a moving target.
> 
> The best that I could think of is to teach R CMD build to *add* a Date
> field if there was none.
> 
> Best
> -k
> 
> 
> 
>> Hadley
> 
>> -- 
>> http://had.co.nz/
> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R CMD check should check date in description

2008-04-04 Thread Roger D. Peng

Actually, now that I think about it, 'R CMD build' already adds the 'Packaged:' 
field, so perhaps it wouldn't really make sense to add yet another field with 
exactly the same information

-roger

Robert Gentleman wrote:
> 
> Kurt Hornik wrote:
>>>>>>> hadley wickham writes:
>>>> I recently thought about this.  I see several issues.
>>>>
>>>> * How can we determine if it is "old"?  Relative to the time when the
>>>> package was uploaded to a repository?
>>>>
>>>> * Some developers might actually want a different date for a variety of
>>>> reasons ...
>>>>
>>>> * What we currently say in R-exts is
>>>>
>>>> The optional `Date' field gives the release date of the current
>>>> version of the package.  It is strongly recommended to use the
>>>> -mm-dd format conforming to the ISO standard.
>>>>
>>>> Many packages do not comply with the latter (but I have some code to
>>>> sanitize most of these), and "release date" may be a moving target.
>>>>
>>>> The best that I could think of is to teach R CMD build to *add* a Date
>>>> field if there was none.
>>> That sounds like a good solution to me.
>> Ok.  However, 2.7.0 feature freeze soon ...
> 
>Please no.  If people want one then they should add it manually. It 
> is optional, and some of us have explicitly opted out and would like to 
> continue to do so.
> 
> 
>>> Otherwise, maybe just a message from R CMD check?  i.e. just like
>>> failing the codetools checks, it might be perfectly ok, but you should
>>> be doing it consciously, not by mistake.
>> I am working on that, too (e.g. a simple NOTE in case the date spec
>> cannot be canonicalized, etc.).  If file time stamps were realiable, we
>> could compare these to the given date.  This is I guess all we can do
>> for e.g. CRAN's daily checking (where comparing to the date the check
>> is run is not too useful) ...
> 
>But definitely not a warning.
> 
>Robert
> 
>> Best
>> -k
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] X11 image problem in R-2.8.0 Under development / R-2.7

2008-04-07 Thread Roger D. Peng

R> for now things are tuned for the most common case of a
>>BDR> local X11 display (although type = "nbcairo" works
>>BDR> quite well over my home wireless network).
>>
>> Sorry to chime in;  I had wanted to bring this to Brian's
>> attention a few days ago, but always got side-tracked:
>>
>> Here are results on my  IBM/Lenowo X41 notebook
>> model name   : Intel(R) Pentium(R) M processor 1.50GHz
>> cpu MHz  : 600.000
>> bogomips : 1198.37
>> MemTotal:  1546600 kB
>>
>> with *no* graphics acceleration:
>> Everything is local but not really fast hardware
>>
>>> F <- ecdf(rnorm(1))
>>> system.time(plot(F))
>>user  system elapsed
>>   3.204   0.024   3.402
>>> x11(type="Xlib")
>>> system.time(plot(F))
>>user  system elapsed
>>   0.068   0.000   0.354
>> which is somewhat dramatic,
>> and for tk*() pseudo-animations, I have definitely needed to use
>> type = "Xlib"
>>
>> Martin
>>
>>
>>>> R version 2.8.0 Under development (unstable) (2008-04-05
>>>> r45102)
>>>>
>>>> Thank you again for tracking down the original issue.
>>>>
>>>> Martin
>>>>
>>>> Prof Brian Ripley <[EMAIL PROTECTED]> writes:
>>>>
>>>>> I think I have found this -- if so, it was an X11 timing
>>>>> issue and we needed to re-read the X11 window size at a
>>>>> later time.  Please try r45102 or later.
>>>>>
>>>>> On Fri, 4 Apr 2008, Prof Brian Ripley wrote:
>>>>>
>>>>>> On Thu, 3 Apr 2008, Martin Morgan wrote:
>>>>>>
>>>>>>> I apologize if this is too obscure to reproduce, or
>>>>>>> some idiosyncratic aspects of my system. If I create a
>>>>>>> plot, e.g.,
>>>>>>> plot(1:10) I get a graphics device as expected. I then
>>>>>>> click on the 'zoom' box on my X11 window, so the
>>>>>>> window expands to occupy the entire screen. The plot
>>>>>>> is redrawn at the scale of the large window, but is
>>>>>>> clipped to the 'unzoomed' size. I only see the top
>>>>>>> left portion of the plot, occupying the space of the
>>>>>>> original image.  Here are the R essentials; I'm using
>>>>>>> X11 on a recent SuSE, connecting via a moderately
>>>>>>> out-of-date cygwin from Windows. I'm happy to provide
>>>>>>> more detail if pointed in the right direction (and
>>>>>>> will trouble shoot myself if this is not a general
>>>>>>> problem).
>>>>>>
>>>>>> We've seen it, but not all systems do it.  At present
>>>>>> it looks like a cairo bug, but more work is needed on
>>>>>> it.  If we haven't found a workaround by release time,
>>>>>> it will be documented on the help page.
>>>>>>
>>>>>>> sessionInfo()
>>>>>>> R version 2.8.0 Under development (unstable)
>>>>>>> (2008-04-03 r45066) x86_64-unknown-linux-gnu locale:
>>>>>>> 
>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>>>> attached base packages: [1] stats graphics grDevices
>>>>>>> utils datasets methods base
>>>>>>> capabilities() jpeg png tcltk X11 aqua http/ftp sockets
>>>>>>> libxml TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE fifo
>>>>>>> cledit iconv NLS profmem cairo TRUE TRUE TRUE TRUE
>>>>>>> TRUE TRUE Martin
>>>>>>> --
>>>>>>> Martin Morgan Computational Biology / Fred Hutchinson
>>>>>>> Cancer Research Center 1100 Fairview Ave. N.  PO Box
>>>>>>> 19024 Seattle, WA 98109 Location: Arnold Building M2
>>>>>>> B169 Phone: (206) 667-2793
>>>>>>> __
>>>>>>> R-devel@r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>>
>>>>>>
>>>>> --
>>>>> Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied
>>>>>> Statistics, http://www.stats.ox.ac.uk/~ripley/
>>>>>> University of Oxford, Tel: +44 1865 272861 (self) 1
>>>>>> South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG,
>>>>>> UK Fax: +44 1865 272595
>>>>>>
>>>>>
>>>> --
>>>> Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied
>>>>> Statistics, http://www.stats.ox.ac.uk/~ripley/
>>>>> University of Oxford, Tel: +44 1865 272861 (self) 1
>>>>> South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG,
>>>>> UK Fax: +44 1865 272595
>>>>
>>>> --
>>>> Martin Morgan Computational Biology / Fred Hutchinson
>>>> Cancer Research Center 1100 Fairview Ave. N.  PO Box
>>>> 19024 Seattle, WA 98109
>>>>
>>>> Location: Arnold Building M2 B169 Phone: (206) 667-2793
>>>>
>>
>> --
>> Brian D. Ripley,  [EMAIL PROTECTED]
>>BDR> Professor of Applied Statistics,
>>BDR> http://www.stats.ox.ac.uk/~ripley/ University of
>>BDR> Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road,
>>BDR> +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865
>>BDR> 272595
>>
>>BDR> __
>>BDR> R-devel@r-project.org mailing list
>>BDR> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] C versions of serialize/unserialize in packages

2008-07-31 Thread Roger D. Peng

Are the functions 'R_Unserialize' and 'R_InitFileInPStream' allowed to
be used in R packages?  I guess I'm just not clear on the implications
of this comment in 'Rinternals.h':

/* The connection interface is not yet available to packages.  To
   allow limited use of connection pointers this defines the opaque
   pointer type. */

I have a function in the 'filehash' package that unserializes a bunch
of objects from a file and it seems to run much faster in C than in R.
 But I don't want to release something that uses a non-public
function/interface.

Thanks,
-roger
-- 
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] C versions of serialize/unserialize in packages

2008-07-31 Thread Roger D. Peng

Hmm...I don't think so.  The function I'm talking about loops over a
file many times and does a lot of 'seeks'.  I think just implementing
the loop in C makes it faster.  And besides, I see the speedup on
Linux, which doesn't have the problem you mention.

-roger

On Thu, Jul 31, 2008 at 10:53 AM, Henrik Bengtsson <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On Thu, Jul 31, 2008 at 6:35 AM, Roger D. Peng <[EMAIL PROTECTED]> wrote:
>> Are the functions 'R_Unserialize' and 'R_InitFileInPStream' allowed to
>> be used in R packages?  I guess I'm just not clear on the implications
>> of this comment in 'Rinternals.h':
>>
>> /* The connection interface is not yet available to packages.  To
>>   allow limited use of connection pointers this defines the opaque
>>   pointer type. */
>>
>> I have a function in the 'filehash' package that unserializes a bunch
>> of objects from a file and it seems to run much faster in C than in R.
>>  But I don't want to release something that uses a non-public
>> function/interface.
>
> You say "much faster".  Could this be related to what I recently observed:
>
> July 24, 2008 thread '[Rd] serialize() to via temporary file is heaps
> faster than doing it directly (on Windows)' [
> https://stat.ethz.ch/pipermail/r-devel/2008-July/050256.html ].
>
> My $0.02
>
> /Henrik
>
>
>>
>> Thanks,
>> -roger
>> --
>> Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/
>>
>> ______
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Roger D. Peng | http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R in a sandbox/jail

2008-12-08 Thread Roger D. Peng

I've not tried an automated system like you describe but I have tried to make 
test scripts available that compare the output of their programs to correct 
output.  That way the students can check their progress as they work.  In 
general, I find this approach doesn't work well for grading because in order to 
specify a homework to the point where the students' output matches your output, 
the homework ends up being 10 pages long (not unlike a legal document).


For example, in a number of instances I'd say "the function should return a 
number..." and then I'd compare the returned result to the actual answer.  But a 
student would 'print' the number instead, not knowing the difference between 
'return+autoprint' and 'print'.  I found it difficult to write a test script 
that gives partial credit for getting the answer right but the concept wrong. 
Perhaps that's a problem with the way I grade things, but I found it difficult 
to write an acceptable test script.


-roger

Barry Rowlingson wrote:

Someone recently suggested building a system for automatically testing
student's R programs. They would upload them to our Virtual Learning
Environment, which would then run the code on some inputs and see if
it got the right output. If it does, the student scores points for
that course.

My first thought was "you want to run unchecked, student-submitted
code on a server that has access to students' grades?".

Can this be done securely? The idea might be to run R in a
chroot-jail, freshly generated for each run. The jail would not be
able to access anything outside of it, and once the R session has
finished the calling process can pick up the output from within the
jail.

Maybe that's overkill. Perhaps if you run the user's code as an
ordinary user and store the answers/results in a directory only root
can read that would work (given no local root exploits). Other
precautions could include limiting the runtime or cputime for the R
session. It might be necessary to limit network access too.

 Anyone done anything like this? Personally I think there are too many
other problems with automated systems like this, particularly that
just because a program produces the correct output that makes it a
good one. Sure, at the production stage that's a requirement, but I'd
rather students learnt to program well than to program correctly -
since correctness follows goodness but goodness does not follow
correctness. But that's an argument for another day!

Barry

______
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
.



--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] patch for 'merge' docs

2009-03-08 Thread Roger D. Peng

I've never quite understood the documentation for the 'all' argument
to 'merge'. I'm pretty sure using 'all = L' doesn't work but I'm open
to correction here. In any event, I've attached a patch.

-roger

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] patch for 'merge' docs

2009-03-09 Thread Roger D. Peng

Hmm, I see what you mean, and I'd be willing to accept that logic if I
could find a single other instance in the R documentation where that
shorthand was used. But I suppose this might be the only instance
where such a shorthand is necessary.

-roger

On Sun, Mar 8, 2009 at 9:30 PM, Kasper Daniel Hansen
 wrote:
> Roger
>
> (I think) L is shorthand for some logical value, ie. TRUE or FALSE. That has
> always been pretty clear to me. Your patch was stripped.
>
> Kasper
>
> On Mar 8, 2009, at 18:20 , Roger D. Peng wrote:
>
>> I've never quite understood the documentation for the 'all' argument
>> to 'merge'. I'm pretty sure using 'all = L' doesn't work but I'm open
>> to correction here. In any event, I've attached a patch.
>>
>> -roger
>>
>> --
>> Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] patch for 'merge' docs

2009-03-09 Thread Roger D. Peng

My patch was not particularly great. I think the Peter's alternative
makes (more) sense.

-roger

On Mon, Mar 9, 2009 at 8:47 AM, Peter Dalgaard  wrote:
> Roger D. Peng wrote:
>> Hmm, I see what you mean, and I'd be willing to accept that logic if I
>> could find a single other instance in the R documentation where that
>> shorthand was used. But I suppose this might be the only instance
>> where such a shorthand is necessary.
>
> Could you repeat what the patch was (presumably your mailer claimed that
> it was non-text)?
>
> The text could probably improved if it confuses the reader. Most of what
> it says is actually implied by the argument defaults (all.x=all,
> all.y=all), and it is not perfectly logical anyway (all sets the
> _default_ for all.x and all.y, but you can still set e.g. all=T, all.y=F).
>
> I think we could simplify the text to something like
>
>     all: logical; Provides a convenient way to set both 'all.x' and
>             'all.y' (defined below).
>
>> -roger
>>
>> On Sun, Mar 8, 2009 at 9:30 PM, Kasper Daniel Hansen
>>  wrote:
>>> Roger
>>>
>>> (I think) L is shorthand for some logical value, ie. TRUE or FALSE. That has
>>> always been pretty clear to me. Your patch was stripped.
>>>
>>> Kasper
>>>
>>> On Mar 8, 2009, at 18:20 , Roger D. Peng wrote:
>>>
>>>> I've never quite understood the documentation for the 'all' argument
>>>> to 'merge'. I'm pretty sure using 'all = L' doesn't work but I'm open
>>>> to correction here. In any event, I've attached a patch.
>>>>
>>>> -roger
>>>>
>>>> --
>>>> Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>
>>
>
>
> --
>   O__   Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Using and 'eval' and environments with active bindings

2009-03-15 Thread Roger D. Peng

The following code produces an error in current R-devel

f <- function(value) {
if(!missing(value))
100
else
2
}
e <- new.env()
makeActiveBinding("x", f, e)
eval(substitute(list(x)), e)

The error, after calling 'eval' is

Error in eval(expr, envir, enclos) :
  element 1 is empty;
   the part of the args list of 'list' being evaluated was:
   (x)


It has something to do with the change in R_isMissing in revision
r48118 but I'm not quite knowledgeable enough to understand what the
problem is. In R 2.8.1 the result was simply


> eval(substitute(list(x)), e)
[[1]]
[1] 2

I can't say I know what the output should be but I'd like some
clarification on whether this is a bug.

Thanks,
-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] corruption of data with serialize(ascii=TRUE)

2006-02-08 Thread Roger D. Peng

I noticed the following peculiarity with `serialize()' when `ascii = TRUE' is 
used.  In today's (svn r37299) R-devel, I get

 > set.seed(10)
 > x <- rnorm(10)
 >
 > a <- serialize(x, con = NULL, ascii = TRUE)
 > b <- unserialize(a)
 >
 > identical(x, b)  ## FALSE
[1] FALSE
 > x - b
  [1] -3.469447e-18  2.775558e-17 -4.440892e-16  0.00e+00  5.551115e-17
  [6] -5.551115e-17 -4.440892e-16  0.00e+00  2.220446e-16 -5.551115e-17


I expected `x' and `b' to be identical, which is what I get when `ascii = 
FALSE':

 > a <- serialize(x, con = NULL, ascii = FALSE)
 > b <- unserialize(a)
 >
 > identical(x, b)  ## TRUE
[1] TRUE


The same phenomenon occurs with `.saveRDS(ascii = TRUE)',

 > .saveRDS(x, file = "asdf", ascii = TRUE)
 > d <- .readRDS("asdf")
 >
 > identical(x, d)  ## FALSE
[1] FALSE
 >

Has anyone noticed this before?  I didn't see anything in the docs for 
`serialize()' that would indicate this behavior should be expected.

I'm on Linux Fedora Core 4.

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] corruption of data with serialize(ascii=TRUE)

2006-02-09 Thread Roger D. Peng

Okay, I just wasn't sure of the source of the changes.  In retrospect, 
character 
and other vectors did serialize/unserialize to the original objects.

-roger

Prof Brian Ripley wrote:
> It is known (happens with save() too and did in earlier save formats). 
> Nothing particularly clever is done (the format is "%.16g\n") and 
> similarly as.character/parse are not inverses.
> 
> Perhaps more relevant is
> 
>> b/x -1
>  [1]  0.00e+00 -1.110223e-16  2.220446e-16  0.00e+00  0.00e+00
>  [6]  2.220446e-16  4.440892e-16  0.00e+00  2.220446e-16  0.00e+00
> 
> so the error (on my system) is about what you would expect from 
> floating-point computations.
> 
> There is a comment in serialize.c
> 
> /* 16: full precision; 17 gives 999, 000 &c */
> 
> which suggests that the format is optimized for size not maximal 
> possible accuracy.
> 
> Really all you have said is `floating point operations are subject to 
> rounding error'.
> 
> 
> On Wed, 8 Feb 2006, Roger D. Peng wrote:
> 
>> I noticed the following peculiarity with `serialize()' when `ascii = 
>> TRUE' is
>> used.  In today's (svn r37299) R-devel, I get
>>
>> > set.seed(10)
>> > x <- rnorm(10)
>> >
>> > a <- serialize(x, con = NULL, ascii = TRUE)
>> > b <- unserialize(a)
>> >
>> > identical(x, b)  ## FALSE
>> [1] FALSE
>> > x - b
>>  [1] -3.469447e-18  2.775558e-17 -4.440892e-16  0.00e+00  
>> 5.551115e-17
>>  [6] -5.551115e-17 -4.440892e-16  0.00e+00  2.220446e-16 
>> -5.551115e-17
>>
>>
>> I expected `x' and `b' to be identical, which is what I get when 
>> `ascii = FALSE':
>>
>> > a <- serialize(x, con = NULL, ascii = FALSE)
>> > b <- unserialize(a)
>> >
>> > identical(x, b)  ## TRUE
>> [1] TRUE
>>
>>
>> The same phenomenon occurs with `.saveRDS(ascii = TRUE)',
>>
>> > .saveRDS(x, file = "asdf", ascii = TRUE)
>> > d <- .readRDS("asdf")
>> >
>> > identical(x, d)  ## FALSE
>> [1] FALSE
>> >
>>
>> Has anyone noticed this before?  I didn't see anything in the docs for
>> `serialize()' that would indicate this behavior should be expected.
>>
>> I'm on Linux Fedora Core 4.
>>
>> -roger
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stopifnot() suggestion

2006-03-01 Thread Roger D. Peng

Wouldn't it be better to do something like

stopifnot(all(!is.na(x)), all(!is.na(y)), x, y)

rather than have stopifnot() go checking for NAs?  I agree the message is 
strange but if having non-NA values is really a condition, then why not just 
put 
it in the call to stopifnot()?

-roger

Dan Davison wrote:
> If an expression is passed to stopifnot() which contains missing values, 
> then the resulting error message is somewhat baffling until you are used 
> to it, e.g.
> 
>> x <- y <- rep(TRUE, 10)
>> y[7] <- NA
>> stopifnot(x, y)
> Error in if (!(is.logical(r <- eval(ll[[i]])) && all(r))) 
> stop(paste(deparse(mc[[i +  :
>   missing value where TRUE/FALSE needed
> 
> A minor change to stopifnot() produces the following behaviour:
> 
>> stopifnot(x, y)
> Error in stopifnot(x, y) : y contains missing values
> 
> My attempt at a suitable modification follows, and below that the original 
> function definition. Is a change along these lines appropriate?
> 
> ## Altered version
> 
> stopifnot <- function (...) {
>  n <- length(ll <- list(...))
>  if (n == 0)
>  return(invisible())
>  mc <- match.call()
>  for (i in 1:n) {
>  if(any(is.na(r <- eval(ll[[i]] stop(paste(deparse(mc[[i + 1]])), 
> " contains missing values")
>  if (!(is.logical(r) && all(r)))
>  stop(paste(deparse(mc[[i + 1]]), "is not TRUE"), call. = FALSE)
>  }
> }
> 
> 
> ## from R-2.1.1/src/library/base/R/stop.R
> 
> stopifnot <- function(...)
> {
>  n <- length(ll <- list(...))
>  if(n == 0)
>  return(invisible())
>  mc <- match.call()
>  for(i in 1:n)
>  if(!(is.logical(r <- eval(ll[[i]])) && all(r)))
>  stop(paste(deparse(mc[[i+1]]), "is not TRUE"), call. = FALSE)
> }
> 
> 
> Thanks,
> 
> Dan
> 
> 
>> version
>   _
> platform i386-pc-linux-gnu
> arch i386
> os   linux-gnu
> system   i386, linux-gnu
> status
> major2
> minor2.0
> year     2005
> month10
> day  06
> svn rev  35749
> language R
> 
> --
> Dan Davison
> Committee on Evolutionary Biology
> University of Chicago, U.S.A.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] minor oddity in pdf() help page

2006-03-02 Thread Roger D. Peng

The following paragraph from ?pdf struck me as a bit odd:

  'pdf' writes uncompressed PDF.  It is primarily intended for
  producing PDF graphics for inclusion in other documents, and
  PDF-includers such as 'pdftex' are usually able to handle
  compression.

Should that be "...and PDF-includers such as 'pdftex' are usually _un_able to 
handle compression" ?

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] minor oddity in pdf() help page

2006-03-02 Thread Roger D. Peng

Okay, it might be the early morning hour---when I read it a second time it made 
sense.

-roger

Prof Brian Ripley wrote:
> No, it means what it actually says.
> 
> If you include R's PDF in another application, the latter will usually 
> compress *if you asked the application for compressed PDF*.
> 
> On Thu, 2 Mar 2006, Hin-Tak Leung wrote:
> 
>> Roger D. Peng wrote:
>>> The following paragraph from ?pdf struck me as a bit odd:
>>>
>>>   'pdf' writes uncompressed PDF.  It is primarily intended for
>>>   producing PDF graphics for inclusion in other documents, and
>>>   PDF-includers such as 'pdftex' are usually able to handle
>>>   compression.
>>>
>>> Should that be "...and PDF-includers such as 'pdftex' are usually 
>>> _un_able to
>>> handle compression" ?
>>
>> Hmm, I think the documentation is correct but incomplete - pdftex *can*
>> handle compression, but compression is not implemented in R's pdf
>> output device. So it should say:
>>
>> "... PDF-includers such as 'pdftex' are usually able to handle
>> compression, but R's pdf device does not utilise that feature of pdf."
>>
>> (I have checked a pdf generated by R, and it doesn't compress, and I was
>> using pdflatex this morning to include a compressed pdf, so both
>> parts are correct).
>>
>> There is a caveat: the PDF specs (and the postscript language standard)
>> actually defines a few stream compression schemes - LZW and deflate
>> are two I know of from the top of my head, I think there are more.
>> But LZW used to be tangled up with the Unisys patent until recently
>> when the patent expired, so most open-source softwares won't do
>> it. deflate is implemented in zlib and ghostscript-written pdf
>> usually have stream compression on. i.e. For some purposes such
>> as getting smaller pdf's, it may be better to output from R
>> postscript and use ghostscript to do ps2pdf rather than doing
>> it directly from R, and to be pedantic, pdftex can only handle
>> deflate encoded compression, AFAIK, for the reason I outlined above,
>> but it is sufficient for most purposes, since most tools cannot
>> generate LZW-compressed pdf's.
>>
>> HTL
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] compress defaults for save() and save.image()

2006-03-30 Thread Roger D. Peng

Prof Brian Ripley wrote:
> I have changed the default in save() to compress = !ascii.  This seems 
> quite safe, as almost always save() is called explicitly and people will
> appreciate that it might take a little time to save large objects (and 
> depending on your system, compression could even be faster).

I'm in favor of such a change.  I almost always explicitly set `compress = 
TRUE'.  When I don't use it, it's because I want to be able to load large 
objects quickly and I've noticed that loading uncompressed workspaces can be 
quite a bit faster.  Usually though, the savings in disk space is worth the 
small penalty in loading time.

> 
> Should we also change the default in save.image()?  That is almost always 
> used implicitly, via q(), a menu   There are arguments that it is a 
> more serious change so should not be done at the end of the release cycle, 
> and also that large .RData files are something people would want to avoid.

I rarely use `save.image()' except to occasionally dump data during a long run 
for crash recovery purposes.  I don't think changing the defaults would make a 
difference to me.

> 
> BTW, the defaults can be changed via options() (see ?save): has anyone 
> ever found that useful?

I was not even aware of this!

> 
> And whilst I am feeling curious, has anyone used save(ascii = TRUE) in 
> recent years?
> 

I don't think I've ever used this feature.

-roger

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Return function from function with minimal environment

2006-04-04 Thread Roger D. Peng

In R 2.3.0-to-be, I think you can do

foo <- function(huge) {
scale <- mean(huge)
g <- function(x) { scale * x }
environment(g) <- emptyenv()
g
}

-roger

Henrik Bengtsson wrote:
> Hi,
> 
> this relates to the question "How to set a former environment?" asked
> yesterday.  What is the best way to to return a function with a
> minimal environment from a function? Here is a dummy example:
> 
> foo <- function(huge) {
>   scale <- mean(huge)
>   function(x) { scale * x }
> }
> 
> fcn <- foo(1:10e5)
> 
> The problem with this approach is that the environment of 'fcn' does
> not only hold 'scale' but also the memory consuming object 'huge',
> i.e.
> 
> env <- environment(fcn)
> ll(envir=env)  # ll() from R.oo
> #   member data.class dimension object.size
> # 1   hugenumeric   100 428
> # 2  scalenumeric 1  36
> 
> save(env, file="temp.RData")
> file.info("temp.RData")$size
> # [1] 2007624
> 
> I generate quite a few of these and my 'huge' objects are of order
> 100Mb, and I want to keep memory usage as well as file sizes to a
> minimum.  What I do now, is to remove variable from the local
> environment of 'foo' before returning, i.e.
> 
> foo2 <- function(huge) {
>   scale <- mean(huge)
>   rm(huge)
>   function(x) { scale * x }
> }
> 
> fcn <- foo2(1:10e5)
> env <- environment(fcn)
> ll(envir=env)
> #   member data.class dimension object.size
> # 1  scalenumeric 1  36
> 
> save(env, file="temp.RData")
> file.info("temp.RData")$size
> # [1] 156
> 
> Since my "foo" functions are complicated and contains many local
> variables, it becomes tedious to identify and remove all of them, so
> instead I try:
> 
> foo3 <- function(huge) {
>   scale <- mean(huge);
>   env <- new.env();
>   assign("scale", scale, envir=env);
>   bar <- function(x) { scale * x };
>   environment(bar) <- env;
>   bar;
> }
> 
> fcn <- foo3(1:10e5)
> 
> But,
> 
> env <- environment(fcn)
> save(env, file="temp.RData");
> file.info("temp.RData")$size
> # [1] 2007720
> 
> When I try to set the parent environment of 'env' to emptyenv(), it
> does not work, e.g.
> 
> fcn(2)
> # Error in fcn(2) : attempt to apply non-function
> 
> but with the new.env(parent=baseenv()) it works fine. The "base"
> environment has the empty environment as a parent.  So, I try to do
> the same myself, i.e. new.env(parent=new.env(parent=emptyenv())), but
> once again I get
> 
> fcn(2)
> # Error in fcn(2) : attempt to apply non-function
> 
> Apparently, I do not understand enough here.  Please, enlighten me. In
> the meantime I stick with foo2().
> 
> Best,
> 
> Henrik
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] request to add argv[0]

2006-04-04 Thread Roger D. Peng



ivo welch wrote:
> Dear R Developers:  This has come up repeatedly in the r-help mailing
> list, most recently in a thread started by myself.  The answers have
> been changing over the years.  Would it be possible and easy for R to
> offer a global read-only option that gives the name of the currently
> executing R script, i.e., the equivalent of argv[0] in C?

Isn't that just `commandArgs()[1]'?  I must be misunderstanding your question.

> 
>   (PS: An even better mechanism would be the ability to pick off multiple
>arguments following the .R file, different from commandArgs(),
>but this is not as important and probably more difficult as it would change
>the invokation syntax of R.)
> 
> sincerely,
> 
> /ivo welch
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R_PAPERSIZE and LC_PAPER

2006-04-20 Thread Roger D. Peng

Papersize can be set at compile time in the 'config.site' file (R_PAPERSIZE).

-roger

Marc Schwartz (via MN) wrote:
> Prof. Ripley,
> 
> Happy to help.
> 
> So, it sounds like we are thinking along the same lines then.
> 
> A couple of follow up questions:
> 
> 1. Is R_PAPERSIZE_DEFAULT to be the proposed new compile time setting in
> 2.4.0? Unless I missed it, I did not see it documented anywhere (ie.
> R-admin/NEWS for 2.2.1 patched or 2.3.0 devel) and it is not in the
> configure related files that I have here.
> 
> 2. For LC_ALL, it is not set (at least on my FC4 system, have not had
> the time yet to go to FC5) in en_US.UTF-8. Is it set in other locales
> such that it would be of value?
> 
> 
> Thanks also for the pointer to the devel guidelines. I had read through
> them at some point in the past, but it has been a while.
> 
> Regards,
> 
> Marc
> 
> On Thu, 2006-04-20 at 18:48 +0100, Prof Brian Ripley wrote:
>> Marc,
>>
>> Thanks for the comments.  The 2.3.x series is in feature freeze, and 
>> although a few features do break though for patch releases, they had 
>> better be `badly needed' see 
>> http://developer.r-project.org/devel-guidelines.txt).
>>
>> So I was thinking of 2.4.0.
>>
>> My suggestion was going to be along the lines of
>>
>> local({
>> papersize <- as.vector(Sys.getenv("R_PAPERSIZE"))
>> if(!nchar(papersize)) {
>>  lcpaper <- Sys.getlocale("LC_PAPER")
>>  if(nchar(lcpaper))
>>  papersize <- if(length(grep(, lcpaper)) > 0) "letter" else "a4"
>>  else papersize <- as.vector(Sys.getenv("R_PAPERSIZE_DEFAULT"))
>> }
>> options(papersize = papersize)
>> })
>>
>> This is unchanged if LC_PAPER is unset.  For those with LC_PAPER set,
>> its value takes precedence over the compile-time default.  That's almost 
>> exactly equivalent to what happens on Windows (which sets LC_MONETARY for 
>> this purpose, as LC_PAPER is not a locale category there).
>>
>> Now, one could argue that if LC_PAPER is unset it should default to 
>> LC_ALL, but I think is less desirable.
>>
>> Of course, at present Sys.getlocale("LC_PAPER") is not supported, so 
>> that's part of the TODO.
>>
>> Brian
>>
>>
>> On Thu, 20 Apr 2006, Marc Schwartz (via MN) wrote:
>>
>>> On Thu, 2006-04-20 at 08:09 +0100, Prof Brian Ripley wrote:
>>>> R uses the environment variable R_PAPERSIZE to set its papersize, e.g. for
>>>> postscript.
>>>>
>>>> It seems the modern way is to via LC_PAPER, e.g.
>>>>
>>>> http://mail.nl.linux.org/linux-utf8/2002-05/msg00010.html
>>>>
>>>> and Googling will show that people expect this to work.
>>>>
>>>> However, that is not set on my FC3 system, and it would affect people who
>>>> use en_US as their locale in, say, Austria.
>>>>
>>>> Should we be making use of LC_PAPER, or would it just cause further
>>>> complications?  (On Windows, the locale name is used to set the default
>>>> papersize, but there it is unlikely to be set inappropriately.)
>>>
>>> Here's my 0.0162 Euros (at current conversion rates):
>>>
>>> For R 2.4.0, announce that LC_PAPER will become the default environment
>>> variable used to set the default R papersize and then not set
>>> R_PAPERSIZE by default (ie. in build scripts, etc.)
>>>
>>> However, If someone sets R_PAPERSIZE in their site or local profile,
>>> this will supercede the LC_PAPER setting. This would allow for a R
>>> setting that may need to be different than the system default.
>>>
>>> Doing this for 2.4.0 (as opposed to 2.3.x) would give folks notice and
>>> time to consider the impact on their local installations and code, while
>>> enabling future users to take advantage of the standard.
>>>
>>> I think that in general, R should abide by published standards unless
>>> there are very compelling reasons not to.
>>>
>>> HTH,
>>>
>>> Marc Schwartz
>>>
>>>
>>>
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] difference in rbind() [print?] behavior between 2.2.1 and 2.3.0

2006-04-27 Thread Roger D. Peng

I'm a little confused by a change in behavior from 2.2.1 to 2.3.0.  In 2.2.1 I 
could do

 > ## Create a list of data frames in 2.2.1
 > b <- list(x = data.frame(a = 1, b = 2), y = data.frame(a = 1, b = 2))
 > do.call("rbind", b)
   a b
x 1 2
y 1 2

But in 2.3.0 I get

 > do.call("rbind", b)
Error in data.frame(a = c("1", "1"), b = c("2", "2"), check.names = FALSE,  :
 row names contain missing values

Traceback indicates that the error is actually in the print method.

 > d <- do.call("rbind", b)
 > d
Error in data.frame(a = c("1", "1"), b = c("2", "2"), check.names = FALSE,  :
 row names contain missing values

But:

 > d[1:2, ]
 a b
NA  1 2
NA1 1 2
 >

I'm not sure those are the intended row names but I'm not sure.  The following 
does seem to work as I would have expected:

 > b <- list(x = data.frame(a = 1:2, b = 2:3), y = data.frame(a = 1:2, b = 
 > 2:3)) 
 > do.call("rbind", b)
 a b
x.1 1 2
x.2 2 3
y.1 1 2
y.2 2 3
 >


-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] delayedAssign and interrupts

2006-05-19 Thread Roger D. Peng

I noticed something recently that I thought was odd:

delayedAssign("x", { Sys.sleep(5); 1 })
x  ## Hit Ctrl-C within the first second or 2

gives me:

 > delayedAssign("x", { Sys.sleep(5); 1 })
 > x  ## Hit Ctrl-C within the first second or two

 > x
Error: recursive default argument reference
 >

My only problem here is that now I'm stuck---there's no way to recover whatever 
'x' was supposed to be (i.e. 1).

In reality, I want 'x' to be a promise to load a moderately large data object. 
But if I (or a user) Ctrl-C's during the load I'll have to start from scratch. 
Is there anyway to recover the promise (or the value of the expression) in case 
of an interrupt?

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] delayedAssign and interrupts

2006-05-19 Thread Roger D. Peng



Luke Tierney wrote:
> On Fri, 19 May 2006, Duncan Murdoch wrote:
> 
>> On 5/19/2006 9:54 AM, Roger D. Peng wrote:
>>> I noticed something recently that I thought was odd:
>>>
>>> delayedAssign("x", { Sys.sleep(5); 1 })
>>> x  ## Hit Ctrl-C within the first second or 2
>>>
>>> gives me:
>>>
>>> > delayedAssign("x", { Sys.sleep(5); 1 })
>>> > x  ## Hit Ctrl-C within the first second or two
>>>
>>> > x
>>> Error: recursive default argument reference
>>> >
>>>
>>> My only problem here is that now I'm stuck---there's no way to 
>>> recover whatever
>>> 'x' was supposed to be (i.e. 1).
>>>
>>> In reality, I want 'x' to be a promise to load a moderately large 
>>> data object.
>>> But if I (or a user) Ctrl-C's during the load I'll have to start from 
>>> scratch.
>>> Is there anyway to recover the promise (or the value of the 
>>> expression) in case
>>> of an interrupt?
>>
>> I don't know of one.  Normally substitute(x) is supposed to retrieve the
>>  promise expression, but by a strange quirk of history, it does not
>> work when x is in .GlobalEnv.
>>
>> I'd say the behaviour you're seeing is a bug.  If I do
>>
>> > x <- 2
>> > x <- {Sys.sleep(1); 1}  # Break before complete
>>
>> > x
>> [1] 2
>>
>> nothing is changed about x.  I would think the same thing should happen
>> when x is a promise:  if the evaluation of the promised expression
>> fails, the promise should not be changed.
> 
> I don't think this is a clear as you make it out--given that these
> uses of promises often have side effects, and some of those side
> effects may have occurred prior to an error, it isn't clear that
> pretending like no evaluation had happened is the right way to go.
> 
> It should not be too hard to write a delayedAssignmentReset function
> if that is really useful; alternatively a user of delayedAssign should
> be able to arrange via tryCatch to chatch interrupts and re-install
> the delayed assignment if one occurs.

This was my original thought, and I think it would be possible to use tryCatch 
to reinstall the delayed assignment.  However, I couldn't figure out how to 
have 
the reinstalled expression to be able to catch interrupts without getting into 
some infinite expression  Perhaps I need to investigate it further.

> 
> It might not be a bad idea for us to look into the promise evaluation
> internals and see if we should/can separate the promise black-holing
> from detection of recursive default argument references to get more
> reasonable error messages in these situations and maybe allow
> resetting more gnerally.  But anything done here had better keep
> efficiency in mind since this is prety core to R function call
> evaluation.  I may try to look into this when I get back to workign on
> R internals.
> 
> luke
> 
> 
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] memory profiling

2006-05-26 Thread Roger D. Peng

I'm interested in playing around with memory profiling in R-devel (as described 
at http://developer.r-project.org/memory-profiling.html) and was trying to 
figure out how to compile R-devel so that I can use the 'tracemem()' function. 
But I can't figure out how/where to set R_MEMORY_PROFILING.  Is it on the 
configure command line?

Thanks for any help,
-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] normalizePath() warning

2006-06-12 Thread Roger D. Peng

I've been getting the following warning after running 'install.packages()' 
recently:

Warning message:
insufficient OS support on this platform in: normalizePath(path)

Does anyone know what this means?  And does anyone know how I can get rid of 
the 
warning?  I've just installed R on a fresh FC5 system so I feel I might have 
forgotten to install a package/library or something.

Thanks,
-roger

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] normalizePath() warning

2006-06-13 Thread Roger D. Peng



Prof Brian Ripley wrote:
> On Mon, 12 Jun 2006, Duncan Murdoch wrote:
> 
>> On 6/12/2006 4:53 PM, Roger D. Peng wrote:
>>> I've been getting the following warning after running 
>>> 'install.packages()' recently:
>>>
>>> Warning message:
>>> insufficient OS support on this platform in: normalizePath(path)
>>>
>>> Does anyone know what this means?  And does anyone know how I can get 
>>> rid of the
>>> warning?  I've just installed R on a fresh FC5 system so I feel I 
>>> might have
>>> forgotten to install a package/library or something.
>>
>> This is printed when your R source was compiled without these defines:
>>
>> #if defined(HAVE_GETCWD) && defined(HAVE_REALPATH)
>>
>> I guess you can look in your config log to see why you don't have those.
> 
> This worked for me on a vanilla FC5 system.  I guess the issue is likely 
> to be realpath, whose definition is enclosed by
> 
> #if defined __USE_BSD || defined __USE_XOPEN_EXTENDED
> #endif
> 
> so I would review the compiler options used:  -std=gnu99 is recommended 
> (and selected by default).  I do get the warning with -std=c89 and 
> -std=c99.

Indeed, my modification of CFLAGS was the problem.  However, if I may offer a 
meager defense, the comments for CFLAGS in 'config.site' say:

If unspecified, defaults to '-g -O2 -std=c99' for gcc

-roger

> 
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Compiling R with ACML on FC5, gcc 4.1.1

2006-06-19 Thread Roger D. Peng

I'm trying to compile R with the AMD core math library (ACML) on Fedora Core 5 
with gcc 4.1.1 and am having difficulty getting R to recognize it.  I'm using 
the file

acml-3-1-0-gfortran-64bit.tgz

from the AMD website.  On Fedora Core 4 I had no trouble getting this to work 
(with gcc 4.0.2) but I seem to be missing something now.  Has anyone had any 
problems and/or luck?

Thanks for any tips,
-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] how to start local scope?

2006-07-27 Thread Roger D. Peng

Try

f <- function() {
local({
b <- 1
})
b + 2
}


Tamas K Papp wrote:
> How can I start a local scope inside an R function?  Eg in
> 
>> b
> Error: object "b" not found
>> f <- function() {
> +   {
> + b <- 1
> +   }
> +   b+2
> + }
>> f()
> [1] 3
> 
> I would like f() to report an error, not finding b.  I am thinking
> about something like let in Scheme/Lisp.
> 
> Thanks,
> 
> Tamas
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] S4 classes and objects -- fixed structure? No...

2006-08-28 Thread Roger D. Peng

I think you're right---this shouldn't happen in theory, but it does because of 
the internal representation of S4 objects in R.

In R devel (to be 2.4.0), this changes and I believe your example will no 
longer 
work.

-roger

Jörg Beyer wrote:
> Hello.
> 
> Suppose you define a new S4-class, say
>>  setClass("track", representation(x="numeric", y="numeric"))
> 
> Don't worry if you have a deja vu, it's from the help page.
> Your new class is said to have a fixed structure: two slots, x, and y,
> and that should apply to all objects you construct as members of that class.
>>  tr <- new( "track" )
> 
> Now do the following:
>>  tr[ "ping" ] <- "pong"
>>  tr$bingo <- "bongo"
>>  tr[[ "blaa" ]] <- c( 200, 300 )
> 
> Of course you can use the well known operators to access these "list entries
> in a S4-class object":
>>  tr[ "blaa" ]
>>  etc. 
> 
> You see what can happen if you decide to do a bit stress testing. The
> question is not whether my examples makes sense or not. The question is if
> these examples should be possible at all.
> I wonder which is true,
> -- the theory (Chambers, 1998, p. 279ff; Venables and Ripley, 2000, p. 99ff:
> "... All objects in a [S4] class must have the same structure. ..."; etc.)
> -- or the actual implementation in R (see my example, which successfully
> violates the design principles of the language)
> 
> Bug, or feature? Any clarifications are appreciated -- it may be my
> half-cooked knowledge that I find this confusing and dangerous.
> 
> Best 
> 
> Joerg Beyer
> 
> P.S.: 
> Oh, the specs: PowerMac G4/400 PCI -- 1GB RAM -- Mac OS X 10.4.6 -- R 2.2.1
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] serialize changes for 2.4.0

2006-09-05 Thread Roger D. Peng

I noticed today that in R 2.3.1, I get

 > serialize(list(1,2,3), NULL, ascii = TRUE)
[1] "A\n2\n131841\n131840\n19\n3\n14\n1\n1\n14\n1\n2\n14\n1\n3\n"
 >

but in R 2.4.0 alpha I get

 > serialize(list(1,2,3), NULL, ascii = TRUE)
  [1] 41 0a 32 0a 31 33 32 30 39 36 0a 31 33 31 38 34 30 0a 31 39 0a 33 0a 31 34
[26] 0a 31 0a 31 0a 31 34 0a 31 0a 32 0a 31 34 0a 31 0a 33 0a
 >

It seems I need to use 'rawToChar()' to get the character vector that I used to 
get in R 2.3.1.

Is this intentional?  I couldn't find any mention of this change in the NEWS 
file; from the docs, it seems to me that either return value could be correct.

 > version
_
platform   x86_64-unknown-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status alpha
major  2
minor  4.0
year   2006
month  09
day05
svn rev39121
language   R
version.string R version 2.4.0 alpha (2006-09-05 r39121)
 >

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] standardization of slot access

2006-09-26 Thread Roger D. Peng

I think slot() is only necessary in the case where you have

a <- "myslot"
slot(object, a)

which is equivalent to

[EMAIL PROTECTED]

It seems unlikely that you would use slot() during interactive use, but perhaps 
more so in programming.  Even still, I rather infrequently find the need for 
slot() because classes have defined slot names---if you're going to access a 
slot, just use the name that you gave it in the class definition since that 
doesn't change from instance to instance.

-roger

Sebastian P. Luque wrote:
> Hi,
> 
> I'm usually confused about when to use 'slot' or '@'.  I've frequently
> read that it's always preferable to use accessor functions, so I would
> think the '@' operator should be avoided.  However, ?slot contains the
> following advise:
> 
> 
> "Generally, the only reason to use the functional form rather than the
> simpler operator is _because_ the slot name has to be computed."
> 
> 
> How do we decide whether to use the function or the operator?
> 
> 
> Cheers,
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Error condition in evaluating a promise

2006-10-18 Thread Roger D. Peng

I've encountered a (I think) related problem when using promises to load 
relatively large datasets.  For example something like

delayedAssign("x", getBigDataset())

runs into the same problem if you hit Ctrl-C while 'x' is being evaluated for 
the first time.  Afterwards, there's no way to retrieve the dataset associated 
with 'x'.

Active bindings work in this case, but the problem is that I usually only want 
to load a large dataset once.

-roger

Robert Gentleman wrote:
> 
> Simon Urbanek wrote:
>> Seth,
>>
>> thanks for the suggestions.
>>
>> On Oct 18, 2006, at 11:23 AM, Seth Falcon wrote:
>>
>>> Simon Urbanek <[EMAIL PROTECTED]> writes:
>>>> thanks, but this is not what I want (the symbols in the environment
>>>> are invisible outside) and it has nothing to do with the question I
>>>> posed: as I was saying in the previous e-mail the point is to have
>>>> exported variables in a namespace, but their value is known only
>>>> after the namespace was attached (to be precise I'm talking about
>>>> rJava here and many variables are valid only after the VM was
>>>> initialized - using them before is an error).
>>> We have a similar use case and here is one workaround:
>>>
>>> Define an environment in your name space and use it to store the
>>> information that you get after VM-init.
>>>
>>> There are a number of ways to expose this:
>>>
>>> * Export the env and use vmEnv$foo
>>>
>>> * Provide accessor functions, getVmFooInfo()
>>>
>>> * Or you can take the accessor function approach a bit further to make
>>>   things look like a regular variable by using active bindings.  I can
>>>   give more details if you want.  We are using this in the BSgenome
>>>   package in BioC.
>>>
>> I'm aware of all three solutions and I've tested all three of them  
>> (there is in fact a fourth one I'm actually using, but I won't go  
>> into detail on that one ;)). Active bindings are the closest you can  
>> get, but then the value is retrieved each time which I would like to  
>> avoid.
>>
>> The solution with promises is very elegant, because it guarantees  
>> that on success the final value will be locked. It also makes sense  
>> semantically, because the value is determined by code bound to the  
>> variable and premature evaluation is an error - just perfect.
>>
>> Probably I should have been more clear in my original e-mail - the  
>> question was not to find a work-around, I have plenty of them ;), the  
>> question was whether the behavior of promises under error conditions  
>> is desirable or not (see subject ;)). For the internal use of  
>> promises it is irrelevant, because promises as function arguments are  
>> discarded when an error condition arises. However, if used in the  
>> "wild", the behavior as described would be IMHO more useful.
>>
> 
>Promises were never intended for use at the user level, and I don't 
> think that they can easily be made useful at that level without exposing 
> a lot of stuff that cannot easily be explained/made bullet proof.  As 
> Brian said, you have not told us what you want, and I am pretty sure 
> that there are good solutions available at the R level for most problems.
> 
>Although the discussion has not really started, things like dispatch 
> in the S4 system are likely to make lazy evaluation a thing of the past 
> since it is pretty hard to dispatch on class without knowing what the 
> class is. That means, that as we move to more S4 methods/dispatch we 
> will be doing more evaluation of arguments.
> 
> best wishes
>   Robert
> 
> 
>> Cheers,
>> Simon
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Is there a version of RMySQL for Windows?

2006-11-27 Thread Roger D. Peng

There are some pre-built binaries here:

http://stat.bell-labs.com/RS-DBI/download/index.html

-roger

Michal Okoniewski wrote:
> I was trying to email directly the developer, David A. James, but all
> the emails bounce...
> Does anyone know if RMySQL  may be re-compiled under Windows
> and what are the limitations?
>  
> Cheers, 
> Michal
>  
> 
> 
>  
> This email is confidential and intended solely for the use o...{{dropped}}
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Trailing message on R CMD BATCH

2007-01-10 Thread Roger D. Peng

I do find the timing useful and would prefer it to be retained unless it causes 
a problem elsewhere.

-roger

Brian Ripley wrote:
> Unix versions of R CMD BATCH have reported proc.time() unless the script 
> ends in q().  E.g. if the input is 'search()' the output is
> 
>> invisible(options(echo = TRUE))
>> search()
> [1] ".GlobalEnv""package:stats" "package:graphics"
> [4] "package:grDevices" "package:utils" "package:datasets"
> [7] "package:methods"   "Autoloads" "package:base"
>> proc.time()
> [1] 1.053 0.067 1.109 0.000 0.000
> 
> This was undocumented, and not shared by the Windows version.
> 
> Is it useful?
> Do people want it retained?
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] How to evaluate an lm() object for generating warning statement in a function

2007-01-23 Thread Roger D. Peng

There are two options, you can check to see if 'object.lm' inherits from "lm" 
using 'inherits()' or make comp.var.estimates a generic and then make a method 
for "lm" objects.

-roger

Michael Rennie wrote:
> Hi, there
> 
> I tried this post on R-help but did not generate any replies, so I thought 
> I might try the waters here since it's a more programming-based group.
> 
> I have written a function that will allow me to calculate the variance
> components for a model II (random effects) single factor ANOVA (perhaps 
> this is
> me re-inventing the wheel, but I handn't yet come across anything in R that
> does this for me, but I'm sure people will let me know from this posting if
> there are). The function takes as it's argument an object which holds the
> results of an lm call. e.g.,
> 
> object1<-lm(y~x)
> 
> The function is as follows (some comments included):
> 
> comp.var.estimates<-function(object.lm)
>  {
>  anovmod<-anova(object.lm) #get the anova table for the lm
>  MStreat<-anovmod[1,3]; MSErr<-anovmod[2,3] #extract Mean 
> Squares
>  dataframe<- as.data.frame(object.lm[13]) #gets the data 
> that went into the lm
>  ni <- tapply(dataframe[,1], dataframe[,2], length) #number 
> of cases per treatment level
>  nisq<-ni^2
>  no<-(1/(length(ni)-1))*(sum(ni)-(sum(nisq)/sum(ni))) 
> #required to calculate variance components
>  s2a<-((MStreat-MSErr)/no)
>  stot<-s2a + MSErr
>  treatvar<-s2a/stot*100  #calculate variance components as 
> a percentage of the total
>  errorvar<-MSErr/stot*100
>  list(treat.var.comp=s2a, 
> err.var.comp=MSErr,
>  p.var.treat=treatvar, p.var.err=errorvar)
>  }
> comp.var.estimates(object1)
> 
> I'd like to include a "warning" statement in the function that
> returns something like 'function requires arguments that are objects of the 
> form obj<-lm
> (y~x)', but I need a case to evaluate the object against in order to throw the
> warning.
> 
> Any advice?
> 
> I feel like my best opportunity is after the first line into the function,
> where I ask for the ANOVA table. Since this is a 2 X 5 table, presumably I
> should be able to evaluate it against the size of that table? Any thoughts on
> how to check that? I welcome any suggestions.
> 
> Cheers,
> 
> Mike
> 
> 
> 
> 
> Michael Rennie
> Ph.D. Candidate, University of Toronto at Mississauga
> 3359 Mississauga Rd. N.
> Mississauga, ON  L5L 1C6
> Ph: 905-828-5452  Fax: 905-828-3792
> www.utm.utoronto.ca/~w3rennie 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] [PATCH] Typo in 'help' documentation

2008-04-30 Thread Roger D. Peng [audrey]

---
I think this should be "package is loaded" and not "library is
loaded".  At least, I can't see how it can be correct the way it's
currently written.

-roger

 src/library/utils/man/help.Rd |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git src/library/utils/man/help.Rd src/library/utils/man/help.Rd
index 5b0ed8e..783b0c2 100644
--- src/library/utils/man/help.Rd
+++ src/library/utils/man/help.Rd
@@ -135,7 +135,7 @@ type?topic
 \note{
   Unless \code{lib.loc} is specified explicitly, the loaded packages are
   searched before those in the specified libraries.  This ensures that
-  if a library is loaded from a library not in the known library trees,
+  if a package is loaded from a library not in the known library trees,
   then the help from the loaded library is used.  If \code{lib.loc} is
   specified explicitly, the loaded packages are \emph{not} searched.
 
-- 
1.5.5.1.99.gf0ec4

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

62 matches

Mail list logo