[Rd] Behavior or as.environment in function arguments/call (and force() behaviors...)

2013-01-01 Thread Jeff Ryan
Happy 2013!

Can someone with more knowledge of edge case scoping/eval rules explain
what is happening below?  Happens in all the versions of R I have on hand.

Behavior itself is confusing, but ?as.environment also provides no clue.
 The term used in that doc is 'search list', which is ambiguous, but the
see also section mentions search(), so I would *think* that is what is
intended.  Either way Fn1() below can't really be explained.

Major question is what in the world is Fn1 doing, and why is Fn2 not equal
to Fn3? [ Fn3/Fn4 are doing what I want. ]


Fn1 <- function(x="test",pos=-1,env=as.environment(pos)) {
  ls(env)
}

Fn2 <- function(x="test",pos=-1,env=as.environment(pos)) {
  force(env)
  ls(env)
}

Fn3 <- function(x="test",pos=-1,env=as.environment(pos)) {
  # should be the same as force() in Fn2, but not
  # ?force
  # Note:
  #
  #   This is semantic sugar: just evaluating the symbol will do the
  #   same thing (see the examples).
  env
  ls(env)
}

Fn4 <- function(x="test",pos=-1,env=as.environment(pos)) {
  # same as Fn3
  env <- env
  ls(env)
}

Fn1()
Fn2()
Fn3()
Fn4()
ls()

## output #
> Fn1()
[1] "doTryCatch" "expr"       "handler"    "name"       "parentenv"

> Fn2()
[1] "env" "pos" "x"

> Fn3()
[1] "Fn1" "Fn2" "Fn3" "Fn4"

> Fn4()
[1] "Fn1" "Fn2" "Fn3" "Fn4"

### .GlobalEnv
> ls()
[1] "Fn1" "Fn2" "Fn3" "Fn4"

> R.version
               _
platform       x86_64-apple-darwin11.2.0
arch           x86_64
os             darwin11.2.0
system         x86_64, darwin11.2.0
status
major          2
minor          15.1
year           2012
month          06
day            22
svn rev        59600
language       R
version.string R version 2.15.1 (2012-06-22)
nickname       Roasted Marshmallows

> R.version
               _
platform       x86_64-apple-darwin11.2.0
arch           x86_64
os             darwin11.2.0
system         x86_64, darwin11.2.0
status         Under development (unstable)
major          3
minor          0.0
year           2012
month          12
day            28
svn rev        61464
language       R
version.string R Under development (unstable) (2012-12-28 r61464)
nickname       Unsuffered Consequences

-- 
Jeffrey Ryan
jeffrey.r...@lemnica.com

www.lemnica.com


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Behavior or as.environment in function arguments/call (and force() behaviors...)

2013-01-01 Thread Duncan Murdoch

On 13-01-01 4:35 PM, Jeff Ryan wrote:

> Happy 2013!
>
> Can someone with more knowledge of edge case scoping/eval rules explain
> what is happening below?  Happens in all the versions of R I have on hand.


Even though it is used as a default in a number of places, the pos==-1 
value is really poorly documented.  You need to look in the source, in 
particular src/main/envir.c, function pos2env.  There you'll see that 
pos==-1 is special cased to be the environment from which pos.to.env
(or as.environment in your case) was called.  For non-negative values, 
it indexes the search list (i.e. the list returned by search().)  Other 
values are errors.


The trouble in your examples is that this location varies.  In Fn1, it 
is being called in the ls() call.  In Fn2, it is in the force() call. 
In Fn3 and Fn4, it's the Fn3/Fn4 call.


In spite of what the docs say in ?get, I would rarely if ever use a pos 
argument to as.environment.  Use an environment and pass it as envir.
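The advice above can be sketched as follows. This is only a minimal illustration, not code from the thread: the helper name list_objects and the sample environment e are hypothetical, and the default of globalenv() is an assumption chosen to match what Fn3/Fn4 returned when called from the top level.

```r
# Hypothetical helper: takes an environment directly instead of a 'pos' index.
list_objects <- function(envir = globalenv()) {
  # 'envir' is already an environment, so the result does not depend on
  # where or when the argument happens to be evaluated.
  ls(envir)
}

# Build a small environment to list.
e <- new.env()
assign("a", 1, envir = e)
assign("b", 2, envir = e)

list_objects(e)   # lists the names defined in 'e'
```

Because the caller hands over the environment itself, nothing is resolved relative to the call stack.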


Duncan Murdoch





Re: [Rd] Behavior or as.environment in function arguments/call (and force() behaviors...)

2013-01-01 Thread Jeff Ryan
Thanks Duncan.

I was hoping that pos= would be more useful, but envir is a bit easier to
grasp in terms of consistency.

The lazy eval was also pointed out to me off-list as a source of the
'weirdness' if you will, which makes perfect sense in retrospect.
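A small sketch of the lazy evaluation in question, written for illustration only: record() and lazy_demo() are hypothetical names, and the point is just that a default argument is evaluated at its first use in the body, not when the function is called.

```r
# Hypothetical logger used only to record the order of evaluation.
events <- character(0)
record <- function(label) {
  events <<- c(events, label)
  invisible(label)
}

lazy_demo <- function(arg = record("default evaluated")) {
  record("body started")
  arg                      # first use: the default expression runs here
  record("body finished")
  invisible(NULL)
}

lazy_demo()
events   # "body started" comes before "default evaluated"
```

With a default such as as.environment(pos), the same delay means the expression runs wherever the argument is first touched.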

Thanks for the prompt clarification, and I'm glad I wasn't just
missing/misreading the primary docs.

Best,
Jeff




Re: [Rd] as.matrix.Surv -- R core question/opinions

2013-01-01 Thread Thomas Lumley
I agree that it's cleaner to remove the extra attributes -- that's the
point of as.matrix.Surv(), to produce a matrix that doesn't include any
extra details of how Surv objects are implemented.
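As a rough illustration of the "sanitizing" behaviour under discussion, here is a sketch using a toy class. The class name toySurv and its "type" attribute are hypothetical stand-ins, not the survival package's actual implementation; in a package the method would also need its S3method line in the NAMESPACE file, as described in the quoted message below.

```r
# Toy stand-in for a Surv-like object: a matrix plus a "type" attribute
# and a class. 'toySurv' is hypothetical, not the survival package's class.
x <- structure(matrix(1:4, nrow = 2), type = "right", class = "toySurv")

# Method that drops the class and the extra attribute, keeping only the matrix.
as.matrix.toySurv <- function(x, ...) {
  y <- unclass(x)
  attr(y, "type") <- NULL
  y
}

names(attributes(as.matrix.default(x)))  # default method returns x unchanged
names(attributes(as.matrix.toySurv(x)))  # only "dim" remains
```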

   -thomas


On Fri, Dec 7, 2012 at 3:48 AM, Terry Therneau wrote:

> 1. A Surv object is a matrix with some extra attributes.  The
> as.matrix.Surv function removes the extras but otherwise leaves it as is.
>
> 2. The last several versions of the survival library were accidentally
> missing the S3method('as.matrix', 'Surv') line from their NAMESPACE file.
> (Instead its position is held by a duplicate of the line just above it in
> the NAMESPACE file, suggesting a copy/paste error).  As a consequence the
> as.matrix.Surv function was effectively ignored, and the default method was
> being used.
> The as.matrix.default function leaves anything with a "dim" attribute
> alone.
>
> 3. In my current about-to-submit-to-CRAN  version of survival the missing
> NAMESPACE line was restored.  This breaks one function in one package (rms)
> which calls "as.matrix(y)" on a Surv object but then later looks at the
> "type" attribute of y.
>
>  So now to the design question: should the as.matrix.Surv function
> "sanitize" the result by removing the extra attributes, or should it leave
> them alone?  The first seems cleaner; my accidental multi-year test of
> leaving them in, however, clearly shows that it does no harm.
>
> Terry T.
>
>



-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [Rd] Understanding svd usage and its necessity in generalized inverse calculation

2013-01-01 Thread Thomas Lumley
On Thu, Dec 6, 2012 at 12:58 PM, Paul Johnson wrote:

> Dear R-devel:
>
> I could use some advice about matrix calculations and steps that might
> make for faster computation of generalized inverses. It appears in
> some projects there is a bottleneck at the use of svd in calculation
> of generalized inverses.
>
> Here's some Rprof output I need to understand.
>
> > summaryRprof("Amelia.out")
> $by.self
>          self.time self.pct total.time total.pct
> "La.svd"    150.34    27.66     164.82     30.32
>
>
[snip]


> I *Think* this means that a bottleneck here is svd, which is being

A bottleneck to a limited extent -- it's still only 30% of computation
time, so speeding it up can't save more than that.


> called by this function that calculates generalized inverses:
>
> ## Moore-Penrose Inverse function (aka Generalized Inverse)
> ##   X:symmetric matrix
> ##   tol:  convergence requirement
> mpinv <- function(X, tol = sqrt(.Machine$double.eps)) {
>   s <- svd(X)
>   e <- s$d
>   e[e > tol] <- 1/e[e > tol]
>   s$v %*% diag(e,nrow=length(e)) %*% t(s$u)
> }
>
> That is from the Amelia package, which we like to use very much.
>
> Basically, I wonder if I should use a customized generalized inverse
> or svd calculator to make this faster.
>
>

If your matrix is produced as a covariance matrix, so that it really is
symmetric positive semidefinite up to rounding error, my understanding is
that the Cholesky approach is stable in the sense that it will reliably
produce an accurate inverse or a zero-to-within-tolerance pivot and error
message.

Now, if most of the matrices you are trying to invert are actually
invertible (as I would hope), it may be quicker to use the Cholesky
approach with a fallback to the SVD for semidefinite matrices. That is,
something like

  tryCatch(
    chol2inv(chol(xx)),
    error=function(e) ginv(xx)
  )

Most of the time you will get the Cholesky branch, which is much faster
(about five-fold for 10x10 matrices on my system).  On my system and using
a 10x10 matrix the overhead in the tryCatch() is much smaller than the time
taken by either set of linear algebra, so there is a net gain as long as
even a reasonable minority of the matrices are actually invertible.

You can probably do slightly better by replacing the chol2inv() with
backsolve(): solving just the systems of linear equations you need is
usually preferable to constructing a matrix inverse.
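A minimal sketch of the backsolve() route, assuming the goal is to solve A x = b for a symmetric positive definite A rather than to form the inverse. The matrix and right-hand side are arbitrary illustrative values.

```r
# Small symmetric positive definite example (values are arbitrary).
A <- matrix(c(4, 2, 2, 3), nrow = 2)
b <- c(1, 2)

R <- chol(A)                              # upper triangular, A = t(R) %*% R
z <- backsolve(R, b, transpose = TRUE)    # solves t(R) %*% z = b
x <- backsolve(R, z)                      # solves R %*% x = z

all.equal(drop(A %*% x), b)               # check that x solves the system
```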


Note that this approach will give wrong answers without warning if the
matrices are not symmetric.
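One way to guard against that, sketched under assumptions: the wrapper name fast_inverse is hypothetical, the isSymmetric() check is an addition not proposed in the thread, and the SVD fallback reuses the mpinv() quoted above instead of ginv() so that the example needs no extra package.

```r
# Pseudoinverse via SVD, following the mpinv() quoted above.
mpinv <- function(X, tol = sqrt(.Machine$double.eps)) {
  s <- svd(X)
  e <- s$d
  e[e > tol] <- 1 / e[e > tol]
  s$v %*% diag(e, nrow = length(e)) %*% t(s$u)
}

# Hypothetical wrapper: try Cholesky first, fall back to the SVD route.
fast_inverse <- function(X) {
  stopifnot(isSymmetric(X))   # chol() reads only the upper triangle
  tryCatch(
    chol2inv(chol(X)),
    error = function(e) mpinv(X)
  )
}

pd   <- matrix(c(4, 2, 2, 3), nrow = 2)  # positive definite
sing <- matrix(1, nrow = 2, ncol = 2)    # singular, positive semidefinite

fast_inverse(pd)     # Cholesky branch
fast_inverse(sing)   # chol() fails, so the SVD fallback is used
```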

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



[Rd] sQuote() on zero-length inputs

2013-01-01 Thread Ben Bolker

  It's not desperately important, but it would seem more consistent to
me if sQuote(character(0)) or sQuote(NULL) returned character(0) rather
than "‘’" .  This could easily be achieved by putting

if (length(x)==0) return(character(0))

at the beginning ...
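Until something like that is in sQuote() itself, the same guard can be sketched as a wrapper. The name squote_safe is hypothetical and this is an illustration of the proposal, not the base R implementation.

```r
# Hypothetical wrapper around sQuote() with the proposed zero-length guard.
squote_safe <- function(x, ...) {
  if (length(x) == 0) return(character(0))
  sQuote(x, ...)
}

squote_safe(character(0))   # zero-length in, zero-length out
squote_safe(NULL)
squote_safe(c("a", "b"))    # non-empty input is passed to sQuote() as usual
```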

  Ben Bolker



Re: [Rd] Behavior or as.environment in function arguments/call (and force() behaviors...)

2013-01-01 Thread Prof Brian Ripley

On 01/01/2013 22:21, Duncan Murdoch wrote:

> On 13-01-01 4:35 PM, Jeff Ryan wrote:
>
>> Happy 2013!
>>
>> Can someone with more knowledge of edge case scoping/eval rules explain
>> what is happening below?  Happens in all the versions of R I have on
>> hand.
>
> Even though it is used as a default in a number of places, the pos==-1
> value is really poorly documented.  You need to look in the source, in


Hmm, I corrected that yesterday in R-patched and R-devel.


> particular src/main/envir.c, function pos2env.  There you'll see that
> pos==-1 is special cased to be the environment from which pos.to.env
> (or as.environment in your case) was called.  For non-negative values,
> it indexes the search list (i.e. the list returned by search().)  Other
> values are errors.


Actually for positive values: 0 is also an error.
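The positive-index behaviour can be checked with a short sketch like the following. It is an illustration only; the variable name zero_result is hypothetical and the error is captured rather than shown, since its exact message is not quoted in this thread.

```r
# Position 1 on the search list is the global environment.
identical(as.environment(1), globalenv())

# The last position on the search list is the base package.
identical(as.environment(length(search())), baseenv())

# Position 0 is rejected; capture the error instead of stopping.
zero_result <- tryCatch(as.environment(0), error = function(e) "error")
zero_result
```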



> The trouble in your examples is that this location varies.  In Fn1, it
> is being called in the ls() call.  In Fn2, it is in the force() call. In
> Fn3 and Fn4, it's the Fn3/Fn4 call.
>
> In spite of what the docs say in ?get, I would rarely if ever use a pos
> argument to as.environment.  Use an environment and pass it as envir.
>
> Duncan Murdoch






--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
