[Rd] "Undocumented code objects" message from missing alias

2010-07-11 Thread John Maindonald
If I omit the \alias{houseprices} from the second line of a data
documentation file houseprices.Rd, R CMD check gives the warning message:

"
Undocumented code objects:
 houseprices
All user-level objects in a package should have documentation entries.
See the chapter 'Writing R documentation files' in manual 'Writing R
Extensions'.
"

I can build the package without error, but the help file for houseprices
is missing.  It would be good to have a message that gives a more
direct indication of what is wrong.

Running R CMD check on the directory at 
http://www.maths.anu.edu.au/~johnm/r/packages/testpkg
demonstrates the point.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Large discrepancies in the same object being saved to .RData

2010-07-11 Thread Tony Plate
Another way of seeing the environments referenced in an object is using 
str(), e.g.:


> f1 <- function() {
+ junk <- rnorm(1000)
+ x <- 1:3
+ y <- rnorm(3)
+ lm(y ~ x)
+ }
> v1 <- f1()
> object.size(f1)
1636 bytes
> grep("Environment", capture.output(str(v1)), value=TRUE)
[1] "  .. ..- attr(*, \".Environment\")= "
[2] "  .. .. ..- attr(*, \".Environment\")= "
>

-- Tony Plate

On 7/10/2010 10:10 PM, bill.venab...@csiro.au wrote:

Well, I have answered one of my questions below.  The hidden
environment is attached to the 'terms' component of v1.

To see this:

> lapply(v1, environment)
$coefficients
NULL

$residuals
NULL

$effects
NULL

$rank
NULL

$fitted.values
NULL

$assign
NULL

$qr
NULL

$df.residual
NULL

$xlevels
NULL

$call
NULL

$terms


$model
NULL

> rm(junk, envir = with(v1, environment(terms)))
> usedVcells()
[1] 96532

This is still a bit of a trap for young (and old!) players...

I think the main point in my mind is why is it that object.size()
excludes enclosing environments in its reckonings?

Bill Venables.
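
The finding above can be sketched as follows. This is an illustration only: it reuses f1() from the thread, while the names env, vars, size_before and size_after are introduced here and are not part of the original session.

```r
# Sketch: a fitted lm keeps the calling frame alive through its terms component.
f1 <- function() {
  junk <- rnorm(1000)   # large-ish local object that is never returned
  x <- 1:3
  y <- rnorm(3)
  lm(y ~ x)
}
v1 <- f1()

# The formula stored in v1$terms carries the evaluation frame of f1().
env <- environment(v1$terms)
vars <- ls(envir = env)                      # names still held in that frame

# object.size() does not follow environments, so it cannot see 'junk'.
size_before <- as.numeric(object.size(v1))
rm("junk", envir = env)                      # drop the captured object
size_after <- as.numeric(object.size(v1))
```

Because object.size() documents that associated space such as environments is not included, the two sizes should agree even though the memory actually held by v1 has changed.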

-----Original Message-----
From: Venables, Bill (CMIS, Cleveland)
Sent: Sunday, 11 July 2010 11:40 AM
To: 'Duncan Murdoch'; 'Paul Johnson'
Cc: 'r-devel@r-project.org'; Taylor, Julian (CMIS, Waite Campus)
Subject: RE: [Rd] Large discrepancies in the same object being saved to .RData

I'm still a bit puzzled by the original question.  I don't think it
has much to do with .RData files and their sizes.  For me the puzzle
comes much earlier.  Here is an example of what I mean, using a little
session:

> usedVcells <- function() gc()["Vcells", "used"]
> usedVcells()   ### the base load
[1] 96345

### Now look at what happens when a function returns a formula as the
### value, with a big item floating around in the function closure:

   

> f0 <- function() {
+ junk <- rnorm(1000)
+ y ~ x
+ }
> v0 <- f0()
> usedVcells()   ### much bigger than base, why?
[1] 10096355
> v0   ### no obvious environment
y ~ x
> object.size(v0)  ### so far, no clue given where
                   ### the extra Vcells are located.
372 bytes

### Does v0 have an enclosing environment?

> environment(v0) ### yep.

> ls(envir = environment(v0)) ### as expected, there's the junk
[1] "junk"
> rm(junk, envir = environment(v0))  ### this does the trick.
> usedVcells()
[1] 96355

### Now consider a second example where the object
### is not a formula, but contains one.

   

> f1 <- function() {
+ junk <- rnorm(1000)
+ x <- 1:3
+ y <- rnorm(3)
+ lm(y ~ x)
+ }
> v1 <- f1()
> usedVcells()  ### as might have been expected.
[1] 10096455

### in this case, though, there is no
### (obvious) enclosing environment

> environment(v1)
NULL
> object.size(v1)  ### so where are the junk Vcells located?
7744 bytes
> ls(envir = environment(v1))  ### clearly will not work
Error in ls(envir = environment(v1)) : invalid 'envir' argument
> rm(v1) ### removing the object does clear out the junk.
> usedVcells()
[1] 96366
   
 

And in this second case, as noted by Julian Taylor, if you save() the
object the .RData file is also huge.  There is an environment attached
to the object somewhere, but it appears to be occluded and entirely
inaccessible.  (I have poked around the object components trying to
find the thing but without success.)

Have I missed something?

Bill Venables.

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On 
Behalf Of Duncan Murdoch
Sent: Sunday, 11 July 2010 10:36 AM
To: Paul Johnson
Cc: r-devel@r-project.org
Subject: Re: [Rd] Large discrepancies in the same object being saved to .RData

On 10/07/2010 2:33 PM, Paul Johnson wrote:
   

On Wed, Jul 7, 2010 at 7:12 AM, Duncan Murdoch  wrote:

 

On 06/07/2010 9:04 PM, julian.tay...@csiro.au wrote:

   

Hi developers,



After some investigation I have found there can be large discrepancies in
the same object being saved as an external "xx.RData" file. The immediate
repercussion of this is the possible increased size of your .RData workspace
for no apparent reason.




 

I haven't worked through your example, but in general the way that local
objects get captured is when part of the return value includes an
environment.

   

Hi, can I ask a follow up question?

Is there a tool to browse *.Rdata files without loading them into R?

 

I don't know of one.  You can load the whole file into an empty
environment, but then you lose information about "where did it come from"?

Duncan Murdoch
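
Loading into a fresh environment, as suggested above, might look like the sketch below. The temporary file and the objects a and b are made up for illustration, and the sizes come from object.size(), which does not count environments captured by an object, so they can understate what save() actually wrote.

```r
# Sketch: inspect the contents of an .RData file without touching the workspace.
tmp <- tempfile(fileext = ".RData")   # hypothetical file, created for the demo
a <- 1:10
b <- "text"
save(a, b, file = tmp)

e <- new.env()                        # empty environment to receive the objects
loaded <- load(tmp, envir = e)        # load() returns the names it restored

# Per-object sizes as seen by object.size() (environments are not followed).
sizes <- vapply(
  loaded,
  function(nm) as.numeric(object.size(get(nm, envir = e))),
  numeric(1)
)

file_bytes <- file.size(tmp)          # size on disk, for comparison
unlink(tmp)
```

Comparing file_bytes with sum(sizes) is one rough way to spot a file that carries more than its visible objects, though compression makes the comparison only indicative.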
   

In HDF5 (a data storage format we use sometimes), there is a CLI
program "h5dump" that will spit out line-by-line all the contents of a
storage entity.  It will literally track through all the metadata, all
the vectors of scores, etc.  I've found that handy to "see what's
really  in there" in cases like the one that OP asked a

Re: [Rd] Large discrepancies in the same object being saved to .RData

2010-07-11 Thread Prof Brian Ripley

On Sun, 11 Jul 2010, Tony Plate wrote:

Another way of seeing the environments referenced in an object is using 
str(), e.g.:



> f1 <- function() {
+ junk <- rnorm(1000)
+ x <- 1:3
+ y <- rnorm(3)
+ lm(y ~ x)
+ }
> v1 <- f1()
> object.size(f1)
1636 bytes
> grep("Environment", capture.output(str(v1)), value=TRUE)
[1] "  .. ..- attr(*, \".Environment\")= "
[2] "  .. .. ..- attr(*, \".Environment\")= "


'Some of the environments in a few cases': remember environments have 
environments (and so on), and that namespaces and packages are also 
environments.  So we need to know about the environment of 
environment(v1$terms), which also gets saved (either as a reference or 
as an environment, depending on what it is).


And this approach does not work for many of the commonest cases:


> f <- function() {
+ x <- pi
+ g <- function() print(x)
+ return(g)
+ }
> g <- f()
> str(g)
function ()
 - attr(*, "source")= chr "function() print(x)"
> ls(environment(g))
[1] "g" "x"

In fact I think it works only for formulae.


-- Tony Plate

On 7/10/2010 10:10 PM, bill.venab...@csiro.au wrote:

Well, I have answered one of my questions below.  The hidden
environment is attached to the 'terms' component of v1.


Well, not really hidden.  A terms component is a formula (see 
?terms.object), and a formula has an environment just as a closure 
does.  In neither case does the print() method tell you about it -- 
but ?formula does.




Re: [Rd] LinkingTo and C++

2010-07-11 Thread Dominick Samperi
While linking to package shared libs is not possible in general, as Simon
points out, it is possible under Windows, provided Windows knows how to find
the library linked to at runtime (this requires a customized Makefile.win).
One way this is done under Windows is simply to place the package/libs
directories containing the package shared libs to be linked to in the Windows
search path, but this may be a problem for packages released to CRAN, since
it would require updates to CRAN's PATH environment variable.

Another possibility is to load all packages containing shared libs to be
linked to before using any shared lib that is dynamically linked to them. For
example, if B.dll is dynamically linked to A.dll (and both A and B are
packages), and if foo() is a function in B.dll that uses functions in A.dll,
then this will work:

library(A)
library(B)
.Call('foo')

But this is not a very natural or convenient solution. It requires that
library() commands be used with every instance of Rscript, for example.

A better solution would be some variant of LinkingTo: A that somehow has the
same effect as setting the Windows search path to include the libs directory
containing A.dll and B.dll.

Of course, most of the issues disappear when linking to static libs instead
of dynamic ones, and it is not clear that the extra effort needed to support
dynamic libs will yield much benefit in this case.

Dominick

On Thu, Feb 11, 2010 at 12:55 PM, Simon Urbanek  wrote:

> Romain,
>
> I think you're confusing two entirely different concepts here:
>
> 1) LinkingTo: allows a package to provide C-level functions to other
> packages (see R-ext 5.4). Let's say package A provides a function foo by
> calling R_RegisterCCallable for that function. If a package B wants to use
> that function, it uses LinkingTo: and calls R_GetCCallable to obtain the
> function pointer. It does not actually link to package A because that is in
> general not possible - it simply obtains the pointers through R. In
> addition, LinkingTo: makes sure that you have access to the header files of
> package A which help you to cast the functions and define any data
> structures you may need. Since C++ is a superset of C you can use this
> facility with C++ as long as you don't depend on anything outside of the
> header files.
>
> 2) linking directly to another package's shared object is in general not
> possible, because packages are not guaranteed to be dynamic libraries. They
> are usually shared objects which may or may not be compatible with a dynamic
> library on a given platform. Therefore R-ext describes another way in
> which you may provide some library independently of the package shared
> object to other packages (see R-ext 5.8). The issue is that you have to
> create a separate library (PKG/libs[/arch]/PKG.so won't work in general!)
> and provide this to other packages. As 5.8 says, this is in general not
> trivial because it is very platform dependent and the most portable way is
> to offer a static library.
>
> To come back to your example, LinkingTo: A and B will work if you remove
> Makevars from B (you don't want to link) and put your hello method into the
> A.h header:
>
> > library (B)
> Loading required package: A
> > .Call("say_hello", PACKAGE = "B")
> [1] "hello"
>
> However, you're not really using the LinkingTo: facilities for the
> functions so it's essentially just helping you to find the header file.
>
> Cheers,
> Simon
>
>
>
> On Feb 11, 2010, at 4:08 AM, Romain Francois wrote:
>
> > Hello,
> >
> > I've been trying to make LinkingTo work when the package linked to has
> > C++ code.
> >
> > I've put dumb packages to illustrate this email here:
> > http://addictedtor.free.fr/misc/linkingto
> >
> > Package A defines this C++ class:
> >
> > class A {
> > public:
> >   A() ;
> >   ~A() ;
> >   SEXP hello() ;
> > } ;
> >
> > Package B has this function :
> >
> > SEXP say_hello(){
> >   A a ;
> >   return a.hello() ;
> > }
> >
> > headers of package A are copied into inst/include so that package B can
> > have.
> >
> > LinkingTo: A
> >
> > in its DESCRIPTION file.
> >
> > Also, package B has the R function ;
> >
> > g <- function(){
> >   .Call("say_hello", PACKAGE = "B")
> > }
> >
> > With this I can compile A and B, but then I get :
> >
> > $ Rscript -e "B::g()"
> > Error in dyn.load(file, DLLpath = DLLpath, ...) :
> >  unable to load shared library '/usr/local/lib/R/library/B/libs/B.so':
> >  /usr/local/lib/R/library/B/libs/B.so: undefined symbol: _ZN1AD1Ev
> > Calls: :: ... tryCatch -> tryCatchList -> tryCatchOne -> 
> >
> > If I then add a Makevars in B with this :
> >
> >
> > # find the root directory where A is installed
> > ADIR=$(shell $(R_HOME)/bin/Rscript -e "cat(system.file(package='A'))" )
> >
> > PKG_LIBS= $(ADIR)/libs/A$(DYLIB_EXT)
> >
> >
> > Then it works:
> >
> > $ Rscript -e "B::g()"
> > [1] "hello"
> >
> > So it appears that adding the -I flag, which is what LinkingTo does is
> > not enough when the package "

Re: [Rd] Large discrepancies in the same object being saved to .RData

2010-07-11 Thread Duncan Murdoch

On 11/07/2010 1:30 PM, Prof Brian Ripley wrote:


On 7/10/2010 10:10 PM, bill.venab...@csiro.au wrote:

Well, I have answered one of my questions below.  The hidden
environment is attached to the 'terms' component of v1.


Well, not really hidden.  A terms component is a formula (see 
?terms.object), and a formula has an environment just as a closure 
does.  In neither case does the print() method tell you about it -- 
but ?formula does.




I've just changed the default print method for formulas to display the 
environment if it is not globalenv(), which is the rule used for 
closures as well.  So now in R-devel:


> as.formula("y ~ x")
y ~ x

as before, but

> as.formula("y ~ x", env=new.env())
y ~ x


Duncan Murdoch
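
Whatever the print method shows, the environment a formula carries can be checked directly. A small sketch, where e, f_local, same_env and contents are names chosen here for illustration rather than anything from the R-devel change itself:

```r
# Sketch: inspect the environment attached to a formula explicitly.
e <- new.env()
f_local <- as.formula("y ~ x", env = e)     # formula bound to a non-global env

# environment() returns the environment the formula was given.
same_env <- identical(environment(f_local), e)

# Anything placed in that environment travels with the formula.
assign("junk", rnorm(10), envir = e)
contents <- ls(environment(f_local))
```

Checking ls(environment(obj)) in this way works regardless of the R version in use, so it does not depend on how print() displays the formula.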



[Rd] Saving an R program as C Code

2010-07-11 Thread Aaron J. Ferguson
Is there any way to save R procedures or packages as C code? Ideally, I want
to run a logistic regression in R but have the code available in C, Java,
or whatever. Thoughts?



[Rd] S4 class extends "data.frame", getDataPart sees "list"

2010-07-11 Thread Daniel Murphy
R-Devel:

When I get the data part of an S4 class that contains="data.frame", it gives
me a list, even when the "data.frame" is the S4 version:

> d<-data.frame(x=1:3)
> isS4(d)
[1] FALSE   # of course
> dS4<-new("data.frame",d)
> isS4(dS4)
[1] TRUE   # ok
> class(dS4)
[1] "data.frame"   # good
attr(,"package")
[1] "methods"
> setClass("A", representation(label="character"), contains="data.frame")
[1] "A"
> a<-new("A",dS4, label="myFrame")
> getDataPart(a)
[[1]]  # oh?
[1] 1 2 3

> class(a@.Data)
[1] "list"   # hmm
> names(a)
[1] "x" # sure, that makes sense
> a
Object of class "A"
  x
1 1
2 2
3 3
Slot "label":
[1] "myFrame"


Was I wrong to have expected the "data part" of 'a' to be a "data.frame"?

Thanks.

Dan Murphy
