Re: [Rd] Arrays Partial unserialization

2012-09-01 Thread Bert Gunter
1. I believe this is an R-Help, not an R-devel topic. Post there.

2. It is not clear to me from your post how the arrays are stored -- as
.Rdata files or in some original tabular or database format. I believe the
answer would depend on clarifying that point -- or on others understanding
what I do not.
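
For illustration only: the post does not say how the arrays are stored, so
the sketch below assumes they can be rewritten as flat binary files of
doubles using writeBin(). With that layout, seek() and readBin() can fetch a
block of cells without reading the rest of the file. The file layout and the
helper name read_cells() are hypothetical and do not come from the original
post. This does not apply to .Rdata files, which load() restores as whole
objects.

```r
# Hypothetical setup: a 3-D array of doubles written as raw binary.
dims <- c(4L, 5L, 6L)
arr  <- array(as.numeric(seq_len(prod(dims))), dim = dims)
path <- tempfile(fileext = ".bin")
writeBin(as.vector(arr), path)  # column-major order, 8 bytes per double

# Read `n` consecutive cells, starting at the 1-based cell index `start`.
# read_cells() is an illustrative helper, not an existing R function.
read_cells <- function(path, start, n, size = 8L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  # Jump to the byte offset of the first requested cell.
  seek(con, where = (start - 1) * size, origin = "start")
  readBin(con, what = "double", n = n, size = size)
}

# Fetch cells 21..30 only; the remainder of the file is never read.
chunk <- read_cells(path, start = 21, n = 10)
```

Cells come back in R's column-major order, so a block of consecutive cells
corresponds to a run along the first dimension of the array.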

-- Bert

On Fri, Aug 31, 2012 at 6:47 AM, Damien Georges wrote:

> Hi all,
>
> I'm working with some huge arrays in R, and I need to load several of them
> to apply functions that require the values from all of the arrays for each
> cell.
>
> To make this possible, I would like to load only a part (for example, 100
> cells) of each array, apply my function, discard the loaded cells, load
> the following cells, and so on.
>
> Is it possible to unserialize (or load) only a defined part of an R array?
> Do you know of any tools that might help me?
>
> Finally, I did a lot of research into how arrays (and all other R
> objects) are serialized into binary objects, but I found nothing that really
> explains the algorithms involved. If someone has information on this topic,
> I'm interested.
>
> Hoping my request is understandable,
>
> All the best,
>
> Damien.G
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics




[Rd] Environment when NextMethod is used

2012-09-01 Thread Winston Chang
I'm running into some hard-to-understand behavior with the evaluation
environment when NextMethod is used. I'm using square-bracket indexing
into objects, and the evaluation environment of the expression inside
the square brackets seems to change depending on what kind of
comparison operators are used.

This behavior happens when the following conditions are met (this is
what I've found so far; I doubt that these are exactly the necessary
and sufficient conditions):
- I call a function from an attached package.
- The function uses square bracket indexing with a class that has its
own definition of the operator, such as `[.factor` or `[.POSIXct`. (If
a vector of numerics is used, the error doesn't happen.)
- The indexing function uses NextMethod("[").
- An S3 method is used within the square brackets. (When a regular
function is used, there's no error.)
- The S3 method is from a package that is an import for the original
function's package, but this package is not attached. (If the package
is attached, then the error doesn't happen because R finds the method
in the standard search path.)
- An operator like == is used. (If the %in% operator is used, the
error doesn't happen.)
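
As a point of comparison, here is a minimal sketch of the behavior I would
expect, using only the global environment. The class "tagged" and the
helpers subset_demo() and labels_of() are made up for illustration. This
does not reproduce the error, which needs a package with an unattached
import. It only shows that an expression written inside the square brackets
is normally evaluated in the frame where it was written, even after the
method hands off to NextMethod("[").

```r
# Hypothetical class "tagged": an integer vector carrying a "tag" attribute.
`[.tagged` <- function(x, ...) {
  out <- NextMethod("[")              # delegate to the internal default `[`
  attr(out, "tag") <- attr(x, "tag")  # restore what the default method drops
  class(out) <- oldClass(x)
  out
}

subset_demo <- function() {
  # labels_of() exists only in this function's frame.
  labels_of <- function() c("a", "b", "c", "d", "e")
  x <- structure(1:5, class = "tagged", tag = "demo")
  # The index expression becomes a promise bound to this frame, so
  # labels_of() is still found when the default method forces it.
  x["c" == labels_of()]
}

res <- subset_demo()
print(unclass(res))
```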


This may sound very abstract. I've created a sample package that
illustrates the behavior. The package is called envtest, and it has a
function called envtest(), which uses an S3 method from the nlme
package. nlme is listed as an import.

You can either clone the repository here:
  https://github.com/wch/envtest
Or you can install it with devtools, using:
  library(devtools)
  dev_mode()
  install_github('envtest', 'wch')



The envtest() function tries to index into a factor in different ways,
and prints the output for each one. This is the content of the
function (if you define it in the global environment rather than loading
it from the package, the error does not occur, because the issue
involves an import):

envtest <- function() {
  dat <- data.frame(x = 0, y = 0)
  f <- factor(c("a", "b"))

  # Print the starting data
  cat("\nf: ")
  cat(f)

  cat("\n\nTests with %in% operator ")

  # OK
  cat('\n"x" %in% Names(y ~ x, data = dat): ')
  cat("x" %in% Names(y ~ x, data = dat))

  # OK: Save boolean values to idx, then use f[idx]
  cat('\nidx <- "x" %in% Names(y ~ x, data = dat); f[idx] : ')
  cat({idx <- "x" %in% Names(y ~ x, data = dat); f[idx]})

  # OK: Use the expression with S3 function Names directly inside of []
  cat('\nf["x" %in% Names(y ~ x, data = dat)] : ')
  cat(f["x" %in% Names(y ~ x, data = dat)])


  cat("\n\nTests with == operator --")

  # OK
  cat('\n"x" == Names(y ~ x, data = dat)  : ')
  cat("x" == Names(y ~ x, data = dat))

  # OK: Save boolean values to idx, then use f[idx]
  cat('\nidx <- "x" == Names(y ~ x, data = dat); f[idx]   : ')
  cat({idx <- "x" == Names(y ~ x, data = dat); f[idx]})

  # Error: Use the expression with S3 function Names directly inside of []
  cat('\nf["x" == Names(y ~ x, data = dat)]   : ')
  cat(f["x" == Names(y ~ x, data = dat)])

  invisible()
}



This is what happens when I run the envtest() function. All the
indexing operations work, except the last one, where, inside the
square brackets, the == operator is used, and it calls the S3 method
from an imported package.

> library(envtest)
> envtest()
f: 1 2

Tests with %in% operator 
"x" %in% Names(y ~ x, data = dat): TRUE
idx <- "x" %in% Names(y ~ x, data = dat); f[idx] : 1 2
f["x" %in% Names(y ~ x, data = dat)] : 1 2

Tests with == operator --
"x" == Names(y ~ x, data = dat)  : FALSE TRUE
idx <- "x" == Names(y ~ x, data = dat); f[idx]   : 2
f["x" == Names(y ~ x, data = dat)]   : Error in Names(y ~ x, data = dat) : could not find function "Names"


With options(error=recover) set, I can inspect the environment in
which R tries to evaluate the expression Names() at the point where
it runs into the error:

Enter a frame number, or 0 to exit

1: envtest()
2: envtest.r#40: cat(f["x" == Names(y ~ x, data = dat)])
3: f["x" == Names(y ~ x, data = dat)]
4: `[.factor`(f, "x" == Names(y ~ x, data = dat))
5: NextMethod("[")
6: Names(y ~ x, data = dat)

Selection: 5

Browse[1]> environment()

Browse[1]> parent.env(environment())

Browse[1]> parent.env(parent.env(environment()))



When == is used, R tries to evaluate the expression Names() in
the environment namespace:base, and it fails because it can't find the
function. However, when %in% is used, R tries to evaluate the
expression Names() in the environment namespace:nlme, which makes
more sense to me.
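
To make the lookup path concrete, the sketch below walks the chain of
enclosing environments starting from a package namespace. It uses the stats
namespace only because it is always available; env_chain() is a helper
written for this illustration. A lookup that starts in a package namespace
passes through that package's imports before reaching namespace:base,
whereas a lookup that starts in namespace:base goes straight to the global
environment and the search path, where an unattached import is not visible.

```r
# Collect the names of all enclosing environments, starting at `env`
# and ending at the empty environment. Illustrative helper only.
env_chain <- function(env) {
  names <- character()
  repeat {
    names <- c(names, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  names
}

chain <- env_chain(asNamespace("stats"))

# First four links: the namespace, its imports, the base namespace,
# and then the global environment (followed by the search path).
print(head(chain, 4))
```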


Is this expected behavior? And if so, could someone explain why it
should be expected? I'm confused as to why the evaluation environment
should change when a certain narrow set of conditions is met.