[Rd] RFC: diag(x, n) not preserving integer and logical x

2014-08-07 Thread Martin Maechler
This is not at all something new(*). As maintainer of the
Matrix  package, I don't like this inconsistency of base R's diag().
We have had the following -- forever, almost surely inherited
from S and S+  :

diag(x) preserves the storage mode of x  for  'complex' and
'double' precision,  but converts integer and logicals to double :

  > storage.mode(x <- 1i + 1:7); storage.mode(diag(x))
  [1] "complex"
  [1] "complex"
  > storage.mode(x <- 0 + 1:7);  storage.mode(diag(x))
  [1] "double"
  [1] "double"

  > storage.mode(x <- 1:7);  storage.mode(diag(x))
  [1] "integer"
  [1] "double"
  > storage.mode(x <- 1:7 > 3);  storage.mode(diag(x))
  [1] "logical"
  [1] "double"

and so it is actually a bit cumbersome (and a memory waste in
the case of large matrices) to create a diagonal integer or
logical matrix.
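
A workaround that needs no change to diag() can be sketched as follows.
This is only an illustration: typed_diag is a hypothetical helper name,
not a base R or Matrix function, and it relies on the replacement form
diag(m) <- x keeping the storage mode of m when x has the same type.

```r
## Hypothetical helper (not part of base R): build a diagonal matrix
## that keeps the storage mode of an integer or logical 'x'.
typed_diag <- function(x) {
    n <- length(x)
    ## vector(typeof(x), 1L) is a "zero" of the same type: 0L or FALSE
    m <- matrix(vector(typeof(x), 1L), n, n)
    diag(m) <- x   # fill the diagonal without changing the type of 'm'
    m
}

storage.mode(typed_diag(1:7))       # "integer"
storage.mode(typed_diag(1:7 > 3))   # "logical"
```

The extra allocation and assignment are the "cumbersome" part referred to
above; the proposal would make plain diag(x) behave this way directly.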

The help page does not mention the current behavior, though you
may say it alludes to the fact that logicals are treated as 0/1
implicitly (**)

If I change this behavior such that logical and integer x are
preserved,

make check-all

which includes all checks, including those of all recommended
packages (including Matrix!) successfully runs through; so at
least  base + Recommended R never relies on the current
behavior, nor should any "well programmed" R code ...

Hence my proposal, somewhat tentative for now,
to change this  diag(.) behavior.

Martin Maechler

*) and possibly something we "can not" change in R, because too
   much code may implicitly be depending on it,  but now I hope
   we can still...

**) BTW, also including the somewhat amusing case of diag(c("A","B")).

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] portableParalleSeeds Package violation, CRAN exception?

2014-08-07 Thread Michael Lawrence
I would recommend against maintaining multiple global variables and would
instead take an object-oriented approach. Probably should define a
reference class representing a random number stream (think
java.util.Random). Then define a reference class representing a collection
of them, tracking which one is current. Instantiate the collection class
and keep a reference to it inside your package namespace. You could then
define a new API that expects a random stream object as an argument, using
the active one as the default. Your wrappers would delegate to that API,
relying on the default.
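
A rough sketch of that design, under stated assumptions: the names
preserving_seed, RandomStream, StreamCollection, new_stream, .streams
and stream_runif are all hypothetical and are not the
portableParallelSeeds API. The generator state still has to pass through
.Random.seed in the workspace, since that is where R keeps it, but the
sketch puts the user's own state back after every draw and keeps all
bookkeeping in objects rather than in global variables.

```r
library(methods)

## Run 'expr', then put the caller's .Random.seed back as it was.
preserving_seed <- function(expr) {
    g <- globalenv()
    had <- exists(".Random.seed", envir = g, inherits = FALSE)
    old <- if (had) get(".Random.seed", envir = g)
    on.exit({
        if (had) {
            assign(".Random.seed", old, envir = g)
        } else if (exists(".Random.seed", envir = g, inherits = FALSE)) {
            rm(".Random.seed", envir = g)
        }
    })
    expr
}

## One random number stream, identified by its saved generator state.
RandomStream <- setRefClass("RandomStream",
    fields = list(state = "integer"),
    methods = list(
        draw = function(n, rfun = stats::runif) {
            preserving_seed({
                assign(".Random.seed", state, envir = globalenv())
                out <- rfun(n)
                ## remember where this stream stopped
                state <<- get(".Random.seed", envir = globalenv())
                out
            })
        }))

## A collection of streams that tracks which one is current.
StreamCollection <- setRefClass("StreamCollection",
    fields = list(streams = "list", current = "integer"),
    methods = list(
        active = function() streams[[current]],
        use = function(i) { current <<- as.integer(i); invisible(.self) }))

new_stream <- function(seed) {
    RandomStream$new(state = preserving_seed({
        set.seed(seed)
        get(".Random.seed", envir = globalenv())
    }))
}

## In a package this object would be created once (e.g. in .onLoad) and
## kept in the package namespace instead of in the user's workspace.
.streams <- StreamCollection$new(
    streams = list(new_stream(1), new_stream(2)), current = 1L)

## API that takes a stream argument, defaulting to the active stream.
stream_runif <- function(n, stream = .streams$active()) stream$draw(n)

a  <- stream_runif(2)   # from stream 1
.streams$use(2)
b  <- stream_runif(2)   # from stream 2
.streams$use(1)
a2 <- stream_runif(2)   # stream 1 continues where it left off
```

Whether temporary writes to .Random.seed of this kind satisfy the CRAN
policy is a separate question that the sketch does not settle.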


On Wed, Aug 6, 2014 at 11:10 AM, Paul Johnson  wrote:

> I'm writing to ask for a policy exception, or advice on how to make
> this package CRAN allowable.
>
> http://rweb.quant.ku.edu/kran/src/contrib/portableParallelSeeds_0.9.tar.gz
>
> Yesterday I tried to submit a package on CRAN and Dr Ripley pointed
> out that I had not understood the instructions about packages.  Here's
> the part where the R check gives a Note
>
> * checking R code for possible problems ... NOTE
> Found the following assignments to the global environment:
> File ‘portableParallelSeeds/R/initPortableStreams.R’:
>     assign("currentStream", n, envir = .GlobalEnv)
>     assign("currentStates", curStates, envir = .GlobalEnv)
>     assign("currentStream", 1L, envir = .GlobalEnv)
>     assign("startStates", runSeeds, envir = .GlobalEnv)
>     assign("currentStates", runSeeds, envir = .GlobalEnv)
>     assign("currentStream", as.integer(currentStream), envir = .GlobalEnv)
>     assign("startStates", runSeeds, envir = .GlobalEnv)
>     assign("currentStates", runSeeds, envir = .GlobalEnv)
>
> Altering the user's environment requires a special arrangement with
> CRAN. I believe this is justified, I'll sketch the reasons now. But,
> mostly, I'm at your mercy and if there is any way to make this
> possible, I would be very grateful.
>
> To control & replace random number streams, it really is necessary to
> alter the workspace. That's where the random generator state is
> stored.  It is acknowledged in Robert Gentleman's book, R Programming
> for Bioinformatics: "The decision to have these [random generator]
> functions manipulate a global variable, .Random.seed, is slightly
> unfortunate as it makes it somewhat more difficult to manage several
> different random number streams simultaneously" (Gentleman, 2009, p.
> 201).
>
> I have developed an understandable set of wrapper functions that handle
> this.
>
> Some of you may recall this project. I've asked about it here a couple
> of times. We allow separate streams of randoms for different purposes
> within a single R run. There is a framework to save 1000s of those
> sets in a file, so it can be used on a cluster or in a single
> workstation.  This is handy because, when 1 run in 10,000 on the
> cluster exhibits some weird behavior, we can easily re-initiate that
> interactively and see what's going on.
>
> I have a  vignette "pps" that explains. I dropped a copy of that here
> in case you don't want to get the package:
>
> http://pj.freefaculty.org/scraps/pps.pdf
>
> While working on that, I gained a considerably deeper understanding of
> random generators and seeds.  That is what this vignette is about
>
> http://pj.freefaculty.org/scraps/PRNG-basics.pdf
>
>
> We've been running simulations on our cluster with the
> portableParallelSeeds framework for 2 years and have never had any
> trouble.  We are able to re-start runs and verify random number draws
> in separate streams.
>
> PJ
> --
> Paul E. Johnson
> Professor, Political Science  Assoc. Director
> 1541 Lilac Lane, Room 504  Center for Research Methods
> University of Kansas University of Kansas
> http://pj.freefaculty.org   http://quant.ku.edu
>


[Rd] How to (appropropriately) use require in a package?

2014-08-07 Thread Joshua Wiley
Dear All,

What is the preferred way for Package A to initialize a cluster and load
Package B on all nodes?

I am writing a package that parallelizes some functions through the use of
a cluster if useRs are on a Windows machine (using parLapply and family).
I also make use of another package in some of my code, so it is necessary
to load the required packages on each slave once the cluster is started.

Right now, I do this by evaluating require(packages) on each slave;
however, Rcmd check gives a note that I should remove the "require" from my
code.

Thanks!

Josh

-- 
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518



Re: [Rd] How to (appropropriately) use require in a package?

2014-08-07 Thread Chris Green
I am not an expert here, but if it's a package, couldn't (shouldn't?) you
include Package B in one of the Depends: or Imports: lines in the
DESCRIPTION file? That would ensure Package B is automatically made
accessible whenever Package A is loaded. For example, see the Writing R
Extensions manual:

http://cran.fhcrc.org/doc/manuals/r-release/R-exts.html#Package-Dependencies


Chris Green
Ph.D. Student, Statistics
University of Washington, Seattle






Re: [Rd] How to (appropropriately) use require in a package?

2014-08-07 Thread Joshua Wiley
Someone kindly pointed out that it is not clear from my email why Depends
will not work.  A more complete example is:

PkgA:
f <- function(ncores) {
  cl <- makeCluster(ncores)

  clusterEvalQ(cl, {
    require(PkgB)
  })
  ## [other code]

  ### this is the code I want to work and need to be able to call
  ### PkgB functions on each of the cluster slaves
  output <- parLapply(cl, 1:n, function(i) {
    ## [code from my package and using some functions from PkgB]
  })
}

As far as I know, adding PkgB to the Depends (or Imports) of PkgA does
not mean that the cluster started by PkgA will automatically have PkgB
loaded and its functions available.

Thanks!



-- 
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518



Re: [Rd] How to (appropropriately) use require in a package?

2014-08-07 Thread Prof Brian Ripley
The safe, elegant way to do this is to use namespace scoping: it is 
still not at all clear why 'other code' needs PkgB *on the search path*.


In other cases seen in CRAN submissions, 'other code' has been in PkgA's 
namespace, and hence things in PkgB's exports have been visible as it 
was imported by PkgA and hence in the environment tree for functions in 
PkgA.  Then namespace scoping will ensure that PkgB's namespace is 
loaded on the cluster workers.
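
A minimal sketch of code that does not need anything attached to the
search path on the workers. It is not the exact arrangement described
above (an import in PkgA's NAMESPACE cannot be shown without building a
package); instead it uses a fully qualified pkg::fun call, which
likewise resolves through the namespace. The stats package stands in
for PkgB, and worker and f are illustrative names only.

```r
library(parallel)

## 'stats' stands in for PkgB. The call is resolved through the
## namespace, so no require() or library() call is made on the workers.
worker <- function(i) stats::median(c(i, 2 * i, 10 * i))

f <- function(ncores, n) {
    cl <- makeCluster(ncores)
    on.exit(stopCluster(cl))   # always shut the workers down
    parLapply(cl, seq_len(n), worker)
}

res <- f(2, 3)
unlist(res)
```

Inside a real package, worker would live in PkgA's namespace and PkgB
would be listed under Imports in DESCRIPTION.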



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595
