[Rd] setting .libPaths() with parallel::clusterCall

2020-12-22 Thread Mark van der Loo
Dear all,

It is not possible to set library paths on worker nodes with
parallel::clusterCall (or snow::clusterCall) and I wonder if this is
intended behavior.

Example.

library(parallel)
libdir <- "./tmplib"
if (!dir.exists(libdir)) dir.create("./tmplib")

cl <- makeCluster(2)
clusterCall(cl, .libPaths, c(libdir, .libPaths()) )

The output is as expected with the extra libdir returned for each worker
node. However, running

clusterEvalQ(cl, .libPaths())

Shows that the library paths have not been set.

If this is indeed a bug, I'm happy to file it at bugzilla. Tested on R
4.0.3 and r-devel.

Best,
Mark
ps: a workaround is documented here:
https://www.markvanderloo.eu/yaRb/2020/12/17/how-to-set-library-path-on-a-parallel-r-cluster/


> sessionInfo()
R Under development (unstable) (2020-12-21 r79668)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /home/mark/projects/Rdev/R-devel/lib/libRblas.so
LAPACK: /home/mark/projects/Rdev/R-devel/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats graphics  grDevices utils datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] compiler_4.1.0

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] setting .libPaths() with parallel::clusterCall

2020-12-22 Thread luke-tierney

On Tue, 22 Dec 2020, Mark van der Loo wrote:


Dear all,

It is not possible to set library paths on worker nodes with
parallel::clusterCall (or snow::clusterCall) and I wonder if this is
intended behavior.

Example.

library(parallel)
libdir <- "./tmplib"
if (!dir.exists(libdir)) dir.create("./tmplib")

cl <- makeCluster(2)
clusterCall(cl, .libPaths, c(libdir, .libPaths()) )

The output is as expected with the extra libdir returned for each worker
node. However, running

clusterEvalQ(cl, .libPaths())

Shows that the library paths have not been set.


Use this:

clusterCall(cl, ".libPaths", c(libdir, .libPaths()) )

This will find the function .libPaths on the workers.

Your clusterCall sends across a serialized copy of your process'
.libPaths and calls that. Usually that is equivalent to calling the
function found by the name you used on the workers, but not when the
function has an enclosing environment that the function modifies by
assignment.

Alternate implementations of .libPaths that are more
serialization-friendly are possible in principle but probably not
practical given limitations of the base package.

The distinction between providing a function value or a character
string as the function argument to clusterCall and others could
probably use a paragraph in the help file; happy to consider a patch
if anyone wants to take a crack at it.

Best,

luke



If this is indeed a bug, I'm happy to file it at bugzilla. Tested on R
4.0.3 and r-devel.

Best,
Mark
ps: a workaround is documented here:
https://www.markvanderloo.eu/yaRb/2020/12/17/how-to-set-library-path-on-a-parallel-r-cluster/



sessionInfo()

R Under development (unstable) (2020-12-21 r79668)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /home/mark/projects/Rdev/R-devel/lib/libRblas.so
LAPACK: /home/mark/projects/Rdev/R-devel/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
[3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
[9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats graphics  grDevices utils datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] compiler_4.1.0

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] brief update on the pipe operator in R-devel

2020-12-22 Thread luke-tierney

It turns out that allowing a bare function expression on the
right-hand side (RHS) of a pipe creates opportunities for confusion
and mistakes that are too risky. So we will be dropping support for
this from the pipe operator.

The case of a RHS call that wants to receive the LHS result in an
argument other than the first can be handled with just implicit first
argument passing along the lines of

mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()

It was hoped that allowing a bare function expression would make this
more convenient, but it has issues as outlined below. We are exploring
some alternatives, and will hopefully settle on one soon after the
holidays.

The basic problem, pointed out in a comment on Twitter, is that in
expressions of the form

1 |> \(x) x + 1 -> y
1 |> \(x) x + 1 |> \(y) x + y

everything after the \(x) is parsed as part of the body of the
function.  So these are parsed along the lines of

1 |> \(x) { x + 1 -> y }
1 |> \(x) { x + 1 |> \(y) x + y }

In the first case the result is assigned to a (useless) local
variable.  Someone writing this is more likely to have intended to
assign the result to a global variable, as this would:

(1 |> \(x) x + 1) -> y

In the second case the 'x' in 'x + y' refers to the local variable 'x'
in the first RHS function. Someone writing this is more likely to have
meant

(1 |> \(x) x + 1) |> \(y) x + y

with 'x' in 'x + y' now referring to a global variable:

> x <- 2
> 1 |> \(x) x + 1 |> \(y) x + y
[1] 3
> (1 |> \(x) x + 1) |> \(y) x + y
[1] 4

These issues arise with any approach in R that allows a bare function
expression on the RHS of a pipe operation. It also arises in other
languages with pipe operators. For example, here is the last example
in Julia:

julia> x = 2
2
julia> 1 |> x -> x + 1 |> y -> x + y
3
julia> ( 1 |> x -> x + 1 ) |> y -> x + y
4

Even though proper use of parentheses can work around these issues,
the likelihood of making mistakes that are hard to track down is too
high. So we will disallow the use of bare function expressions on the
right hand side of a pipe.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] From .Fortran to .Call?

2020-12-22 Thread Balasubramanian Narasimhan
To deal with such Fortran issues in several packages I deal with, I 
wrote the SUtools package (https://github.com/bnaras/SUtools) that you 
can try.  The current version generates the registration assuming 
implicit Fortran naming conventions though. (I've been meaning to 
upgrade it to use the gfortran -fc-prototypes-external flag which should 
be easy; I might just finish that during these holidays.)


There's a vignette as well: 
https://bnaras.github.io/SUtools/articles/SUtools.html


-Naras


On 12/19/20 9:53 AM, Ivan Krylov wrote:

On Sat, 19 Dec 2020 17:04:59 +
"Koenker, Roger W"  wrote:


There are comments in various places, including R-extensions §5.4
suggesting that .Fortran is (nearly) deprecated and hinting that use
of .Call is more efficient and now preferred for packages.

My understanding of §5.4 and 5.5 is that explicit routine registration
is what's important for efficiency, and your package already does that
(i.e. calls R_registerRoutines()). The only two things left to add
would be types (REALSXP/INTSXP/...) and styles (R_ARG_IN,
R_ARG_OUT/...) of the arguments of each subroutine.

Switching to .Call makes sense if you want to change the interface of
your native subroutines to accept arbitrary heavily structured SEXPs
(and switching to .External could be useful if you wanted to play with
evaluation of the arguments).



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] From .Fortran to .Call?

2020-12-22 Thread Avraham Adler
Hello, Ivan.

I used .Call instead of .Fortran in the Delaporte package [1]. What
helped me out a lot was Drew Schmidt's Romp examples and descriptions
[2]. If you are more comfortable with the older Fortran interface,
Tomasz Kalinowski has a package which uses Fortran 2018 more
efficiently [3]. I haven't tried this last in practice, however.

Hope that helps,

Avi

[1] https://CRAN.R-project.org/package=Delaporte
[2] https://github.com/wrathematics/Romp
[3] https://github.com/t-kalinowski/RFI

Tomasz Kalinowski



On Tue, Dec 22, 2020 at 7:24 PM Balasubramanian Narasimhan
 wrote:
>
> To deal with such Fortran issues in several packages I deal with, I
> wrote the SUtools package (https://github.com/bnaras/SUtools) that you
> can try.  The current version generates the registration assuming
> implicit Fortran naming conventions though. (I've been meaning to
> upgrade it to use the gfortran -fc-prototypes-external flag which should
> be easy; I might just finish that during these holidays.)
>
> There's a vignette as well:
> https://bnaras.github.io/SUtools/articles/SUtools.html
>
> -Naras
>
>
> On 12/19/20 9:53 AM, Ivan Krylov wrote:
> > On Sat, 19 Dec 2020 17:04:59 +
> > "Koenker, Roger W"  wrote:
> >
> >> There are comments in various places, including R-extensions §5.4
> >> suggesting that .Fortran is (nearly) deprecated and hinting that use
> >> of .Call is more efficient and now preferred for packages.
> > My understanding of §5.4 and 5.5 is that explicit routine registration
> > is what's important for efficiency, and your package already does that
> > (i.e. calls R_registerRoutines()). The only two things left to add
> > would be types (REALSXP/INTSXP/...) and styles (R_ARG_IN,
> > R_ARG_OUT/...) of the arguments of each subroutine.
> >
> > Switching to .Call makes sense if you want to change the interface of
> > your native subroutines to accept arbitrary heavily structured SEXPs
> > (and switching to .External could be useful if you wanted to play with
> > evaluation of the arguments).
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel