Re: [Rd] Strange error messages from parallel::mcparallel family under 3.6.0

2019-05-07 Thread Tomas Kalibera
Thanks, fixed in R-devel and R-patched.

The error happens in the child process when it is already exiting, after 
it had delivered a result, so this should not cause any trouble in an 
unpatched version of R (apart from showing that message). It is specific 
to mccollect(wait=FALSE).

Best
Tomas

On 5/3/19 12:47 PM, Pavel Krivitsky wrote:
> Dear All,
>
> Since upgrading to 3.6.0, I've been getting strange error messages
> from the child process when using mcparallel/mccollect. Before filing a 
> report in the Bugzilla, I want to figure out whether I had been doing 
> something wrong all this time and R 3.6.0 has exposed it, or whether 
> something else is going on.
>
> # Background #
>
> Ultimately, what I want to do is to be able to set a time limit for an
> expression to be evaluated that would be enforced even inside compiled
> code. (R.utils::withTimeout() uses base::setTimeLimit(), which can only
> enforce within R code.)
>
> # Implementation #
>
> The approach that my implementation, statnet.common::forkTimeout()
> (source attached for convenience), uses is to call mcparallel() to
> evaluate the expression in a child process, then mccollect() with
> wait=FALSE and a timeout to give it a chance to finish. If it runs past
> the timeout, the child process is killed and an onTimeout value is
> returned. (This only works on Unix-alikes, but it's better than
> nothing.)
>
> # The problem #
>
> Since 3.6.0---and I've tested fresh installs of 3.6.0 and 3.5.3 side-
> by-side---I've been getting strange messages.
>
> Running
>
>     source("forkTimeout.R") # attached
>     repeat print(forkTimeout({Sys.sleep(1);TRUE}, timeout=3))
>
> results in
>
> [1] TRUE
> [1] TRUE
> Error in mcexit(0L) : ignoring SIGPIPE signal
> [1] TRUE
> [1] TRUE
> Error in mcexit(0L) : ignoring SIGPIPE signal
> [1] TRUE
> [1] TRUE
> [1] TRUE
>
> until interrupted. Running
>
>     options(error=traceback)
>     repeat print(forkTimeout({Sys.sleep(1);TRUE}, timeout=3))
>
> results in sporadic messages of the form:
>
> Error in mcexit(0L) : ignoring SIGPIPE signal
> 6: selectChildren(jobs, timeout)
> 5: parallel::mccollect(child, wait = FALSE, timeout = timeout) at
> forkTimeout.R#75
> 4: withCallingHandlers(expr, warning = function(w)
> invokeRestart("muffleWarning"))
> 3: suppressWarnings(parallel::mccollect(child, wait = FALSE, timeout =
> timeout)) at forkTimeout.R#75
> 2: forkTimeout({
> Sys.sleep(1)
>  ...
> 1: print(forkTimeout({
> Sys.sleep(1)
>  ...
>
> So, these messages do not appear to prevent the child process from
> returning valid output, but I've never seen them before R 3.6.0, so I
> wonder if I am doing something wrong. Session info is also attached.
>
>   Thanks in advance,
>   Pavel
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel





Re: [Rd] Staged installation fail on some file systems

2019-05-07 Thread Tomas Kalibera
Thanks for the report.  According to my reading, this use of "mv" is ok,
and the renameat2() call that the invocation of "mv" leads to is also
ok and allowed by POSIX in this context. It could only fail with EEXIST
if the target directory (path/pkg) was not empty. So far I have not been
able to reproduce the failure, but we could fall back to copying, as on
Windows.
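
To illustrate the point about empty and non-empty targets: POSIX rename() may replace an existing directory only if it is empty, and otherwise fails with EEXIST or ENOTEMPTY. The following is a minimal sketch, assuming a POSIX shell and a local file system; the directory names (lib1, stage1, and so on) are made up for the example and are not what R uses.

```shell
# Throwaway sandbox; nothing outside $tmp is touched.
tmp=$(mktemp -d)

# Case 1: the target directory (lib1/pkg) exists and is empty.
mkdir -p "$tmp/lib1/pkg" "$tmp/stage1/pkg"
touch "$tmp/stage1/pkg/DESCRIPTION"
if mv -f "$tmp/stage1/pkg" "$tmp/lib1" 2>/dev/null; then
    empty_target=moved
else
    empty_target=failed
fi

# Case 2: the target directory (lib2/pkg) exists and is NOT empty.
mkdir -p "$tmp/lib2/pkg" "$tmp/stage2/pkg"
touch "$tmp/lib2/pkg/leftover" "$tmp/stage2/pkg/DESCRIPTION"
if mv -f "$tmp/stage2/pkg" "$tmp/lib2" 2>/dev/null; then
    nonempty_target=moved
else
    nonempty_target=failed
fi

echo "empty target:     $empty_target"
echo "non-empty target: $nonempty_target"
```

On an ordinary local file system one would expect the first move to succeed and the second to fail. A file system on which the first case also fails would match the reported behaviour.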


Best
Tomas


On 5/5/19 4:35 AM, Henrik Bengtsson wrote:

I'm observing that the new staged installation in R 3.6.0 can produce:

mv: cannot move
‘/home/alice/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-codetools/00new/codetools’
to ‘/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools’: File
exists
ERROR:   moving to final location failed

on some file systems.

# EXAMPLE

$ R --vanilla
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...


> install.packages("codetools", repos="https://cran.r-project.org")

Installing package into ‘/wynton/home/cbi/hb/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
trying URL 'https://cran.r-project.org/src/contrib/codetools_0.2-16.tar.gz'
Content type 'application/x-gzip' length 12996 bytes (12 KB)
==
downloaded 12 KB

* installing *source* package ‘codetools’ ...
** package ‘codetools’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
mv: cannot move
‘/home/alice/R/x86_64-pc-linux-gnu-library/3.6/00LOCK-codetools/00new/codetools’
to ‘/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools’: File
exists
ERROR:   moving to final location failed
* removing ‘/home/alice/R/x86_64-pc-linux-gnu-library/3.6/codetools’

The downloaded source packages are in
‘/scratch/alice/Rtmp6UYDzu/downloaded_packages’
Warning message:
In install.packages("codetools", repos = "https://cran.r-project.org") :
installation of package ‘codetools’ had non-zero exit status


# WORKAROUND

Disabling staged installation, for instance by setting environment
variable 'R_INSTALL_STAGED=false' avoids this problem.
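
For illustration, the variable can be limited to a single command instead of the whole session. This is only a sketch, assuming a POSIX shell; the helper name without_staging is hypothetical, and the R CMD INSTALL line is left as a comment because it needs R and the package tarball.

```shell
# Run any command with staged installation switched off for that command only.
# (without_staging is a made-up helper name, not part of R.)
without_staging() {
    env R_INSTALL_STAGED=false "$@"
}

# Intended use (requires R and the tarball, so it is not run here):
#   without_staging R CMD INSTALL codetools_0.2-16.tar.gz

# Harmless check that the variable really reaches the child process:
seen=$(without_staging sh -c 'printf %s "$R_INSTALL_STAGED"')
echo "child process saw R_INSTALL_STAGED=$seen"
```

Using env keeps the setting out of the calling shell, so later installations in the same session still use the default.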


# TROUBLESHOOTING

I think it comes down to the following call in src/library/tools/R/install.R:

   status <- system(paste("mv -f",
                          shQuote(instdir),
                          shQuote(dirname(final_instdir))))

https://github.com/wch/r-source/blob/d253331f578814f919f150ffdf1fe581618079a3/src/library/tools/R/install.R#L1645-L1647

which effectively does:

$ mkdir -p path/pkg  ## empty final destination placeholder(?)
$ mkdir -p path/to/pkg
$ mv -f path/to/pkg path

However, on one (and only one) of several systems I've tested, that
'mv' produces the error:

   mv: cannot move ‘path/to/pkg’ to ‘path/pkg’: File exists

This is on a BeeGFS parallel file system.  I cannot tell if that 'mv
-f' should work or not, or if it is even well defined.  FWIW, the
above 'mv' does indeed work if I switch to another folder that is
mounted on a different, NFS, file system, i.e. it is not kernel/OS
specific (here CentOS 7.6.1810).

If of any use, here's the 'strace' of the above 'mv':

$ strace mv -f path/to/pkg path
execve("/usr/bin/mv", ["mv", "-f", "path/to/pkg", "path"], [/* 118 vars */]) = 0
brk(NULL)   = 0xcf3000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7fde2ceb1000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/usr/lib64/openmpi/lib/tls/x86_64/libselinux.so.1",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/openmpi/lib/tls/x86_64", 0x7ffc6a3fb170) = -1 ENOENT
(No such file or directory)
open("/usr/lib64/openmpi/lib/tls/libselinux.so.1", O_RDONLY|O_CLOEXEC)
= -1 ENOENT (No such file or directory)
stat("/usr/lib64/openmpi/lib/tls", 0x7ffc6a3fb170) = -1 ENOENT (No
such file or directory)
open("/usr/lib64/openmpi/lib/x86_64/libselinux.so.1",
O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/openmpi/lib/x86_64", 0x7ffc6a3fb170) = -1 ENOENT (No
such file or directory)
open("/usr/lib64/openmpi/lib/libselinux.so.1", O_RDONLY|O_CLOEXEC) =
-1 ENOENT (No such file or directory)
stat("/usr/lib64/openmpi/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=96960, ...}) = 0
mmap(NULL, 96960, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fde2ce99000
close(3)= 0
open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320i\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=155784, ...}) = 0
mmap(NULL, 2255184, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7fde2ca6a000
mprotect(0x7fde2ca8e000, 2093056, PROT_NONE) = 0
mmap(0x7f

Re: [Rd] Staged installation fail on some file systems

2019-05-07 Thread Henrik Bengtsson
On Tue, May 7, 2019 at 9:05 AM Tomas Kalibera wrote:
>
> Thanks for the report.  According to my reading, this use of "mv" is ok
> and the renameat2() call which the invocation of "mv" leads to is also
> ok and allowed by POSIX in this context. It could only fail with EEXIST
> if the target directory (path/pkg) was not empty. So far I've not been
> able to reproduce but we could fall back to copy like on Windows.

Thanks for looking into this.  The purpose of the pre-existing target
directory (path/pkg) is to act as a directory lock in order to lower
the risk of parallel installations taking place at the same time; is
that correct?  For the same reason, you can't just do

mv path/pkg path/pkg-quickly-now
mv -f path/to/pkg path
rmdir path/pkg-quickly-now

because there's a potential race condition?

For efficiency, to avoid copying, could one do "atomic" moves one
layer down?  Something like:

mv path/to/pkg/* path/to/pkg/.* path/pkg
rmdir path/to/pkg

because, in this case, we know that path/pkg/ is empty.
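
A sketch of that "one layer down" idea, assuming a POSIX shell: every entry of the staging directory, dotfiles included, is moved into the empty target, and the emptied staging directory is then removed. The function name move_contents is hypothetical. A plain `.*` glob also matches `.` and `..`, so the sketch uses narrower patterns. Each entry is renamed individually, so the operation as a whole is not atomic.

```shell
# move_contents SRC DST
# Move every entry of SRC (including dotfiles) into the existing directory DST,
# then remove the emptied SRC. Hypothetical helper, not part of R.
move_contents() {
    src=$1
    dst=$2
    for entry in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # Patterns that match nothing stay literal; skip them.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        mv -f -- "$entry" "$dst"/ || return 1
    done
    rmdir -- "$src"
}

# Demonstration in a throwaway directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/path/pkg" "$tmp/path/to/pkg/R"
touch "$tmp/path/to/pkg/DESCRIPTION" "$tmp/path/to/pkg/.hidden"
move_contents "$tmp/path/to/pkg" "$tmp/path/pkg"
ls -A "$tmp/path/pkg"
```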

/Henrik



[Rd] openblas

2019-05-07 Thread robin hankin
Hello, macosx 10.13.6, Rdevel  r76458

I'm trying to compile against openblas to reproduce an error on the
CRAN check page (my package is clean under winbuilder and all but one
of the checks).   I've downloaded and installed openblas 0.3.7 but I
am not 100% sure that it is being used by R.

I configured with

./configure --with-blas="-lopenblas"

Then, after starting R and looking up its PID, I get:


Rd % lsof -p 17960|egrep -i blas

R   17960 rhankin  txt  REG  1,8  189224  33471762  /Users/rhankin/Rd/lib/R/lib/libRblas.dylib


But it is not clear to me how to interpret this.  Am I using openblas
as intended?  I suspect not, for I cannot reproduce the error.   Can
anyone advise?


hankin.ro...@gmail.com



Re: [Rd] openblas

2019-05-07 Thread Peter Langfelder
I'm a linux guy so take the advice with a grain of salt... When you
run the configure script, look at the output at the end of the run, it
should either say

Options enabled:   shared BLAS, ...   (means you are using R BLAS)

or it should mention OpenBLAS in External libraries (meaning you are
using OpenBLAS).

You can also see the results in config.log if you search for BLAS.

Or you could use the old trick of replacing libRblas.dylib with a link
to the appropriate OpenBLAS dylib.



Re: [Rd] openblas

2019-05-07 Thread Peter Langfelder
(CCing the R-devel list, maybe someone will have a better answer.)

To be honest, I don't know how to. I wasn't able to configure R to use
OpenBLAS using the configure script and options on my Linux Fedora system.
I configure it without external BLAS, then replace the libRblas.dylib (.so
in my case) with a link to the OpenBLAS dynamic link library.
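
The replacement step can be sketched as follows, assuming a POSIX shell. The function name link_blas and all paths are hypothetical, and the demonstration uses empty placeholder files in a temporary directory rather than a real R installation. The original library is kept under a .orig name so the change can be undone.

```shell
# link_blas R_LIB_DIR EXTERNAL_BLAS [EXT]
# Replace R_LIB_DIR/libRblas.EXT with a symlink to EXTERNAL_BLAS.
# EXT defaults to "so"; on macOS one would pass "dylib".
link_blas() {
    r_lib_dir=$1
    external=$2
    ext=${3:-so}
    target="$r_lib_dir/libRblas.$ext"
    if [ ! -e "$external" ]; then
        echo "no such library: $external" >&2
        return 1
    fi
    # Back up the original library once, unless it is already a symlink.
    if [ -e "$target" ] && [ ! -L "$target" ]; then
        mv -- "$target" "$target.orig" || return 1
    fi
    rm -f -- "$target"
    ln -s -- "$external" "$target"
}

# Demonstration with placeholder files only.
demo=$(mktemp -d)
mkdir -p "$demo/R/lib" "$demo/openblas"
: > "$demo/R/lib/libRblas.so"
: > "$demo/openblas/libopenblas.so"
link_blas "$demo/R/lib" "$demo/openblas/libopenblas.so"
ls -l "$demo/R/lib"
```

With a real installation, the first argument would be R's own lib directory and the second the installed OpenBLAS shared library.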

Peter

On Tue, May 7, 2019 at 7:39 PM robin hankin wrote:

> thanks Peter, I appreciate your advice here.
>
> ./configure --with-blas="-lopenblas"  --without-recommended-packages
>
> gives me
>
>   Interfaces supported:      X11, aqua, tcltk
>   External libraries:        readline, curl
>   Additional capabilities:   JPEG, NLS, ICU
>   Options enabled:           shared BLAS, R profiling
>
>   Capabilities skipped:      PNG, TIFF, cairo
>   Options not enabled:       memory profiling
>
>   Recommended packages:      no
>
> so it doesn't look like it's using the openBlas... how come the arguments
> to ./configure are being ignored?
>
> hankin.ro...@gmail.com
>


Re: [Rd] openblas

2019-05-07 Thread Dirk Eddelbuettel


On 7 May 2019 at 19:43, Peter Langfelder wrote:
| (CCing the R-devel list, maybe someone will have a better answer.)
| 
| To be honest, I don't know how to. I wasn't able to configure R to use
| OpenBLAS using the configure script and options on my Linux Fedora system.
| I configure it without external BLAS, then replace the libRblas.dylib (.so
| in my case) with a link to the OpenBLAS dynamic link library.

We have been doing this for nearly 20 years in Debian.  The configure call
when building R is

./configure --prefix=/usr                   \
            --with-cairo                    \
            [... stuff omitted ...]         \
            $(atlas)                        \
            $(lapack)                       \
            --enable-R-profiling            \
            --enable-R-shlib                \
            --enable-memory-profiling       \
            --without-recommended-packages  \
            --build $(buildarch)

where $(atlas) and $(lapack) these days simply are

atlas = --with-blas
[...]
lapack= --with-lapack

(and that used to be different on different architectures a long time
ago). As I recall the --enable-R-shlib is also helpful.

With that we have R using _external_ LAPACK and BLAS allowing us to switch
seamlessly between, inter alia, reference BLAS, ATLAS, OpenBLAS and MKL (see
my blog for the last one). MKL appeared to be marginally faster than OpenBLAS
at a larger installation footprint.

Full details are in this (somewhat sprawling, my bad) file:

  https://salsa.debian.org/edd/r-base/blob/master/debian/rules

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
