[Rd] Rscript fails with some packages (for example, h5)

2017-12-26 Thread Sun Yijiang

Consider this script (with h5 installed):

$ cat test.R
library(h5)
name <- tempfile()
f <- h5file(name)
file.remove(name)

$ Rscript test.R
Error in initialize(value, ...) :
  cannot use object of class "character" in new():  class "H5File" does not
extend that class
Calls: h5file -> new -> initialize -> initialize
Execution halted

$ /usr/lib64/R/bin/R --slave --no-restore --file=test.R
[1] TRUE

$ R_DEFAULT_PACKAGES= Rscript test.R
[1] TRUE

$ R
> source('test.R')
>

$ R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' R
> source('test.R')
Error in initialize(value, ...) :
  cannot use object of class "character" in new():  class "H5File" does not
extend that class
>

After looking into the C source code, I found that Rscript by default sets
the environment variable R_DEFAULT_PACKAGES to
"datasets,utils,grDevices,graphics,stats", and this somehow breaks some
packages, such as h5.

The problem here is that not setting R_DEFAULT_PACKAGES is equivalent to
setting it to a magic value, which is really confusing.  I suggest removing
this feature.
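
To make the "unset means magic value" behaviour concrete, here is a minimal C sketch of the pattern described above. It is not Rscript's actual source: the helper name packages_seen_by_r is hypothetical, and the sketch assumes the default is filled in only when the variable is unset, which would match the observation that an empty `R_DEFAULT_PACKAGES=` is left alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* The list quoted in this message; it does not contain "methods". */
#define RSCRIPT_DEFAULT_PACKAGES "datasets,utils,grDevices,graphics,stats"

/* Hypothetical helper (not Rscript's code): returns the value of
 * R_DEFAULT_PACKAGES that a launched R process would inherit. */
const char *packages_seen_by_r(void)
{
    /* overwrite = 0: a value that is already set, even an empty one,
     * is kept; only a truly unset variable receives the default. */
    setenv("R_DEFAULT_PACKAGES", RSCRIPT_DEFAULT_PACKAGES, 0);
    return getenv("R_DEFAULT_PACKAGES");
}
```

Under that assumption, an unset variable yields the list without methods, while a variable that is set but empty is passed through unchanged.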

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Rscript fails with some packages (for example, h5)

2017-12-26 Thread Sun Yijiang

Hi Dirk,

Thanks for the solution.  Now I know the workarounds, but I still don't
quite get it.  Why does R_DEFAULT_PACKAGES have anything to do with
library(methods)?  If library(h5) works, it should just work and not depend
on an environment variable.  Rscript is not consistent with R; that's my
confusion.

Steve

2017-12-26 20:46 GMT+08:00 Dirk Eddelbuettel :

>
> On 26 December 2017 at 15:24, Sun Yijiang wrote:
> | After looking into the C source code, I found that Rscript by default sets
> | the environment variable R_DEFAULT_PACKAGES to
> | "datasets,utils,grDevices,graphics,stats", and this somehow breaks some
> | packages, such as h5.
> |
> | The problem here is that not setting R_DEFAULT_PACKAGES is equivalent to
> | setting it to a magic value, which is really confusing.  I suggest
> | removing this feature.
>
> The more confusing part is that "methods" is missing 'by design' (as loading
> methods is marginally more expensive than other packages). I.e., for your
> script
>
>     edd@bud:/tmp$ cat h5ex.R
>     library(methods)
>     library(h5)
>     name <- tempfile()
>     f <- h5file(name)
>     file.remove(name)
>     edd@bud:/tmp$ Rscript h5ex.R
>     [1] TRUE
>     edd@bud:/tmp$
>
> it all works if you just add `library(methods)` as seen in the first line.
>
> For what it is worth, littler's r does not need that as it loads methods just
> like R itself does, avoiding the confusion:
>
>     edd@bud:/tmp$ cat h5ex2.R
>     library(h5)
>     name <- tempfile()
>     f <- h5file(name)
>     file.remove(name)
>     edd@bud:/tmp$ r h5ex2.R
>     edd@bud:/tmp$
>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>


Re: [Rd] Rscript fails with some packages (for example, h5)

2017-12-27 Thread Sun Yijiang

Thanks for the details. I’m new to R, and I’m not blaming anyone here;
I’m just still not clear what good it does to keep this inconsistency
between R and Rscript. To me (and probably to many others coming from
Perl, Python, etc.), this is shockingly weird. I can live with it, but I
would also like to know why.

Steve
On Wed, 27 Dec 2017 at 21:15, Dirk Eddelbuettel wrote:

>
> Duncan,
>
> Very nice tutorial. However it does NOT take away from the fact that
> _very simple_ scripts (like the one posted by Sun at the beginning of this
> thread) simply _fail_ in error under Rscript.
>
> Whereas they don't under R or r.
>
> The R environment ships an interpreter meant for command-line and scripting
> use which fails on simple scripts that happen to use S4. But I am tired of
> arguing for reversing this as I have gotten nowhere in all those years.
>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>


[Rd] Race condition on parallel package's mcexit and rmChild

2019-05-20 Thread Sun Yijiang

I've been hacking with the parallel package for some time and built a
parallel processing framework with it.  However, although very rarely,
I did notice an "ignoring SIGPIPE signal" error every now and then.
After a deep dig into the source code, I think I found something worth
noting.

In short, writing to the pipe in the C function mc_exit(SEXP sRes) may cause
a SIGPIPE.  Code from src/library/parallel/src/fork.c:

SEXP NORET mc_exit(SEXP sRes)
{
    int res = asInteger(sRes);
... ...
    if (master_fd != -1) { /* send 0 to signify that we're leaving */
        size_t len = 0;
        /* assign result for Fedora security settings */
        ssize_t n = write(master_fd, &len, sizeof(len));
... ...
}

So a pipe write is made in mc_exit, and here's how this function is
used in src/library/parallel/R/unix/mcfork.R:

mcexit <- function(exit.code = 0L, send = NULL)
{
    if (!is.null(send)) try(sendMaster(send), silent = TRUE)
    .Call(C_mc_exit, as.integer(exit.code))
}

Between the sendMaster() and mc_exit() calls, both of which are made in the
child process, the master process may call readChild() followed by
rmChild().  rmChild closes the pipe on the master side, and if it is
called before the child calls mc_exit, a SIGPIPE will be raised when the
child tries to write to the pipe in mc_exit.
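
The underlying POSIX behaviour can be reproduced outside R. The following is a minimal sketch, not code from the parallel package: the helper name write_after_reader_closed is hypothetical, and both pipe ends live in one process purely for illustration. It ignores SIGPIPE so that the failed write reports EPIPE instead of terminating the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: performs the same kind of "send 0" write as
 * mc_exit, but on a pipe whose read end has already been closed.
 * Returns the errno of the failed write, 0 if the write succeeded,
 * or -1 if the pipe could not be created. */
int write_after_reader_closed(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* With the default disposition, SIGPIPE would kill the process;
     * ignoring it turns the failure into an EPIPE error code. */
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);
    assert(old != SIG_ERR);

    close(fds[0]);                 /* reader side goes away first */

    size_t len = 0;
    ssize_t n = write(fds[1], &len, sizeof(len));  /* late write */
    int err = (n == -1) ? errno : 0;

    close(fds[1]);
    signal(SIGPIPE, old);          /* restore previous disposition */
    return err;
}
```

On a POSIX system the helper is expected to return EPIPE, which is the same condition that surfaces as a SIGPIPE when the signal is not ignored.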

rmChild is defined but not used in the parallel package, so this problem
won't surface in most cases.  However, it is a useful API and may be
used by users like me for advanced control over child processes.  I
hope we can discuss a solution.

In fact, I don't see why we need to write to the pipe on child exit
and how it has anything to do with "Fedora security settings" as
suggested in the comments.  Removing it, IMHO, would be a good and
clean way to solve this problem.

Regards,
Yijiang


Re: [Rd] Race condition on parallel package's mcexit and rmChild

2019-05-20 Thread Sun Yijiang

I have read the latest code, but I still don't understand why mc_exit
needs to write zero on exit.  If a child closes its end of the pipe, the
parent will find out on the next select.
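
To illustrate what I mean, here is a small C sketch of the general POSIX behaviour, not code from the parallel package. The helper name reader_sees_eof_after_close is hypothetical. Once the last writer closes its end, select() reports the read end as readable and read() returns 0 (end of file), without any explicit message having been sent.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the reader detects end-of-file
 * through select() + read() after the only writer has closed its end,
 * 0 if it does not, and -1 on error. */
int reader_sees_eof_after_close(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    close(fds[1]);  /* the "child" closes its end without writing */

    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fds[0], &rfds);
    struct timeval tv = { 1, 0 };  /* wait at most one second */

    int ready = select(fds[0] + 1, &rfds, NULL, NULL, &tv);
    int result = 0;
    if (ready < 0) {
        result = -1;
    } else if (ready == 1 && FD_ISSET(fds[0], &rfds)) {
        char buf[8];
        /* A readable pipe with no writers left yields read() == 0. */
        result = (read(fds[0], buf, sizeof(buf)) == 0);
    }

    close(fds[0]);
    return result;
}
```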

Best,
Yijiang

On Mon, 20 May 2019 at 10:52 PM, Tomas Kalibera wrote:
>
> This issue has already been addressed in 76462 (R-devel) and also ported
> to R-patched. In fact rmChild() is used in mccollect(wait=FALSE).
>
> Best
> Tomas


[Rd] Infrequent but steady NULL-pointer caused segfault in as.POSIXlt.POSIXct (R 3.4.4)

2019-08-02 Thread Sun Yijiang

The R script I run daily for hours looks like this:

while (!finish) {
    Sys.sleep(0.1)
    time = as.integer(format(Sys.time(), "%H%M")) # always crashes here
    if (new.data.timestamp() <= time)
        next
    # ... do some jobs for about 2 minutes ...
    gc()
}

Basically, it waits for new data, which comes in every 10 minutes,
does some jobs, calls gc(), and then loops again.  It works great most of
the time, but crashes strangely once a month or so.  Although infrequent,
it always crashes at the same place and gives the same error info,
like this:

 *** caught segfault ***
address (nil), cause 'memory not mapped'

Traceback:
 1: as.POSIXlt.POSIXct(x, tz)
 2: as.POSIXlt(x, tz)
 3: format.POSIXlt(as.POSIXlt(x, tz), format, usetz, ...)
 4: structure(format.POSIXlt(as.POSIXlt(x, tz), format, usetz, ...),
  names = names(x))
 5: format.POSIXct(Sys.time(), format = "%H%M")
 6: format(Sys.time(), format = "%H%M")
 7: format(Sys.time(), format = "%H%M")
… …

I looked into the dumped core with gdb, and found something very strange:

gdb /usr/lib64/R/bin/exec/R ~/core.30387
(gdb) bt 5
#0  0x7f1dca844ff1 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1  0x7f1dcb20e8f9 in Rf_mkChar (name=0x0) at envir.c:3725
#2  0x7f1dcb1dc225 in do_asPOSIXlt (call=<optimized out>,
    op=<optimized out>, args=<optimized out>,
    env=<optimized out>) at datetime.c:705
#3  0x7f1dcb22197f in bcEval (body=body@entry=0x4064b28,
    rho=rho@entry=0xc449d38, useCache=useCache@entry=TRUE)
    at eval.c:6473
#4  0x7f1dcb230370 in Rf_eval (e=0x4064b28,
    rho=rho@entry=0xc449d38) at eval.c:624
(More stack frames follow…)

Tracing into src/main/datetime.c:705, it’s a simple string-making call:
SET_STRING_ELT(tzone, 1, mkChar(R_tzname[0]));

mkChar function is defined in envir.c:3725:
3723  SEXP mkChar(const char *name)
3724  {
3725  size_t len =  strlen(name);
… …

gdb shows that the string pointer (name=0x0) mkChar received is NULL,
and subsequently strlen(NULL) caused the segfault.  But quite
contradictorily, gdb shows the value passed to mkChar in the caller is
valid:

(gdb) frame 2
#2  0x7f1dcb1dc225 in do_asPOSIXlt (call=<optimized out>,
    op=<optimized out>, args=<optimized out>,
    env=<optimized out>) at datetime.c:705
705 datetime.c: No such file or directory.
(gdb) p tzname[0]
$1 = 0x4cf39c0 "CST"

R_tzname is an alias of tzname. (#define R_tzname tzname in the same file.)

At first, I suspected that some library might have messed up the memory
and accidentally zeroed tzname (a global variable).  But this gdb
trace shows that tzname is good, and only the pointer passed to
mkChar magically changed to zero.  Like this:

mkChar(tzname[0])  // tzname[0] is "CST", address 0x4cf39c0
… …
SEXP mkChar(const char *name)  // name should be 0x4cf39c0, but gdb shows 0x0
{
    size_t len =  strlen(name);  // segfault, as name is NULL
… …

The only theory I can think of so far is that, on the call to mkChar, the
parameter passed on the stack somehow got zeroed by some buggy
code in R or in a library.  At a higher level, what I see is this:  if you
run format(Sys.time(), "%H%M") a million times a day (together with
other code, of course), once a month or so this simple line can
segfault.
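
While hunting for the cause, one way to turn the crash into something that can be logged is to check the pointer before strlen sees it. The following C sketch is only a diagnostic idea, not R's code and not a proposed fix: the helper name checked_strlen and its was_null flag are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical diagnostic wrapper: behaves like strlen() for a valid
 * string, but reports a NULL pointer through *was_null and returns 0
 * instead of dereferencing it. */
size_t checked_strlen(const char *name, int *was_null)
{
    if (name == NULL) {
        if (was_null != NULL)
            *was_null = 1;   /* caller can log this rare event */
        return 0;
    }
    if (was_null != NULL)
        *was_null = 0;
    return strlen(name);
}
```

A caller could record the time and context whenever the flag is set, which might help show whether the NULL really comes from tzname[0] at that moment.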

I’m lost in this confusion.  Could someone please help me find the
right direction to look into this problem further?

Regards,
Steve


Re: [Rd] Infrequent but steady NULL-pointer caused segfault in as.POSIXlt.POSIXct (R 3.4.4)

2019-08-04 Thread Sun Yijiang

A reply on Stack Overflow suggests I might have hit this bug:

https://sourceware.org/bugzilla/show_bug.cgi?id=14023

I can confirm that this glibc bug affects my system (latest CentOS 7).
However, as far as I know, R is not multithreaded in its core.  Is it
possible that some library triggered this?

Regards,
Steve

On Fri, 2 Aug 2019 at 4:59 PM, Tomas Kalibera wrote:
>
> In an optimized build, debug info is just an approximation. It might
> help to debug in a build of R and packages without compiler
> optimizations (-O0), where the debug information is accurate. However,
> first I would try to modify the example to trigger more often, or try to
> find external ways to make it trigger more often (e.g. via gctorture).
> Then I would try to make the example smaller (not call gc() explicitly,
> not call any external code - e.g. the jobs, etc) - any time the example
> is reduced but still triggers the errors, the reasoning is made easier.
> Once you have a repeatable situation in a build with reliable debug
> symbols, debugging is easier too, e.g. sometimes a watchpoint helps to
> find memory corruption. Please feel free to ask more when you have more
> information/updates. If this ends up being a bug in R, please report
> (and with a reproducible example, if it is not obvious from the source
> code).
>
> Best
> Tomas
