Re: [Rd] Random behavior of mclapply

2018-10-22 Thread Thibault Vatter
Hi Tomas,

Thanks a lot for the explanation and the changes. The update in the
documentation is especially helpful.

Best,
Thibault




On Thu, Oct 18, 2018 at 10:48 AM Tomas Kalibera wrote:

>
> Hi Thibault,
>
> mclapply has been designed to signal an error in two ways. User code
> errors are returned as special objects (of class "try-error") in the
> respective element of the result list. All other errors (including a
> killed process) are returned as NULL in the respective elements of the
> result list. To detect these errors reliably, one needs to implement FUN
> so that it never returns NULL normally (also it cannot return a raw
> vector). This is how mclapply was designed and implemented (and also
> mccollect, etc). It may be surprising to see multiple NULL elements when
> a single process is killed, but this is expected with pre-scheduling
> when that process has been tasked to compute multiple elements.
>
> To make this API more user friendly, I've added a warning that is now
> emitted when a job does not deliver a result (that is, when a vector
> element is NULL because of such an error). I've also made it more explicit
> in the documentation that NULL signals an error.
>
> Best,
> Tomas
>
>
> On 07/26/2018 08:37 PM, Thibault Vatter wrote:
> > Hi,
> >
> > I wondered about the behavior described in the following stackoverflow
> > question:
> >
> > https://stackoverflow.com/questions/20674538/mclapply-returns-null-randomly
> >
> > More specifically, I would like to know if you ever considered the
> > suggestion made in the comments of the first answer, namely to somehow
> > warn the user if one of the processes has been killed by the
> > out-of-memory killer?
> >
> > I am always surprised to see the random NULLs without a
> > message/warning/error of any kind, and I think it would be useful to
> > know whether the function executed by mclapply returned a NULL or
> > whether the process was killed for some reason.
> >
> > In the following gist, I have an example of this (in this case
> > non-random) behavior:
> >
> > https://gist.github.com/tvatter/2fcf3a9a99c256f9b9360f596b300715
> >
> > For the record, I generate the list of NULLs in the 4th mclapply in the
> > gist above on a late 2013 MacBook Pro with macOS High Sierra, 16GB of
> > memory, and my sessionInfo() is:
> >
> > R version 3.5.0 (2018-04-23)
> > Platform: x86_64-apple-darwin16.7.0 (64-bit)
> > Running under: macOS High Sierra 10.13.6
> >
> > Matrix products: default
> > BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> > LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_3.5.0 tools_3.5.0    yaml_2.1.19
> >
> > 
> > Thibault Vatter
> > Department of Statistics
> > Columbia University
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
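
A minimal R sketch of the detection scheme Tomas describes above: wrap FUN so that it never returns NULL, then treat any NULL in the result as a job that did not deliver. The helper names safe_fun and classify are hypothetical (they are not part of the parallel package), and the example only exercises a user-code error, not a killed worker.

```r
library(parallel)

# Hypothetical helper: wrap FUN so it never returns NULL and so that
# user-code errors are captured per element instead of per worker.
safe_fun <- function(FUN) {
  function(x, ...) {
    tryCatch(
      list(ok = TRUE, value = FUN(x, ...)),
      error = function(e) list(ok = FALSE, value = conditionMessage(e))
    )
  }
}

# Hypothetical helper: label each element of an mclapply() result.
classify <- function(res) {
  vapply(res, function(r) {
    if (is.null(r)) {
      "no result"                 # NULL: the job did not deliver a result
    } else if (inherits(r, "try-error")) {
      "error"                     # error object returned by mclapply itself
    } else if (isTRUE(r$ok)) {
      "ok"
    } else {
      "error"                     # error captured by safe_fun()
    }
  }, character(1))
}

# Example function that fails for one input.
f <- function(x) if (x == 3) stop("bad input") else x^2

# Forking is not available on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- mclapply(1:4, safe_fun(f), mc.cores = cores)
print(classify(res))
```

With pre-scheduling, a worker that is killed would show up here as "no result" for every element that worker had been assigned.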



[Rd] v3 serialization of compact_intseq altrep should write modified data

2018-10-22 Thread Michael Sannella via R-devel
Experimenting with altrep objects and v3 serialization, I discovered a
possible bug.  Calling DATAPTR on a compact_intseq object returns a
pointer to the expanded integer sequence in memory.  If you modify
this data, the object values appear to be changed.  However, if the
compact_intseq object is then serialized (with version=3), only the
original integer sequence info is written.

For example, suppose I have compiled and loaded the following C code:
  SEXP set_intseq_data(SEXP x)
  {
      void* ptr = DATAPTR(x);
      ((int*)ptr)[3] = 1234;
      return R_NilValue;
  }

I see the following behavior in R 3.5.1:
  > x <- 1:10
  > x
   [1]  1  2  3  4  5  6  7  8  9 10
  > .Call("set_intseq_data", x)
  NULL
  > x
   [1]    1    2    3 1234    5    6    7    8    9   10
  > save(x, file="temp.rda", version=3)
  > load(file="temp.rda")
  > x
   [1]  1  2  3  4  5  6  7  8  9 10
  >

I would have expected the modified vector data to be serialized to the
file, and be restored when it is loaded.

  ~~ Michael Sannella



Re: [Rd] v3 serialization of compact_intseq altrep should write modified data

2018-10-22 Thread Tierney, Luke
Try this C code:

SEXP set_intseq_data(SEXP x)
{
    if (MAYBE_SHARED(x))
        error("Oops, not supposed to do this!");
    void* ptr = DATAPTR(x);
    ((int*)ptr)[3] = 1234;
    return R_NilValue;
}

Lots of things will break if you modify objects that have been marked
as immutable (and hence where MAYBE_SHARED returns TRUE).

For now the implementation of compact sequences marks them as
immutable and so assumes the expanded version will not be changed.
That implementation detail might be changed at some point but C code
should not make assumptions.
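
For contrast, a minimal sketch of the same change made at the R level, where R itself takes care of any copying or expansion before the vector is modified. This is an illustration added here, not code from the thread; the temporary file is arbitrary.

```r
# Start from the same compact integer sequence as in the report.
x <- 1:10

# Modify the 4th element with ordinary R assignment instead of writing
# through DATAPTR() from C.
x[4] <- 1234L

# Round-trip through version 3 serialization.
f <- tempfile(fileext = ".rda")
save(x, file = f, version = 3)
rm(x)
load(f)
unlink(f)

print(x)
```

Here the modified value should survive the save/load round trip, because the object being serialized is an ordinary integer vector rather than a compact sequence whose expanded data was changed behind R's back.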

Best,

luke

On Mon, 22 Oct 2018, Michael Sannella via R-devel wrote:

> Experimenting with altrep objects and v3 serialization, I discovered a
> possible bug.  Calling DATAPTR on a compact_intseq object returns a
> pointer to the expanded integer sequence in memory.  If you modify
> this data, the object values appear to be changed.  However, if the
> compact_intseq object is then serialized (with version=3), only the
> original integer sequence info is written.
>
> For example, suppose I have compiled and loaded the following C code:
>  SEXP set_intseq_data(SEXP x)
>  {
>      void* ptr = DATAPTR(x);
>      ((int*)ptr)[3] = 1234;
>      return R_NilValue;
>  }
>
> I see the following behavior in R 3.5.1:
>  > x <- 1:10
>  > x
>   [1]  1  2  3  4  5  6  7  8  9 10
>  > .Call("set_intseq_data", x)
>  NULL
>  > x
>   [1]    1    2    3 1234    5    6    7    8    9   10
>  > save(x, file="temp.rda", version=3)
>  > load(file="temp.rda")
>  > x
>   [1]  1  2  3  4  5  6  7  8  9 10
>  >
>
> I would have expected the modified vector data to be serialized to the
> file, and be restored when it is loaded.
>
>  ~~ Michael Sannella

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
