On Mar 2, 2014, at 12:37 PM, Jens Oehlschlägel 
<jens.oehlschlae...@truecluster.com> wrote:

> Dear core group,
> 
> Which operation in R is guaranteed to return a true copy of an atomic vector, 
> not just a second symbol pointing to the same shared memory?
> 

None; there is no concept of "shared" memory at the R level. You seem to be 
mixing C-level API specifics with the R language. In the C API, duplicate() 
creates a new copy.


> y <- x[]
> #?
> 
> y <- x
> y[1] <- y[1]
> #?
> 
> Is there any function that returns its argument as a non-shared atomic but 
> only copies if the argument was shared?
> 
> Given an atomic vector x, what is the best official way to find out whether 
> other symbols share the vector RAM? Querying NAMED() < 2 doesn't work because 
> .Call sets sxpinfo_struct.named to 2. It even sets it to 2 if the argument to 
> .Call was a never-named expression!?
> 
> > named(1:3)
> [1] 2
> 

Assuming that you are talking about the C API, please consider reading about 
the concepts involved. .Call() doesn't set NAMED to 2 at all; it passes the 
object through unchanged, so it is the C code's responsibility to handle 
incoming objects according to the desired semantics (see the previous post 
here). 


> And it seems to set it permanently; pure read access can trigger 
> copy-on-modify:
> 
> > x <- integer(1e8)
> > system.time(x[1]<-1L)
>       User      System     elapsed
>          0           0           0
> > system.time(x[1]<-2L)
>       User      System     elapsed
>          0           0           0
> 
> Having called .Call now leads to an unnecessary copy on the next assignment:
> 
> > named(x)
> [1] 2
> > system.time(x[1]<-3L)
>       User      System     elapsed
>       0.14        0.07        0.20
> > system.time(x[1]<-4L)
>       User      System     elapsed
>          0           0           0
> 
> This happens not only with user-written functions doing read access:
> 
> > is.unsorted(x)
> [1] TRUE
> > system.time(x[1]<-5L)
>       User      System     elapsed
>       0.11        0.09        0.21
> 
> Why don't you simply give package authors read-access to sxpinfo_struct.named 
> in .Call (without setting it to 2)? That would give us more control and also 
> save some unnecessary copying.

Again, you're barking up the wrong tree - .Call() doesn't bump NAMED at all - 
it simply passes the object:

#include <Rinternals.h>
/* Return the NAMED value of x exactly as .Call() handed it over. */
SEXP nam(SEXP x) { return ScalarInteger(NAMED(x)); }

> .Call("nam", 1+1)
[1] 0
> x=1+1
> .Call("nam", x)
[1] 1
> y=x
> .Call("nam", x)
[1] 2
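
The 0, 1, 2 sequence above comes from the bindings created by the assignments, 
not from .Call(). As a rough illustration only, here is a standalone C sketch of 
that bookkeeping and of the copy-on-modify decision it drives. ToyVec and all 
toy_* helpers are hypothetical names invented for this sketch; they are not R's 
SEXP layout or API, and the saturating counter is an assumption that merely 
mimics the values shown in the transcript.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an R integer vector; NOT R's real SEXP layout. */
typedef struct {
    int named;      /* 0: no binding, 1: one binding, 2: possibly shared */
    size_t length;
    int *data;
} ToyVec;

/* Allocate a zero-filled vector that no symbol refers to yet (named == 0). */
ToyVec *toy_alloc(size_t n) {
    ToyVec *v = malloc(sizeof *v);
    if (v == NULL) abort();
    v->data = calloc(n > 0 ? n : 1, sizeof *v->data);
    if (v->data == NULL) abort();
    v->named = 0;
    v->length = n;
    return v;
}

void toy_free(ToyVec *v) {
    free(v->data);
    free(v);
}

/* Model `sym <- v`: each new binding raises named, saturating at 2. */
ToyVec *toy_bind(ToyVec *v) {
    if (v->named < 2) v->named++;
    return v;
}

/* Model duplicate(): a fresh, unbound copy of the payload. */
ToyVec *toy_duplicate(const ToyVec *v) {
    ToyVec *copy = toy_alloc(v->length);
    memcpy(copy->data, v->data, v->length * sizeof *v->data);
    return copy;
}

/* Model `x[i] <- value`: copy first only when x may be shared.
   Returns the vector that x is bound to afterwards. */
ToyVec *toy_assign(ToyVec *x, size_t i, int value) {
    assert(i < x->length);
    if (x->named > 1) x = toy_bind(toy_duplicate(x));
    x->data[i] = value;
    return x;
}

/* Replay the transcript (value, x = value, y = x), then modify x once. */
void toy_session(void) {
    ToyVec *v = toy_alloc(1);
    printf("fresh value: named = %d\n", v->named);
    ToyVec *x = toy_bind(v);
    printf("x = value:   named = %d\n", x->named);
    ToyVec *y = toy_bind(x);
    printf("y = x:       named = %d\n", x->named);
    ToyVec *x2 = toy_assign(x, 0, 5);   /* shared, so this copies */
    printf("modify x copied: %s\n", x2 != y ? "yes" : "no");
    toy_free(x2);
    toy_free(y);
}
```

In this toy model a vector with a single binding is modified in place, while a 
vector with two bindings is duplicated first, which leaves the other binding's 
data untouched. Nothing in the model changes the counter when a vector is 
merely read.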

Cheers,
Simon




> I guess that once R switches to reference counting, this preventive 
> increment in .Call could not be kept anyway.
> 
> Kind regards
> 
> 
> Jens Oehlschlägel
> 
> P.S. Please cc me in answers, as I am not a member of r-devel.
> 
> 
> P.P.S. function named() was tentatively defined as follows:
> 
> named <- function(x)
>  .Call("R_bit_named", x, PACKAGE="bit")
> 
> SEXP R_bit_named(SEXP x){
>  SEXP ret_;
>  PROTECT( ret_ = allocVector(INTSXP,1) );
>  INTEGER(ret_)[0] = NAMED(x);
>  UNPROTECT(1);
>  return ret_;
> }
> 
> 
> > version
>               _
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status         Under development (unstable)
> major          3
> minor          1.0
> year           2014
> month          02
> day            28
> svn rev        65091
> language       R
> version.string R Under development (unstable) (2014-02-28 r65091)
> nickname       Unsuffered Consequences
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

