Thank you for the explanation, Duncan - very interesting indeed! I wonder if someone on the list might know the answer to your question about the double duplication.
Best,
Tal

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Wed, Sep 1, 2010 at 6:39 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 01/09/2010 11:09 AM, Tal Galili wrote:
>
>> Hello all,
>>
>> A friend recently brought to my attention that vector assignment actually
>> recreates the entire vector on which the assignment is performed.
>>
>> So for example, the code:
>> x[10]<- NA # The original call (short version)
>>
>> Is really doing this:
>> x<- replace(x, list=10, values=NA) # The original call (long version)
>> # assigning a whole new vector to x
>>
>> Which is actually doing this:
>> x<- `[<-`(x, list=10, values=NA) # The actual call
>>
>>
>> Assuming this can be explained reasonably to the lay man, my question is,
>> why is it done this way ?
>>
>
> Your friend misled you. The `[<-` function is primitive. It acts as
> though it does what you describe, but it is free to do internal
> optimizations, and in many cases it does. The replace() function is a
> regular R-level function so it has much less freedom and is likely to be a
> lot less efficient.
>
> For example, in evaluating the expression x[10] <- NA, in most cases R
> knows that the original vector x will never be needed again, so it won't be
> duplicated. But in evaluating
>
> > replace(x, list=10, values=NA)
>
> R can't be sure, so it would make a duplicate copy.
>
> You can see the difference in the following code:
>
> > x <- 1:1000
> > tracemem(x)
> [1] "<0x0547a6c0>"
> > x[10] <- NA
>
> > x <- replace(x, list=10, values=NA)
> tracemem[0x0547a6c0 -> 0x0488a768]: replace
>
> Only the second version caused x to be duplicated.
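
For anyone who wants to rerun the comparison quoted above, here is a minimal sketch. It only checks that the two forms give the same result and leave the original vector untouched. The tracemem() part is guarded because it needs an R build with memory profiling, and what it prints depends on the R version, so no expected output is shown. The names orig, a and b are illustrative and not from the thread.

```r
# Sketch only: variable names (orig, a, b) are illustrative, not from the thread.
orig <- 1:1000

# Form 1: subassignment on a second name bound to the same vector.
a <- orig
a[10] <- NA

# Form 2: replace(), an ordinary R-level function that returns a modified copy.
b <- replace(orig, list = 10, values = NA)

# Both forms produce the same vector, and `orig` is untouched either way.
stopifnot(identical(a, b))
stopifnot(is.na(a[10]), identical(orig, 1:1000))

# Optional: watch for duplications. tracemem() needs memory profiling support,
# and the messages it prints vary between R versions.
if (capabilities("profmem")) {
  x <- 1:1000
  tracemem(x)
  x[10] <- NA                               # may or may not report a copy
  x <- replace(x, list = 10, values = NA)   # may report a copy made in replace()
  untracemem(x)
}
```

The point of the first half is that the user-visible semantics are the same in both forms; any difference is in how many copies R makes internally, which is what the tracemem() half lets you observe on your own build.
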
>
> One example that looks as though it is doing unnecessary duplication is
> this:
>
> > x[10] <- 3
> tracemem[0x0488a768 -> 0x04881260]:
> tracemem[0x04881260 -> 0x05613368]:
>
> I can see that one duplication is necessary (x is being changed from type
> integer to type double), but why two?
>
> Duncan Murdoch
>
>> Why won't it just change the relevant pointer in memory?
>>
>> On small vectors it makes no difference.
>> But on big vectors this might be (so I suspect) costly (in terms of time).
>>
>> I'm curious for your responses on the subject.
>>
>> Best,
>> Tal

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.