Thank you for the explanation, Duncan - very interesting indeed!

I wonder if someone on the list might know the answer to your question
regarding the double duplication.
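
For anyone who wants to poke at the type change Duncan mentions in the quoted message below, here is a minimal sketch. It only checks the storage type with typeof() before and after each assignment, so it shows when a conversion from integer to double happens but does not explain why tracemem() reported two copies. The names y, z and w are just illustrative, and my understanding is that the number of copies tracemem() reports may differ between R versions.

```r
# Sketch: which assignments change the storage type of an integer vector?
x <- 1:1000
typeof(x)              # "integer"

y <- x
y[10] <- 3L            # 3L is an integer literal, so no conversion is needed
typeof(y)              # "integer"

z <- x
z[10] <- 3             # 3 is a double, so the whole vector becomes double
typeof(z)              # "double"

w <- x
w[10] <- NA_integer_   # integer missing value, the type is unchanged
typeof(w)              # "integer"
```

So if the vector is meant to stay integer, writing 3L instead of 3 avoids the conversion and at least the copy that conversion requires.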

Best,
Tal

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------




On Wed, Sep 1, 2010 at 6:39 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 01/09/2010 11:09 AM, Tal Galili wrote:
>
>> Hello all,
>>
>> A friend recently brought to my attention that vector assignment actually
>> recreates the entire vector on which the assignment is performed.
>>
>> So for example, the code:
>> x[10] <- NA # The original call (short version)
>>
>> Is really doing this:
>> x <- replace(x, list=10, values=NA) # The original call (long version)
>> # assigning a whole new vector to x
>>
>> Which is actually doing this:
>> x <- `[<-`(x, list=10, values=NA) # The actual call
>>
>>
>> Assuming this can be explained reasonably to the layman, my question is,
>> why is it done this way?
>>
>>
>
> Your friend misled you.  The `[<-` function is primitive.  It acts as
> though it does what you describe, but it is free to do internal
> optimizations, and in many cases it does.  The replace() function is a
> regular R-level function so it has much less freedom and is likely to be a
> lot less efficient.
>
> For example, in evaluating the expression x[10] <- NA, in most cases R
> knows that the original vector x will never be needed again, so it won't be
> duplicated.  But in evaluating
>
>
> replace(x, list=10, values=NA)
>
> R can't be sure, so it would make a duplicate copy.
>
> You can see the difference in the following code:
>
> > x <- 1:1000
> > tracemem(x)
> [1] "<0x0547a6c0>"
> > x[10] <- NA
>
> > x <- replace(x, list=10, values=NA)
> tracemem[0x0547a6c0 -> 0x0488a768]: replace
>
> Only the second version caused x to be duplicated.
>
> One example that looks as though it is doing unnecessary duplication is
> this:
>
> > x[10] <- 3
> tracemem[0x0488a768 -> 0x04881260]:
> tracemem[0x04881260 -> 0x05613368]:
>
> I can see that one duplication is necessary (x is being changed from type
> integer to type double), but why two?
>
> Duncan Murdoch
>
>
>> Why won't it just change the relevant pointer in memory?
>>
>> On small vectors it makes no difference.
>> But on big vectors this might be (so I suspect) costly (in terms of time).
>>
>>
>> I'm curious for your responses on the subject.
>>
>> Best,
>> Tal
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>


