On Fri, 5 Aug 2016, Winston Chang wrote:
My understanding is that R will not make copies of lists if there is
only one reference to the object. However, I've encountered a case
where R does make copies, even though (I think) there should be only
one reference to the object. I hope that someone could shed some light
on why this is happening.
I'll start with a simple example. Below, x is a list with one element,
and changing that element doesn't result in a copy. (We know this
because nothing is printed when we do the assignment after the
tracemem call.) This is as expected.
x <- list(1)
tracemem(x)
# [1] "<0x1149e08f8>"
x[[1]] <- 2
# (No output)
Similarly, modifying a list contained in a list doesn't result in a copy:
e <- list(x = list(1))
tracemem(e$x)
# [1] "<0x11b3a4b38>"
e$x[[1]] <- 2
# (No output)
However, modifying a list contained in an environment *does* result in
a copy -- tracemem prints out some info when we do the assignment:
e <- new.env(parent = emptyenv())
e$x <- list(1)
tracemem(e$x)
# [1] "<0x1148c1708>"
e$x[[1]] <- 2
# tracemem[0x1148c1708 -> 0x11b2fc1b8]:
Currently, e$x marks the value it returns as immutable if it has any
references, by setting NAMED to 2. You can see this with:
e <- new.env(parent = emptyenv())
e$x <- list(1)
.Internal(inspect(e))
@30b2498 04 ENVSXP g0c0 [NAM(1)] <0x30b2498>
ENCLOS:
  @2600e98 04 ENVSXP g0c0 [MARK,NAM(2)] <R_EmptyEnv>
HASHTAB:
  @2e41540 19 VECSXP g0c7 [] (len=29, tl=1)
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @30b3370 02 LISTSXP g0c0 []
      TAG: @2637870 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
      @3569488 19 VECSXP g0c1 [NAM(1)] (len=1, tl=0) ## <--- NAM = 1
        @35694e8 14 REALSXP g0c1 [NAM(2)] (len=1, tl=0) 1
    ...
e$x
[[1]]
[1] 1
.Internal(inspect(e))
@30b2498 04 ENVSXP g0c0 [NAM(1)] <0x30b2498>
ENCLOS:
  @2600e98 04 ENVSXP g0c0 [MARK,NAM(2)] <R_EmptyEnv>
HASHTAB:
  @2e41540 19 VECSXP g0c7 [] (len=29, tl=1)
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @25c9628 00 NILSXP g0c0 [MARK,NAM(2)]
    @30b3370 02 LISTSXP g0c0 []
      TAG: @2637870 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
      @3569488 19 VECSXP g0c1 [NAM(2)] (len=1, tl=0) ## <--- NAM = 2
        @35694e8 14 REALSXP g0c1 [NAM(2)] (len=1, tl=0) 1
    ...
It is not clear whether this is needed or is just done out of an
abundance of caution. If R is built to use reference counting to
determine sharing, this does not happen, so the behavior is likely to
change, and no longer force a copy, by 3.4.0.
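One way to check what a particular R build does is to compare the
address that tracemem reports before and after the assignment. This is
only a rough sketch: the helper name copies_on_modify is made up for
illustration, it assumes R was built with memory profiling (it returns
NA otherwise), and the result will depend on the R version and build.

```r
# Hypothetical helper: TRUE if modifying e$x leaves e$x at a different
# address (a copy was made), FALSE if it was modified in place.
copies_on_modify <- function() {
  # tracemem() needs memory profiling support; bail out if it is missing.
  if (!capabilities("profmem")) return(NA)
  e <- new.env(parent = emptyenv())
  e$x <- list(1)
  before <- tracemem(e$x)  # address of the list, as a string
  e$x[[1]] <- 2            # may print a tracemem line if a copy is made
  after <- tracemem(e$x)   # address of the list after the assignment
  untracemem(e$x)
  !identical(before, after)
}

copies_on_modify()
```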
Best,
luke
This is surprising to me. Why is a copy made in this case? It also
results in slower performance for these situations.
The most that I've been able to figure out is that it probably has
something to do with how the $ operator works with environments (but
not with lists). If you do the same operations without the $ operator,
by evaluating code in environment e, then no copy is made:
e <- new.env(parent = globalenv())
eval(quote({
x <- list(1)
tracemem(x)
x[[1]] <- 2
}), envir = e)
# (No output)
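The same approach can be tried on a list that is already stored in the
environment, by evaluating the assignment inside e with evalq instead
of going through e$x. This is just a sketch; whether it avoids the copy
when x was originally stored with `e$x <-` may depend on the R version,
so it is worth confirming with tracemem.

```r
e <- new.env(parent = globalenv())
e$x <- list(1)
# Evaluate the replacement inside e, so x is looked up as an ordinary
# variable rather than fetched through the $ operator.
evalq(x[[1]] <- 2, envir = e)
e$x[[1]]
# [1] 2
```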
I'd appreciate it if someone could shed light on this. And if it's a
bug, that would be good to know too.
-Winston
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: [email protected]
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu