On Thu, 24 Nov 2011, Simon Urbanek wrote:
On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
Hi,
I expected NAMED to be 1 in all three of these cases. It is for one of
them, but not for the other two?
R --vanilla
R version 2.14.0 (2011-10-31)
Platform: i386-pc-mingw32/i386 (32-bit)
x = 1L
.Internal(inspect(x)) # why NAM(2)? expected NAM(1)
@2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
y = 1:10
.Internal(inspect(y)) # NAM(1) as expected but why different to x?
@272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
z = data.frame()
.Internal(inspect(z)) # why NAM(2)? expected NAM(1)
@24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
ATTRIB:
@24fc270 02 LISTSXP g0c0 []
TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
@24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
@25be500 16 STRSXP g0c1 [] (len=1, tl=0)
@1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
It's a little difficult to search for the word "named" but I tried and
found this in R-ints :
"Note that optimizing NAMED = 1 is only effective within a primitive
(as the closure wrapper of a .Internal will set NAMED = 2 when the
promise to the argument is evaluated)"
So might it be that just looking at NAMED using .Internal(inspect()) is
setting NAMED=2? But if so, why does y have NAMED==1?
This is tricky business... I'm not quite sure I'll get it right, but
let's try.
When you are assigning a constant, the value you assign is already part
of the assignment expression, so if you want to modify it, you must
duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
accidentally "changing the value of 1". If it weren't, then you could
get bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
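The loop mentioned above can be run as a small check. This is only a sketch: the vector `results` is a name introduced here for illustration, and it records the visible behaviour (the constant is never altered), not the NAMED value itself.

```r
# Record the value of z at the start of each iteration, before any
# modification. If z[1] <- 2 altered the constant stored in the loop
# body, the second iteration would see 2 instead of 1.
results <- numeric(0)   # hypothetical helper vector, not from the thread
for (i in 1:2) {
  z <- 1
  results <- c(results, z)
  if (i == 1) z[1] <- 2  # must work on a copy, not on the constant
}
print(results)
```

Both iterations should see z equal to 1, which is the guarantee that duplicating before modification is there to provide.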
If you're assigning the result of a computation, then the object only
exists once, so
z <- 0+1 gets NAMED==1.
However, if the computation is done by returning a named value from
within a function, as in
f <- function(){v <- 1+0; v}
z <- f()
then again NAMED==2. This is because the side effects of the function
_might_ result in something having a hold on the function environment,
e.g. if we had
e <- NULL
f <- function(){e <<-environment(); v <- 1+0; v}
z <- f()
then z[1] <- 5 would change e$v too. As it happens, there aren't any
side effects in the former case, but R loses track and assumes the worst.
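A minimal sketch of the second case, using the same names as the example above. It only checks the semantics that NAMED protects: because the value is treated as shared, modifying z leaves the v held in the captured environment alone.

```r
# f() leaks its evaluation environment through e, so the object bound
# to v inside f() is still reachable after the call returns.
e <- NULL
f <- function() { e <<- environment(); v <- 1 + 0; v }
z <- f()

z[1] <- 5     # modifies a copy; the shared object is left untouched
print(e$v)    # the value seen through the captured environment
print(z)
```

If the value were modified in place instead, e$v would silently become 5 as well.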
Thanks a lot, think I follow. That explains x vs y, but why is z
NAMED==2?
The result of data.frame() is an object that exists once (similar to
1:10) so shouldn't it be NAMED==1 too? Or, R loses track and assumes the
worst even on its own functions such as data.frame()?
R loses track. I suspect that is really all it can do without actual
reference counting. The function data.frame is more than 150 lines of
code, and if any of those end up invoking user code, possibly via a class
method, you can't tell definitively whether or not the evaluation
environment dies at the return.
Ohhh, think I see now. After Duncan's reply I was going to ask if it was
possible to change data.frame() to be primitive so it could set NAMED=1.
But it seems primitive functions can't use R code so data.frame() would
need to be ported to C. Ok! - not quick or easy, and not without
considerable risk. And data.frame() can invoke user code inside it
anyway.
Maybe some review of the 'R Internals' manual about what a primitive
function is would be desirable. Converting such a function to C would
ossify it, which is the major reason it has not been done (it has been
contemplated).
Since list() is primitive I tried to construct a data.frame starting with
list() [since structure() isn't primitive], but then merely adding an
attribute seems to set NAMED==2 too ?
Yes, because attr(x,y) <- z is the same as
`*tmp*` <- x
x <- `attr<-`(`*tmp*`, y, z)
rm(`*tmp*`)
Only if it were an interpreted function.
so there are two references to the data frame: one in DF and one in
`*tmp*`. It is the first line that causes the NAMED bump. And, yes,
it's real:
`f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
x=1
f(x)=1
[1] "*tmp*" "f<-" "x"
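As a rough check of the expansion quoted above, the sketch below compares the replacement form with the three-step form written out by hand. It says nothing about NAMED; it only confirms that both routes give the same result. The attribute name "foo" and the variables x and y are chosen here for illustration.

```r
# Replacement-function syntax.
x <- 1:3
attr(x, "foo") <- "bar"

# The same operation spelled out as the three-step expansion.
y <- 1:3
`*tmp*` <- y
y <- `attr<-`(`*tmp*`, "foo", "bar")
rm(`*tmp*`)

same_result <- identical(x, y)
print(same_result)
```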
You have just explained why interpreted replacement functions set
NAMED=2, but this does not apply to primitives.
To help convince you, consider
d <- 1:2
attributes(d) <- list(x=13)
d
[1] 1 2
attr(,"x")
[1] 13
.Internal(inspect(d))
@11be748 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
ATTRIB:
@1552054 02 LISTSXP g0c0 []
TAG: @102b1c0 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
@11be768 14 REALSXP g0c1 [] (len=1, tl=0) 13
Now, as to why attr<- (which is primitive) does what it does you will
need to read (and understand) the code.
You could skip that by using the function directly (I don't think it's
recommended, though):
.Internal(inspect(l <- list(a=1)))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
@1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
@100b6e748 02 LISTSXP g0c0 []
TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
@1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
.Internal(inspect(`names<-`(l, "b")))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
@1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
@100b6e748 02 LISTSXP g0c0 []
TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
@100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
.Internal(inspect(l))
@1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
@1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
ATTRIB:
@100b6e748 02 LISTSXP g0c0 []
TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
@100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
Cheers,
Simon
DF = list(a=1:3,b=4:6)
.Internal(inspect(DF)) # so far so good: NAM(1)
@25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
@263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
@263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
@2457984 02 LISTSXP g0c0 []
TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
@1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
@1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
attr(DF,"foo") <- "bar" # just adding an attribute sets NAM(2) ?
.Internal(inspect(DF))
@25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
@263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
@263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
ATTRIB:
@2457984 02 LISTSXP g0c0 []
TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
@25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
@1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
@1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
TAG: @245732c 01 SYMSXP g0c0 [] "foo"
@25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
@2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
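For reference, a minimal sketch of the construction being attempted here: starting from list() and adding the attributes a data.frame needs by hand. It assumes the compact row-names form returned by base R's .set_row_names(), and it makes no claim about the resulting NAMED value, which depends on the R version.

```r
# Start from a plain named list, as in the example above.
DF <- list(a = 1:3, b = 4:6)

# Add the two attributes that turn it into a data.frame.
attr(DF, "row.names") <- .set_row_names(3L)  # compact form for 3 rows
class(DF) <- "data.frame"

print(DF)
```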
Matthew
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk Priv: pda...@gmail.com
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595