On Nov 24, 2011, at 1:48 PM, Prof Brian Ripley wrote:

> On Thu, 24 Nov 2011, Simon Urbanek wrote:
>
>>
>> On Nov 24, 2011, at 8:05 AM, Matthew Dowle wrote:
>>
>>>>
>>>> On Nov 24, 2011, at 12:34 , Matthew Dowle wrote:
>>>>
>>>>>>
>>>>>> On Nov 24, 2011, at 11:13 , Matthew Dowle wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I expected NAMED to be 1 in all these three cases. It is for one of
>>>>>>> them, but not the other two?
>>>>>>>
>>>>>>>> R --vanilla
>>>>>>> R version 2.14.0 (2011-10-31)
>>>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>>>
>>>>>>>> x = 1L
>>>>>>>> .Internal(inspect(x)) # why NAM(2)? expected NAM(1)
>>>>>>> @2514aa0 13 INTSXP g0c1 [NAM(2)] (len=1, tl=0) 1
>>>>>>>
>>>>>>>> y = 1:10
>>>>>>>> .Internal(inspect(y)) # NAM(1) as expected but why different to x?
>>>>>>> @272f788 13 INTSXP g0c4 [NAM(1)] (len=10, tl=0) 1,2,3,4,5,...
>>>>>>>
>>>>>>>> z = data.frame()
>>>>>>>> .Internal(inspect(z)) # why NAM(2)? expected NAM(1)
>>>>>>> @24fc28c 19 VECSXP g0c0 [OBJ,NAM(2),ATT] (len=0, tl=0)
>>>>>>> ATTRIB:
>>>>>>> @24fc270 02 LISTSXP g0c0 []
>>>>>>> TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>>>>>> @24fc334 16 STRSXP g0c0 [] (len=0, tl=0)
>>>>>>> TAG: @3f2040 01 SYMSXP g0c0 [MARK,gp=0x4000] "row.names"
>>>>>>> @24fc318 13 INTSXP g0c0 [] (len=0, tl=0)
>>>>>>> TAG: @3f2388 01 SYMSXP g0c0 [MARK,gp=0x4000] "class"
>>>>>>> @25be500 16 STRSXP g0c1 [] (len=1, tl=0)
>>>>>>> @1d38af0 09 CHARSXP g0c2 [MARK,gp=0x21,ATT] "data.frame"
>>>>>>>
>>>>>>> It's a little difficult to search for the word "named" but I tried and
>>>>>>> found this in R-ints:
>>>>>>>
>>>>>>> "Note that optimizing NAMED = 1 is only effective within a primitive
>>>>>>> (as the closure wrapper of a .Internal will set NAMED = 2 when the
>>>>>>> promise to the argument is evaluated)"
>>>>>>>
>>>>>>> So might it be that just looking at NAMED using .Internal(inspect())
>>>>>>> is setting NAMED=2? But if so, why does y have NAMED==1?
>>>>>>
>>>>>> This is tricky business...
>>>>>> I'm not quite sure I'll get it right, but let's try
>>>>>>
>>>>>> When you are assigning a constant, the value you assign is already part
>>>>>> of the assignment expression, so if you want to modify it, you must
>>>>>> duplicate. So NAMED==2 on z <- 1 is basically to prevent you from
>>>>>> accidentally "changing the value of 1". If it weren't, then you could
>>>>>> get bitten by code like for(i in 1:2) {z <- 1; if(i==1) z[1] <- 2}.
>>>>>>
>>>>>> If you're assigning the result of a computation, then the object only
>>>>>> exists once, so
>>>>>> z <- 0+1 gets NAMED==1.
>>>>>>
>>>>>> However, if the computation is done by returning a named value from
>>>>>> within a function, as in
>>>>>>
>>>>>>> f <- function(){v <- 1+0; v}
>>>>>>> z <- f()
>>>>>>
>>>>>> then again NAMED==2. This is because the side effects of the function
>>>>>> _might_ result in something having a hold on the function environment,
>>>>>> e.g. if we had
>>>>>>
>>>>>> e <- NULL
>>>>>> f <- function(){e <<-environment(); v <- 1+0; v}
>>>>>> z <- f()
>>>>>>
>>>>>> then z[1] <- 5 would change e$v too. As it happens, there aren't any
>>>>>> side effects in the former case, but R loses track and assumes the
>>>>>> worst.
>>>>>>
>>>>>
>>>>> Thanks a lot, think I follow. That explains x vs y, but why is z
>>>>> NAMED==2? The result of data.frame() is an object that exists once
>>>>> (similar to 1:10) so shouldn't it be NAMED==1 too? Or, R loses track
>>>>> and assumes the worst even on its own functions such as data.frame()?
>>>>
>>>> R loses track. I suspect that is really all it can do without actual
>>>> reference counting. The function data.frame is more than 150 lines of
>>>> code, and if any of those end up invoking user code, possibly via a class
>>>> method, you can't tell definitively whether or not the evaluation
>>>> environment dies at the return.
>>>
>>> Ohhh, think I see now.
>>> After Duncan's reply I was going to ask if it was
>>> possible to change data.frame() to be primitive so it could set NAMED=1.
>>> But it seems primitive functions can't use R code so data.frame() would
>>> need to be ported to C. Ok! - not quick or easy, and not without
>>> considerable risk. And, data.frame() can invoke user code inside it anyway
>>> then.
>
> Maybe some review of the 'R Internals' manual about what a primitive function
> is would be desirable. Converting such a function to C would ossify it,
> which is the major reason it has not been done (it has been contemplated).
>
>>> Since list() is primitive I tried to construct a data.frame starting with
>>> list() [since structure() isn't primitive], but then merely adding an
>>> attribute seems to set NAMED==2 too ?
>>>
>>
>> Yes, because attr(x,y) <- z is the same as
>>
>> `*tmp*` <- x
>> x <- `attr<-`(`*tmp*`, y, z)
>> rm(`*tmp*`)
>
> Only if it were an interpreted function.
>
>> so there are two references to the data frame: one in DF and one in `*tmp*`.
>> It is the first line that causes the NAMED bump. And, yes, it's real:
>>
>>> `f<-`=function(x,value) { print(ls(parent.frame())); x<-value }
>>> x=1
>>> f(x)=1
>> [1] "*tmp*" "f<-" "x"
>
> You have just explained why interpreted replacement functions set NAMED=2,
> but this does not apply to primitives.
>
It does - see eval.c l1680-2 which causes it to go through do_set which in
turn bumps NAMED. I have responded only to Luke but I guess I should have
included everyone..

> To help convince you, consider
>
>> d <- 1:2
>> attributes(d) <- list(x=13)
>> d
> [1] 1 2
> attr(,"x")
> [1] 13
>> .Internal(inspect(d))
> @11be748 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
> ATTRIB:
> @1552054 02 LISTSXP g0c0 []
> TAG: @102b1c0 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
> @11be768 14 REALSXP g0c1 [] (len=1, tl=0) 13
>
> Now, as to why attr<- (which is primitive) does what it does you will need to
> read (and understand) the code.
>

Because do_attributesgets duplicates (attrib.c l1178) which you can easily
see:

> d <- 1:2
> .Internal(inspect(d))
@155aba8 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2
> attributes(d) <- list(x=13)
> .Internal(inspect(d))
@15dbe28 13 INTSXP g0c1 [NAM(1),ATT] (len=2, tl=0) 1,2
ATTRIB:
@16da5a8 02 LISTSXP g0c0 []
TAG: @660008 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
@15dbe58 14 REALSXP g0c1 [] (len=1, tl=0) 13

Note the different pointer of the value of d now -- do_attributesgets returns
a duplicate with NAMED=0 so do_set assignment bumps it to 1.
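The three-step expansion of attr(x,y) <- z quoted above can be replayed by hand. The following is only a sketch in base R: the names DF1, DF2 and same_result are illustrative, and it deliberately does not look at NAMED, because the output of .Internal(inspect()) depends on the R version. It only checks that the hand-written expansion yields the same value as the ordinary replacement call.

```r
# Direct form: what the user types.
DF1 <- list(a = 1:3, b = 4:6)
attr(DF1, "foo") <- "bar"

# Manual expansion, following the three steps quoted in the thread.
DF2 <- list(a = 1:3, b = 4:6)
`*tmp*` <- DF2                          # second binding to the same object
DF2 <- `attr<-`(`*tmp*`, "foo", "bar")  # call the replacement function
rm(`*tmp*`)                             # drop the temporary binding

# Both routes should give the same value.
same_result <- identical(DF1, DF2)
print(same_result)
```

While `*tmp*` exists, the list is reachable under two names, which is the situation the thread says leads to the NAMED bump.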
Cheers,
Simon

>>
>> You could skip that by using the function directly (I don't think it's
>> recommended, though):
>>
>>> .Internal(inspect(l <- list(a=1)))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>> TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>> @1028c82c8 16 STRSXP g0c1 [] (len=1, tl=0)
>> @1009cd388 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>> .Internal(inspect(`names<-`(l, "b")))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>> TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>> @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>> @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>>> .Internal(inspect(l))
>> @1028c82f8 19 VECSXP g0c1 [NAM(1),ATT] (len=1, tl=0)
>> @1028c8268 14 REALSXP g0c1 [] (len=1, tl=0) 1
>> ATTRIB:
>> @100b6e748 02 LISTSXP g0c0 []
>> TAG: @100843878 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>> @1028c8178 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>> @100967af8 09 CHARSXP g0c1 [MARK,gp=0x20] "b"
>>
>> Cheers,
>> Simon
>>
>>
>>
>>>> DF = list(a=1:3,b=4:6)
>>>> .Internal(inspect(DF)) # so far so good: NAM(1)
>>> @25149e0 19 VECSXP g0c1 [NAM(1),ATT] (len=2, tl=0)
>>> @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>> @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>>> ATTRIB:
>>> @2457984 02 LISTSXP g0c0 []
>>> TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>> @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>> @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>> @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>>>
>>>> attr(DF,"foo") <- "bar" # just adding an attribute sets NAM(2) ?
>>>> .Internal(inspect(DF))
>>> @25149e0 19 VECSXP g0c1 [NAM(2),ATT] (len=2, tl=0)
>>> @263ea50 13 INTSXP g0c2 [] (len=3, tl=0) 1,2,3
>>> @263eaa0 13 INTSXP g0c2 [] (len=3, tl=0) 4,5,6
>>> ATTRIB:
>>> @2457984 02 LISTSXP g0c0 []
>>> TAG: @3f2120 01 SYMSXP g0c0 [MARK,gp=0x4000] "names"
>>> @25149c0 16 STRSXP g0c1 [] (len=2, tl=0)
>>> @1e987d8 09 CHARSXP g0c1 [MARK,gp=0x21] "a"
>>> @1e56948 09 CHARSXP g0c1 [MARK,gp=0x21] "b"
>>> TAG: @245732c 01 SYMSXP g0c0 [] "foo"
>>> @25148a0 16 STRSXP g0c1 [NAM(1)] (len=1, tl=0)
>>> @2514920 09 CHARSXP g0c1 [gp=0x20] "bar"
>>>
>>>
>>> Matthew
>>>
>>>
>>>> --
>>>> Peter Dalgaard, Professor
>>>> Center for Statistics, Copenhagen Business School
>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>> Phone: (+45)38153501
>>>> Email: pd....@cbs.dk Priv: pda...@gmail.com
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
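Peter's two scenarios from earlier in the thread (assigning a constant in a loop, and a function whose environment is captured) can be run as they stand. This is a sketch in base R only. It checks the user-visible behaviour that the NAMED bookkeeping is there to protect, not the NAMED field itself, and the name z2 is introduced here for illustration.

```r
# Case 1: assigning a constant inside a loop.
# If z[1] <- 2 modified the constant in place, the second
# iteration would see a changed "1" after `z <- 1`.
for (i in 1:2) {
  z <- 1
  if (i == 1) z[1] <- 2
}
print(z)

# Case 2: a function whose evaluation environment outlives the call.
e <- NULL
f <- function() {
  e <<- environment()  # keep a handle on the evaluation frame
  v <- 1 + 0
  v
}
z2 <- f()
z2[1] <- 5             # must not leak into the captured frame
print(e$v)
```

In both cases R has to copy before modifying, which is why it conservatively treats such values as possibly shared.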