There was nothing attached in the copy that came through to me. By the way, there was some discussion earlier this year on a light-weight data.frame class but I don't think anyone ever posted any code.
On 12/9/05, Matthew Dowle <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> Please see below for post on r-help regarding data.frame() and the
> possibility of dropping rownames, for space and time reasons.
> I've made some changes, attached, and it seems to be working well. I see the
> expected space (90% saved) and time (10 times faster) savings. There are no
> doubt some bugs, and it needs more work and testing, but I thought I would
> post first at this stage.
>
> Could some changes along these lines be made to R ? I'm happy to help with
> testing and further work if required. In the meantime I can work with
> overloaded functions which fixes the problems in my case.
>
> Functions affected :
>
>    dim.data.frame
>    format.data.frame
>    print.data.frame
>    data.frame
>    [.data.frame
>    as.matrix.data.frame
>
> Modified source code attached.
>
> Regards,
> Matthew
>
>
> -----Original Message-----
> From: Matthew Dowle
> Sent: 09 December 2005 09:44
> To: 'Peter Dalgaard'
> Cc: 'r-help@stat.math.ethz.ch'
> Subject: RE: [R] data.frame() size
>
>
> That explains it. Thanks. I don't need rownames though, as I'll only ever
> use integer subscripts. Is there any way to drop them, or even better not
> create them in the first place? The memory saved (90%) by not having them
> and 10 times speed up would be very useful. I think I need a data.frame
> rather than a matrix because I have columns of different types in real life.
>
> > rownames(d) = NULL
> Error in "dimnames<-.data.frame"(`*tmp*`, value = list(NULL, c("a", "b" :
>         invalid 'dimnames' given for data frame
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter
> Dalgaard
> Sent: 08 December 2005 18:57
> To: Matthew Dowle
> Cc: 'r-help@stat.math.ethz.ch'
> Subject: Re: [R] data.frame() size
>
>
> Matthew Dowle <[EMAIL PROTECTED]> writes:
>
> > Hi,
> >
> > In the example below why is d 10 times bigger than m, according to
> > object.size ? It also takes around 10 times as long to create, which
> > fits with object.size() being truthful. gcinfo(TRUE) also indicates a
> > great deal more garbage collector activity caused by data.frame() than
> > matrix().
> >
> > $ R --vanilla
> > ....
> > > nr = 1000000
> > > system.time(m<<-matrix(integer(1), nrow=nr, ncol=2))
> > [1] 0.22 0.01 0.23 0.00 0.00
> > > system.time(d<<-data.frame(a=integer(nr), b=integer(nr)))
> > [1] 2.81 0.20 3.01 0.00 0.00      # 10 times longer
> >
> > > dim(m)
> > [1] 1000000       2
> > > dim(d)
> > [1] 1000000       2                # same dimensions
> >
> > > storage.mode(m)
> > [1] "integer"
> > > sapply(d, storage.mode)
> >         a         b
> > "integer" "integer"                # same storage.mode
> >
> > > object.size(m)/1024^2
> > [1] 7.629616
> > > object.size(d)/1024^2
> > [1] 76.29482                       # but 10 times bigger
> >
> > > sum(sapply(d, object.size))/1024^2
> > [1] 7.629501                       # or is it ? If it's not
> > really 10 times bigger, why 10 times longer above ?
>
> Row names!!
>
>
> > r <- as.character(1:1e6)
> > object.size(r)
> [1] 72000056
> > object.size(r)/1024^2
> [1] 68.6646
>
> 'nuff said?
>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel