The difference is between indexing by row number and indexing by row name. It has long been known that names slow matrices down: some routines make a copy of a matrix's dimnames, remove the dimnames, do the computations on the bare matrix, then put the dimnames back on. This can speed things up quite a bit in some circumstances. For your example, indexing by number means jumping to a specific offset in the matrix, while indexing by name means searching through all the names and doing string comparisons to find the match. With 300,000 row names to search, a 300-fold difference in speed is not surprising.
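To make the two ideas above concrete, here is a minimal sketch, not taken from any particular routine. The helper name `without_dimnames` and the toy objects `m`, `dat`, and `keys` are made up for illustration. The first part shows the strip-and-restore pattern for dimnames. The second part translates row names to row numbers once with `match()`, so the loop itself indexes by number.

```r
# Hypothetical helper (not part of base R): run f on a copy of the matrix
# with its dimnames removed, then put the dimnames back if the shape is kept.
without_dimnames <- function(m, f) {
  dn <- dimnames(m)        # keep a copy of the names
  dimnames(m) <- NULL      # work on the bare matrix
  out <- f(m)
  if (identical(dim(out), dim(m))) dimnames(out) <- dn
  out
}

m <- matrix(1:6, nrow = 2,
            dimnames = list(c("a", "b"), c("x", "y", "z")))
res <- without_dimnames(m, function(x) x * 2)

# Look the names up once, then extract rows by position.
dat <- data.frame(v = 1:5, row.names = c("r1", "r2", "r3", "r4", "r5"))
keys <- c("r4", "r2")                  # illustrative keys
idx <- match(keys, row.names(dat))     # a single name lookup for all keys
rows <- lapply(idx, function(i) dat[i, , drop = FALSE])
```

Whether this pays off depends on how many lookups are done; for a handful of rows the plain `dat[key, ]` form is probably fine.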
-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

> -----Original Message-----
> From: Herve Pages [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 02, 2007 7:04 PM
> To: Greg Snow
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] extracting rows from a data frame by
> looping over the row names: performance issues
>
> Hi Greg,
>
> Greg Snow wrote:
> > Your 2 examples have 2 differences and they are therefore confounded
> > in their effects.
> >
> > What are your results for:
> >
> > system.time(for (i in 1:100) {row <- dat[i, ] })
> >
>
> Right. What you suggest is even faster (and more simple):
>
>   > mat <- matrix(rep(paste(letters, collapse=""), 5*300000), ncol=5)
>   > dat <- as.data.frame(mat)
>
>   > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
>      user  system elapsed
>    13.241   0.460  13.702
>
>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col) col[i]) })
>      user  system elapsed
>     0.280   0.372   0.650
>
>   > system.time(for (i in 1:100) {row <- dat[i, ] })
>      user  system elapsed
>     0.044   0.088   0.130
>
> So apparently here extracting with dat[i, ] is 300 times
> faster than extracting with dat[key, ] !
>
>   > system.time(for (i in 1:100) dat["1", ])
>      user  system elapsed
>    12.680   0.396  13.075
>
>   > system.time(for (i in 1:100) dat[1, ])
>      user  system elapsed
>     0.060   0.076   0.137
>
> Good to know!
>
> Thanks a lot,
> H.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel