Everything was run on the same machine, in independent sessions. I did not restart Windows. I also tried 32-bit R 3.0.2, and it seemed slightly faster than the 64-bit build. Watching with Process Explorer v15.23 (http://technet.microsoft.com/de-de/sysinternals/bb896653), my impression was that R 3.0.2 manages memory differently from R 2.15.2: when sourcing a big file, the physical memory used grows steadily in R 2.15.2, whereas in R 3.0.2 it goes through cycles of growth and shrinking.
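
The memory impression above could be cross-checked from inside R. The following is only a minimal sketch: `peak_mem_source` is a hypothetical helper name, not something from the original test, and `gc()` reports R's own heap rather than the physical memory of the process that Process Explorer shows, so the two numbers need not agree.

```r
# Hypothetical helper: time source() on one file and report the peak
# size of R's heap while it ran.
peak_mem_source <- function(path) {
    invisible(gc(reset = TRUE))      # reset the "max used" counters
    timing <- system.time(
        source(path, local = new.env(), keep.source = FALSE)
    )
    mem <- gc()                      # matrix with rows Ncells and Vcells
    # The final column of gc() output is "max used" in Mb.
    list(timing = timing, peak_mb = sum(mem[, ncol(mem)]))
}

# Usage on a small throwaway dump file.
tmp <- tempfile()
df0 <- data.frame(x = c('a', 'b', 'a'))
dump('df0', file = tmp)
res <- peak_mem_source(tmp)
print(res$timing)
cat('peak heap (Mb):', res$peak_mb, '\n')
```

Running this in both R versions on the same dump file would give one peak figure per version, which is coarser than watching the growth and shrinking over time but easy to compare.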

best,
Heinz

On 30.10.2013 13:28, Carl Witthoft wrote:
Did you run the identical code on the identical machine, and did you verify
there were no other tasks running which might have limited the RAM available
to R?  And equally important, did you run these tests in the reverse order
(in case R was storing large objects from the first run, thus chewing up
RAM)?
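
One way to address the ordering question within a single session is sketched below. It is an illustration under assumptions, not the test that was actually run: `time_one` is a hypothetical helper, the word list is shortened, and the sizes stop at 10^4 so that it finishes quickly. It removes the original object and collects garbage before each timing, then compares ascending and descending order.

```r
sample.vec <- c('source', 'causes', 'R', 'to', 'accept')

# Hypothetical helper: time source() for one size after cleaning up.
time_one <- function(n, file = tempfile()) {
    df0 <- data.frame(x = sample(sample.vec, n, replace = TRUE))
    dump('df0', file = file)
    rm(df0)              # drop the original so it does not occupy RAM
    invisible(gc())      # collect garbage before the clock starts
    env <- new.env()     # sourced objects vanish with this environment
    timing <- system.time(source(file, local = env, keep.source = FALSE))
    unlink(file)
    timing[['elapsed']]
}

sizes <- 10^(1:4)
forward <- sapply(sizes, time_one)             # small to large
backward <- rev(sapply(rev(sizes), time_one))  # large to small
print(data.frame(n = sizes, forward = forward, backward = backward))
```

If the two columns differ noticeably for the same size, leftovers from earlier runs are a plausible suspect; separate sessions remain the stricter check.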



Dear All,

is it known that source() works much faster in R 2.15.2 than in R 3.0.2?
In the example below I observe e.g. for a data.frame with 10^7 rows the
following timings:

R version 2.15.2 Patched (2012-11-29 r61184)
length: 1e+07
     user  system elapsed
    62.04    0.22   62.26

R version 3.0.2 Patched (2013-10-27 r64116)
length: 1e+07
     user  system elapsed
   388.63  176.42  566.41

Is there a way to speed R version 3.0.2 up to the performance of R
version 2.15.2?

best regards,

Heinz Tüchler


example:
sessionInfo()
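# Words to sample from.
sample.vec <-
    c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
      'named', 'file', 'or', 'URL', 'or', 'connection')
# Data frame sizes to test: 10, 100, ..., 10^7 rows.
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
    # Build a one-column data frame with i rows and dump it as R code.
    df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
    dump('df0', file='testdump')
    cat('length:', i, '\n')
    # Time how long it takes to read the dump back in with source().
    print(system.time(source('testdump', keep.source = FALSE,
                             encoding='')))
}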
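
To narrow down where the time goes, the work done by source() can be split roughly into parsing the file and evaluating the parsed expressions. The sketch below is an approximation under assumptions: `split_source_time` is a hypothetical helper, and calling parse() and eval() by hand does not reproduce everything source() does internally.

```r
# Hypothetical helper: time parsing and evaluation of a file separately.
split_source_time <- function(path) {
    parse_time <- system.time(
        exprs <- parse(file = path, keep.source = FALSE)
    )
    env <- new.env()
    eval_time <- system.time(
        for (e in exprs) eval(e, envir = env)
    )
    # Keep user, system and elapsed time; one row per phase.
    rbind(parse = unclass(parse_time), eval = unclass(eval_time))[, 1:3]
}

# Usage on a small throwaway dump file.
tmp <- tempfile()
df0 <- data.frame(x = c('a', 'b', 'a'))
dump('df0', file = tmp)
res <- split_source_time(tmp)
print(res)
```

Comparing the two rows between R versions would show whether the slowdown sits mainly in the parser or in the evaluation of the very large structure() call that dump() writes.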

output for R version 2.15.2 Patched (2012-11-29 r61184):
sessionInfo()
R version 2.15.2 Patched (2012-11-29 r61184)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
sample.vec <-
+   c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+     'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
+   df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+   dump('df0', file='testdump')
+   cat('length:', i, '\n')
+   print(system.time(source('testdump', keep.source = FALSE,
+                            encoding='')))
+ }
length: 10
     user  system elapsed
        0       0       0
length: 100
     user  system elapsed
        0       0       0
length: 1000
     user  system elapsed
        0       0       0
length: 10000
     user  system elapsed
     0.02    0.00    0.01
length: 1e+05
     user  system elapsed
     0.21    0.00    0.20
length: 1e+06
     user  system elapsed
     4.47    0.04    4.51
length: 1e+07
     user  system elapsed
    62.04    0.22   62.26



output for R version 3.0.2 Patched (2013-10-27 r64116):
sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
sample.vec <-
+   c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+     'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
+   df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+   dump('df0', file='testdump')
+   cat('length:', i, '\n')
+   print(system.time(source('testdump', keep.source = FALSE,
+                            encoding='')))
+ }
length: 10
     user  system elapsed
        0       0       0
length: 100
     user  system elapsed
        0       0       0
length: 1000
     user  system elapsed
        0       0       0
length: 10000
     user  system elapsed
     0.01    0.00    0.01
length: 1e+05
     user  system elapsed
     0.36    0.06    0.42
length: 1e+06
     user  system elapsed
     6.02    1.86    7.88
length: 1e+07
     user  system elapsed
   388.63  176.42  566.41






--
View this message in context: 
http://r.789695.n4.nabble.com/big-speed-difference-in-source-btw-R-2-15-2-and-R-3-0-2-tp4679314p4679346.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

