Dear All,

Is it known that source() runs much faster in R 2.15.2 than in R 3.0.2?
In the example below, sourcing a dumped data.frame with 10^7 rows gives the following timings:

R version 2.15.2 Patched (2012-11-29 r61184)
length: 1e+07
   user  system elapsed
  62.04    0.22   62.26

R version 3.0.2 Patched (2013-10-27 r64116)
length: 1e+07
   user  system elapsed
 388.63  176.42  566.41

Is there a way to bring R version 3.0.2 up to the performance of R version 2.15.2?

Best regards,

Heinz Tüchler


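example:
# Report R version, platform and locale
sessionInfo()
# Pool of strings to sample from
sample.vec <-
  c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
    'named', 'file', 'or', 'URL', 'or', 'connection')
# Number of rows to test: 10^1 up to 10^7
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
  # Build a one-column data.frame with i rows and dump it as R code
  df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
  dump('df0', file='testdump')
  cat('length:', i, '\n')
  # Time how long source() needs to read the dump back in
  print(system.time(source('testdump', keep.source = FALSE,
                           encoding='')))
}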

output for R version 2.15.2 Patched (2012-11-29 r61184):
sessionInfo()
R version 2.15.2 Patched (2012-11-29 r61184)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+     'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
+   df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+   dump('df0', file='testdump')
+   cat('length:', i, '\n')
+   print(system.time(source('testdump', keep.source = FALSE,
+                            encoding='')))
+ }
length: 10
   user  system elapsed
      0       0       0
length: 100
   user  system elapsed
      0       0       0
length: 1000
   user  system elapsed
      0       0       0
length: 10000
   user  system elapsed
   0.02    0.00    0.01
length: 1e+05
   user  system elapsed
   0.21    0.00    0.20
length: 1e+06
   user  system elapsed
   4.47    0.04    4.51
length: 1e+07
   user  system elapsed
  62.04    0.22   62.26



output for R version 3.0.2 Patched (2013-10-27 r64116):
sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Switzerland.1252  LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
sample.vec <-
+ c('source', 'causes', 'R', 'to', 'accept', 'its', 'input', 'from', 'the',
+     'named', 'file', 'or', 'URL', 'or', 'connection')
dmp.size <- c(10^(1:7))
set.seed(37)

for(i in dmp.size) {
+   df0 <- data.frame(x=sample(sample.vec, i, replace=TRUE))
+   dump('df0', file='testdump')
+   cat('length:', i, '\n')
+   print(system.time(source('testdump', keep.source = FALSE,
+                            encoding='')))
+ }
length: 10
   user  system elapsed
      0       0       0
length: 100
   user  system elapsed
      0       0       0
length: 1000
   user  system elapsed
      0       0       0
length: 10000
   user  system elapsed
   0.01    0.00    0.01
length: 1e+05
   user  system elapsed
   0.36    0.06    0.42
length: 1e+06
   user  system elapsed
   6.02    1.86    7.88
length: 1e+07
   user  system elapsed
 388.63  176.42  566.41
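
The timings above cover everything source() does, so they do not show whether the extra time goes into parsing the file or into evaluating the parsed code. One way to narrow this down might be to time the two steps separately. The following is only a minimal sketch, assuming that source() can be approximated by parse() followed by eval() of each expression (a simplification of what it really does). The names tmp, env, t.parse and t.eval are introduced here for illustration, and the size n is kept small so that the sketch runs quickly.

```r
# Sketch: approximate source() by separate parse and eval steps and
# time each one. This is a simplification, not what source() does exactly.
sample.vec <- c('source', 'causes', 'R', 'to', 'accept')
n <- 1e4                        # illustrative size; increase to reproduce the slowdown
set.seed(37)

df0 <- data.frame(x = sample(sample.vec, n, replace = TRUE))
tmp <- tempfile()               # temporary file instead of 'testdump'
dump('df0', file = tmp)

env <- new.env()                # evaluate here so the original df0 is untouched

# Step 1: parse the dumped file into an expression vector
t.parse <- system.time(exprs <- parse(file = tmp, keep.source = FALSE))

# Step 2: evaluate each parsed expression
t.eval <- system.time(for (e in exprs) eval(e, envir = env))

cat('parse:\n'); print(t.parse)
cat('eval:\n');  print(t.eval)

unlink(tmp)
```

Running this in both R versions with n set to 1e+07 should indicate which of the two steps accounts for the difference.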




--
Heinz Tüchler +4317146261 / +436605653878

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
