Suharto,

If you're interested in subscripting performance, you might want
to look at pqR (pqR-project.org). It has some substantial performance
improvements for subscripting over the R Core versions. This is
especially true for the current development version of pqR (which will
probably lead to a new release in about a month).

You can look at a somewhat-stable snapshot of recent pqR development at

https://github.com/radfordneal/pqR/tree/05e32fa6

In particular, src/main/subscript.c might be of interest.

Note that you should read mods-dir/README if you want to build this,
and in particular, you need to run create-configure in the top-level
source directory first.

I modified your tests a bit, including producing versions using both
vectors of length 1e8 like you did (which will not fit in cache) and
vectors of length 1e5 (which will fit in at least the L3 cache). I
ran tests on an Intel Skylake processor (E3-1270v5 @ 3.6GHz), using
gcc 7.2 with -O3 -march=native -mtune=native.
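
For anyone wanting to rerun these cases at both sizes without editing the script by hand, here is a rough sketch of how they could be wrapped in one function. The name bench_subscript and the labels on the results are my own invention for illustration, not part of the tests as actually run, and timings taken inside a function may differ slightly from those taken at top level.

```r
# Hypothetical helper (an assumption, not the script used for the
# results in this message): run the seven subscripting cases for a
# vector of length N, repeating each case R times, and return the
# elapsed seconds for each case.
bench_subscript <- function(N, R) {
  x <- numeric(N)

  i_false_full <- rep(FALSE, length(x))  # no recycling
  i_false_one  <- FALSE                  # recycling
  i_true_full  <- rep(TRUE, length(x))   # no recycling
  i_true_one   <- TRUE                   # recycling
  v <- 2:length(x)                       # positive index held in a variable

  c(
    false_full  = system.time(for (r in 1:R) a <- x[i_false_full])[["elapsed"]],
    false_recyc = system.time(for (r in 1:R) a <- x[i_false_one])[["elapsed"]],
    true_full   = system.time(for (r in 1:R) a <- x[i_true_full])[["elapsed"]],
    true_recyc  = system.time(for (r in 1:R) a <- x[i_true_one])[["elapsed"]],
    minus_one   = system.time(for (r in 1:R) a <- x[-1])[["elapsed"]],
    range_expr  = system.time(for (r in 1:R) a <- x[2:length(x)])[["elapsed"]],
    range_var   = system.time(for (r in 1:R) a <- x[v])[["elapsed"]]
  )
}
```

Calling bench_subscript(1e8, 5) would correspond to the large-vector settings and bench_subscript(1e5, 1000) to the small-vector settings used below.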
I got the following results with R-3.4.2 (with R_ENABLE_JIT=0, which
is slightly faster than using the JIT compiler):
R-3.4.2, LARGE VECTORS:
> N <- 1e8; R <- 5
> #N <- 1e5; R <- 1000
>
> x <- numeric(N)
> i <- rep(FALSE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.296 0.000 0.297
> i <- FALSE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.416 0.000 0.418
>
> x <- numeric(N)
> i <- rep(TRUE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
1.416 0.352 1.773
> i <- TRUE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
1.348 0.264 1.613
>
> x <- numeric(N)
> system.time(for (r in 1:R) a <- x[-1])
user system elapsed
1.516 0.376 1.895
> system.time(for (r in 1:R) a <- x[2:length(x)])
user system elapsed
1.516 0.308 1.827
>
> v <- 2:length(x)
> system.time(for (r in 1:R) a <- x[v])
user system elapsed
1.416 0.268 1.689
R-3.4.2, SMALL VECTORS:
> #N <- 1e8; R <- 5
> N <- 1e5; R <- 1000
>
> x <- numeric(N)
> i <- rep(FALSE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.088 0.000 0.089
> i <- FALSE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.084 0.000 0.084
>
> x <- numeric(N)
> i <- rep(TRUE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.492 0.020 0.515
> i <- TRUE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.408 0.008 0.420
>
> x <- numeric(N)
> system.time(for (r in 1:R) a <- x[-1])
user system elapsed
0.508 0.004 0.516
> system.time(for (r in 1:R) a <- x[2:length(x)])
user system elapsed
0.464 0.008 0.473
>
> v <- 2:length(x)
> system.time(for (r in 1:R) a <- x[v])
user system elapsed
0.428 0.000 0.428
Here are the results with the development version of pqR (uncompressed
pointers, no byte compilation):
pqR (devel), LARGE VECTORS:
> N <- 1e8; R <- 5
> #N <- 1e5; R <- 1000
>
> x <- numeric(N)
> i <- rep(FALSE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.192 0.000 0.193
> i <- FALSE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.436 0.000 0.434
>
> x <- numeric(N)
> i <- rep(TRUE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.768 0.216 0.988
> i <- TRUE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.832 0.272 1.105
>
> x <- numeric(N)
> system.time(for (r in 1:R) a <- x[-1])
user system elapsed
0.280 0.156 0.435
> system.time(for (r in 1:R) a <- x[2:length(x)])
user system elapsed
0.252 0.184 0.436
>
> v <- 2:length(x)
> system.time(for (r in 1:R) a <- x[v])
user system elapsed
0.828 0.168 0.998
pqR (devel), SMALL VECTORS:
> #N <- 1e8; R <- 5
> N <- 1e5; R <- 1000
>
> x <- numeric(N)
> i <- rep(FALSE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.040 0.000 0.038
> i <- FALSE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.084 0.000 0.087
>
> x <- numeric(N)
> i <- rep(TRUE, length(x))# no recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.156 0.036 0.192
> i <- TRUE# recycling
> system.time(for (r in 1:R) a <- x[i])
user system elapsed
0.184 0.012 0.195
>
> x <- numeric(N)
> system.time(for (r in 1:R) a <- x[-1])
user system elapsed
0.060 0.012 0.075
> system.time(for (r in