[Rd] Rcpp too good to be true?

2011-12-13 Thread Jeffrey Pollock
Hello all,

I've been working on a package to do various things related to the
Conway-Maxwell-Poisson distribution and wanted to be able to make fast
random draws from the distribution. My R code was running quite slow so I
decided to give Rcpp a bash. I had used it before but only for extremely
basic stuff and always using inline. This time I decided to give making a
proper package a go.

First of all I should say that this was incredibly easy due to
Rcpp.package.skeleton() and the countless answers to quesions online and
documentation!

Secondly, I'm worried that my speedup has been so massive (over 500x !!!)
that I think I've made a mistake, hence my post here.

Here is all my code, if someone has a minute to point out anything wrong
(or even if its correct and there is room for improvement, im pretty new to
this) it would be much appreciated. I've had a simple look at the results
and they look fine, but seriously, 500x faster?!

function in R;
library(compiler)

Rrcomp <- cmpfun(
function(n, lam, nu, max = 100L) {
ans <- integer(n)
dist <- dcomp(0:max, lam, nu, max)
u <- runif(n)
for (i in 1:n) {
count <- 0L
pr <- dist[1L]
while (pr < u[i]) {
count <- count + 1L
pr <- pr + dist[count + 1L]
}
ans[i] <- count
}
return(ans)
}
)

dcomp <- function(y, lam, nu, max = 100L) {
Z <- function(lam, nu, max) {
sum <- 0L
for(i in 0L:max) {
sum <- sum + lam^i / factorial(i)^nu
}
return(sum)
}
return(lam^y / (factorial(y)^nu * Z(lam, nu, max)))
}

function in Rcpp;
header file;

#include 

RcppExport SEXP rcomp(SEXP n_, SEXP dist_);

source file;

#include "rcomp.h"

SEXP rcomp(SEXP n_, SEXP dist_) {
using namespace Rcpp ;

int n = as(n_);
NumericVector dist(dist_);

NumericVector ans(n);
int count;
double pr;
RNGScope scope;
NumericVector u = runif(n);

for (int i = 0; i < n; ++i) {
count = 0;
pr = dist[0];
while (pr < u[i]) {
count++;
pr += dist[count];
}
ans[i] = count;
}
return ans;
}

R call;

rcomp <- function(n, lam, nu, max = 100){
dist <- dcomp(0:max, lam, nu, max)
.Call("rcomp", n = n, dist = dist, PACKAGE = "glmcomp")
}

Here are some results;
> n <- 5
> lam <- 5
> nu <- 1
> rbind(table(rcomp(n, lam, nu))[1:10] / n, table(Rrcomp(n, lam, nu))[1:10]
/ n, dpois(0:9, lam))
   0  1  2 3 4
5 6
[1,] 0.00644 0.03124000 0.08452000 0.1392200 0.1747800 0.1755200
0.149
[2,] 0.00666 0.03232000 0.08412000 0.1425400 0.1779600 0.1748400
0.1445600
[3,] 0.006737947 0.03368973 0.08422434 0.1403739 0.1754674 0.1754674
0.1462228
 7  8  9
[1,] 0.1063000 0.06538000 0.03534000
[2,] 0.1039800 0.06492000 0.03624000
[3,] 0.109 0.06527804 0.03626558

(for nu = 1 the com-poisson distribution is the same as normal poisson)

> benchmark(rcomp(n, lam, nu), Rrcomp(n, lam, nu), replications = 10, order
= "relative")
test replications elapsed relative user.self sys.self
2 Rrcomp(n, lam, nu)   102.03   1.  1.96 0.00
1  rcomp(n, lam, nu)   10 1172.51 577.5911   1164.50 0.08

Thanks in advance if anyone has any time to have a look at this :)

Jeff

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Rcpp too good to be true?

2011-12-13 Thread Jeffrey Pollock
Hey Martin,

Thanks for the reply, that is a much better way!!!

On another note, I actually meant to post this on the rcpp-devel list (and
have forwarded it there, noting that there was a mistake in the original
post), however I'm quite glad I posted it here by mistake to see this
answer :)

Cheers, Jeff

On Wed, Dec 14, 2011 at 12:52 AM, Martin Morgan  wrote:

> On 12/13/2011 03:48 PM, Jeffrey Pollock wrote:
>
>> Hello all,
>>
>> I've been working on a package to do various things related to the
>> Conway-Maxwell-Poisson distribution and wanted to be able to make fast
>> random draws from the distribution. My R code was running quite slow so I
>> decided to give Rcpp a bash. I had used it before but only for extremely
>> basic stuff and always using inline. This time I decided to give making a
>> proper package a go.
>>
>> First of all I should say that this was incredibly easy due to
>> Rcpp.package.skeleton() and the countless answers to quesions online and
>> documentation!
>>
>> Secondly, I'm worried that my speedup has been so massive (over 500x !!!)
>> that I think I've made a mistake, hence my post here.
>>
>> Here is all my code, if someone has a minute to point out anything wrong
>> (or even if its correct and there is room for improvement, im pretty new
>> to
>> this) it would be much appreciated. I've had a simple look at the results
>> and they look fine, but seriously, 500x faster?!
>>
>> function in R;
>> library(compiler)
>>
>> Rrcomp<- cmpfun(
>> function(n, lam, nu, max = 100L) {
>> ans<- integer(n)
>> dist<- dcomp(0:max, lam, nu, max)
>> u<- runif(n)
>> for (i in 1:n) {
>> count<- 0L
>> pr<- dist[1L]
>> while (pr<  u[i]) {
>> count<- count + 1L
>> pr<- pr + dist[count + 1L]
>> }
>> ans[i]<- count
>> }
>> return(ans)
>> }
>> )
>>
>
> Hi Jeff
>
> Not really what you're asking about, but looks like you're sampling with
> replacement from the sequence 0:(dist-1) n times with probability dist, so
>
> Rrcomp.1 <-
>function(n, lam, nu, max = 100L)
>
> {
>dist <- dcomp(0:max, lam, nu, max)
>sample(seq_along(dist) - 1L, n, TRUE, prob=dist)
> }
>
> and
>
> > system.time(res <- table(Rrcomp(n, lam, nu))); res
>   user  system elapsed
>  1.493   0.000   1.495
>
>   0123456789   10   11   12   13
>  355 1656 4070 6976 8745 8861 7275 5214 3357 1892  926  399  165   69
>  14   15   16   17
>  24   1123
> > system.time(res <- table(Rrcomp.1(n, lam, nu))); res
>   user  system elapsed
>  0.029   0.000   0.028
>
>   0123456789   10   11   12   13
>  333 1754 4096 6876 8964 8799 7399 5030 3215 1877  951  432  184   61
>  14   15   16
>  2351
>
> Martin
>
>  dcomp<- function(y, lam, nu, max = 100L) {
>> Z<- function(lam, nu, max) {
>> sum<- 0L
>> for(i in 0L:max) {
>> sum<- sum + lam^i / factorial(i)^nu
>> }
>> return(sum)
>> }
>> return(lam^y / (factorial(y)^nu * Z(lam, nu, max)))
>> }
>>
>> function in Rcpp;
>> header file;
>>
>> #include
>>
>> RcppExport SEXP rcomp(SEXP n_, SEXP dist_);
>>
>> source file;
>>
>> #include "rcomp.h"
>>
>> SEXP rcomp(SEXP n_, SEXP dist_) {
>> using namespace Rcpp ;
>>
>> int n = as(n_);
>> NumericVector dist(dist_);
>>
>> NumericVector ans(n);
>> int count;
>> double pr;
>> RNGScope scope;
>> NumericVector u = runif(n);
>>
>> for (int i = 0; i<  n; ++i) {
>> count = 0;
>> pr = dist[0];
>> while (pr<  u[i]) {
>> count++;
>> pr += dist[count];
>> }
>> ans[i] = count;
>> }
>> return ans;
>> }
>>
>> R call;
>>
>> rcomp<- function(n, lam, nu, max = 100){
>> dist<- dcomp(0:max, lam, nu, max)
>> .Call("rcomp", n = n, dist = dist, PACKAGE = "glmcomp")
>> }
>>
>> Here are some results;
>>
>>> n<- 5
>>> lam<- 5
>>> nu<- 1
>>> rbind(table(rc