Re: [Rd] efficiency of sample() with prob.

Bo Peng Fri, 24 Jun 2005 08:34:26 -0700

On 6/24/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> `Research' involves looking at all the competitor methods, devising a
> near-optimal strategy and selecting amongst methods according to that
> strategy.  It is not a quick fix we are looking for but something that
> will be good for the long term.
>


I am sorry but I am afraid that I do not have enough time and
background knowledge
to do a thorough research in this area. I have tried bisection search
method and the alias
method, the latter has greatly improved the performance of my bioinformatics
application. Since this method is the only one mentioned in Knuth's
book, I have no
idea about other alternatives. 

Attached is a slightly improved version of the alias method. It may be
helpful to people
having similar problems.

Thanks.

--
Bo Peng
Department of Statistics
Rice University.
http://bp6.stat.rice.edu:8080/

/* replace probSampleReplace in src/main/random.c */
static void ProbSampleReplace(int n, double *p, int *perm, int nans, int *ans)
{
    /* allocate memory for a, p, H and L is the second half of a  */
    double * q = Calloc(n, double);
    int * a = Calloc(2*n, int);
    int * LL = a+n, * HH = a+2*n-1;  /* start of H and L vector */
    int * L = LL, * H = HH;          /* end of H and L vector */
    int i, j, k;
    double rU;  /* U[0,1)*n */

    /* set up alias table */
    /* initialize q with n*p0,...n*p_n-1 */
    for(i=0; i<n; ++i)
        q[i] = p[i]*n;

    /* initialize a with indices */
    for(i=0; i<n; ++i)
        a[i] = i;

    /* set up H and L */
    for(i=0; i<n; ++i) {
        if( q[i] >= 1.)
            *H-- = i;
        else
            *L++ = i;
    }

    while( L != LL && H != HH ) {
        j = *(L-1);
        k = *(H+1);
        a[j] = k;
        q[k] += q[j] - 1;
        L--;                                  /* remove j from L */
        if( q[k] < 1. ) {
            *L++ = k;                         /* add k to L */
            ++H;                              /* remove k */
        }
    }

    /* generate sample */
    for (i = 0; i < nans; ++i) {
        rU = unif_rand() * n;

        k = (int)(rU);
        rU -= k;  /* rU becomes rU-[rU] */

        if( rU < q[k] )
            ans[i] = k+1;
        else
            ans[i] = a[k]+1;
    }
    Free(a);
    Free(q); 
}

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] efficiency of sample() with prob.

Reply via email to