On 11-12-01 8:40 PM, Hervé Pagès wrote:
Hi,

FWIW:

/* Taken from R/src/main/unique.c */
static int requal(SEXP x, int i, SEXP y, int j)
{
      if (i < 0 || j < 0) return 0;
      if (!ISNAN(REAL(x)[i]) && !ISNAN(REAL(y)[j]))
          return (REAL(x)[i] == REAL(y)[j]);
      else if (R_IsNA(REAL(x)[i]) && R_IsNA(REAL(y)[j])) return 1;
      else if (R_IsNaN(REAL(x)[i]) && R_IsNaN(REAL(y)[j])) return 1;
      else return 0;
}

/* Between 1.34x and 1.37x faster on my 64-bit Ubuntu laptop */
static int requal2(SEXP x, int i, SEXP y, int j)
{
      double xi, yj;

      if (i < 0 || j < 0) return 0;
      xi = REAL(x)[i];
      yj = REAL(y)[j];
      if (!ISNAN(xi) && !ISNAN(yj)) return xi == yj;
      if (R_IsNA(xi) && R_IsNA(yj)) return 1;
      if (R_IsNaN(xi) && R_IsNaN(yj)) return 1;
      return 0;
}

That looks like a valid improvement.


/* Another extra 1.18x speedup, so overall requal3() is about 1.6x
     faster than requal() for me. requal3() uses simpler logic than
     requal(), but it should be equivalent, based on the following
     facts:
       (a) If at least *one* of xi or yj is a number (i.e. not NA or
           NaN), then xi and yj can be compared with xi == yj. They
           don't need to *both* be numbers for this comparison to be
           valid: == is always false when either side is NA or NaN.
       (b) Otherwise (i.e. if neither of them is a number), each of
           them is either NA or NaN (only 2 possible kinds for each),
           so comparing them with R_IsNA(xi) == R_IsNA(yj) should do
           the trick. */

I think this one is probably correct, but it's too tricky for my taste.

static int requal3(SEXP x, int i, SEXP y, int j)
{
      double xi, yj;

      if (i < 0 || j < 0) return 0;
      xi = REAL(x)[i];
      yj = REAL(y)[j];
      if (!ISNAN(xi) || !ISNAN(yj)) return xi == yj;
      return R_IsNA(xi) == R_IsNA(yj);
}
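
For anyone who wants to convince themselves that facts (a) and (b) hold,
here is a rough standalone sketch that compares the two-branch logic of
requal3() against the four-branch logic of requal() over a handful of
values. It does not use R's headers. The helpers low_word(), make_na(),
is_na(), is_plain_nan(), eq_original(), eq_short() and count_mismatches()
are hypothetical names made up for this check, and they assume that NA is
a NaN whose low 32-bit word is 1954, which is how I understand R to encode
NA_real_. Treat it as a sanity check under those assumptions, not as a
proof.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Low 32 bits of a double's bit pattern (hypothetical helper). */
static uint32_t low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu);
}

/* Assumption: NA is a NaN whose low word is 1954 (0x7A2). */
static double make_na(void)
{
    uint64_t bits = 0x7FF00000000007A2ULL;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Stand-ins for R_IsNA() and R_IsNaN() under the assumption above. */
static int is_na(double x)        { return isnan(x) && low_word(x) == 1954; }
static int is_plain_nan(double x) { return isnan(x) && low_word(x) != 1954; }

/* Same branches as requal()/requal2(), minus the SEXP indexing. */
static int eq_original(double xi, double yj)
{
    if (!isnan(xi) && !isnan(yj)) return xi == yj;
    if (is_na(xi) && is_na(yj)) return 1;
    if (is_plain_nan(xi) && is_plain_nan(yj)) return 1;
    return 0;
}

/* Same branches as requal3(), minus the SEXP indexing. */
static int eq_short(double xi, double yj)
{
    if (!isnan(xi) || !isnan(yj)) return xi == yj;
    return is_na(xi) == is_na(yj);
}

/* Number of (xi, yj) pairs on which the two versions disagree. */
static int count_mismatches(void)
{
    const double vals[] = { 0.0, -0.0, 1.5, -2.0,
                            INFINITY, -INFINITY, NAN, make_na() };
    const int n = (int)(sizeof vals / sizeof vals[0]);
    int i, j, bad = 0;

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (eq_original(vals[i], vals[j]) != eq_short(vals[i], vals[j]))
                bad++;
    return bad;
}
```

If the assumptions hold, calling count_mismatches() from a small main()
should return 0. The sample only covers a few representative numbers plus
one NaN and one NA, so it exercises each branch but does not test every
possible NaN payload.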

Duncan Murdoch


The logic of the cequal() function (in the same file) could also be
cleaned up in a similar way, probably for an even greater speedup.

This will benefit duplicated(), anyDuplicated() and unique() on numeric
and complex vectors.

Cheers,
H.


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
