Below is some further troubleshooting on this: >From code inspection this bug happens for only:
* for integer values * when the first element is NA_integer_ and the second is not. Examples: # Numeric/doubles works as expected > cummax(c(NA_real_, 0, 1, 2, 3)) [1] NA NA NA NA NA # It does not occur when the first value is non-NA > cummax(c(0L, NA_integer_, 1L, 2L, 3L)) [1] 0 NA NA NA NA # When first value is NA, it is not "remembered" # (because internal for loop starts with 2nd element) > cummax(c(NA_integer_, 0L, 1L, 2L, 3L)) [1] NA 0 1 2 3 The problem is not there for cummin(): > cummin(c(0L, NA_integer_, 1L, 2L, 3L)) [1] 0 NA NA NA NA > cummin(c(NA_integer_, 0L, 1L, 2L, 3L)) [1] NA NA NA NA NA but that is just "pure luck" due to the fact how NA_integer_ is internally represented as the smallest possible 4-byte signed integer, i.e. LibExtern int R_NaInt; /* NA_INTEGER:= INT_MIN currently */ #define NA_INTEGER R_NaInt Note the comment, which implies that code should not rely on the assumption that NA_integer_ == NA_INTEGER == R_NaInt == INT_MIN; it could equally well have been INT_MAX, which in case cummin()would return the wrong result whereas cummax() wouldn't. So, cummin() makes the same mistake ascummax(), where the for-loop skips the test for NA of the first element, cf. https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L145-L148 The simple solution is probably to do (cf. native icumsum): [HB-X201]{hb}: svn diff src/main/cum.c Index: src/main/cum.c =================================================================== --- src/main/cum.c (revision 68378) +++ src/main/cum.c (working copy) @@ -130,7 +130,7 @@ int *ix = INTEGER(x), *is = INTEGER(s); int max = ix[0]; is[0] = max; - for (R_xlen_t i = 1 ; i < xlength(x) ; i++) { + for (R_xlen_t i = 0 ; i < xlength(x) ; i++) { if(ix[i] == NA_INTEGER) break; is[i] = max = (max > ix[i]) ? max : ix[i]; } @@ -142,7 +142,7 @@ int *ix = INTEGER(x), *is = INTEGER(s); int min = ix[0]; is[0] = min; - for (R_xlen_t i = 1 ; i < xlength(x) ; i++ ) { + for (R_xlen_t i = 0 ; i < xlength(x) ; i++ ) { if(ix[i] == NA_INTEGER) break; is[i] = min = (min < ix[i]) ? min : ix[i]; } /Henrik On Sun, May 17, 2015 at 4:13 AM, Dongcan Jiang <dongcan.ji...@gmail.com> wrote: > Hi, > > The function cummax() seems to have a bug. > >> x <- c(NA, 0) >> storage.mode(x) <- "integer" >> cummax(x) > [1] NA 0 > > The correct result of this case should be NA NA. The mistake in > [https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be > the reason. > > Best Regards, > Dongcan > > -- > Dongcan Jiang > Team of Search Engine & Web Mining > School of Electronic Engineering & Computer Science > Peking University, Beijing, 100871, P.R.China ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel