[Rd] The function cummax() seems to have a bug.
Hi, The function cummax() seems to have a bug. > x <- c(NA, 0) > storage.mode(x) <- "integer" > cummax(x) [1] NA 0 The correct result of this case should be NA NA. The mistake in [ https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be the reason. Best Regards, Dongcan -- Dongcan Jiang Team of Search Engine & Web Mining School of Electronic Engineering & Computer Science Peking University, Beijing, 100871, P.R.China [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] That 'make check-all' problem with the survival package
-- On Sat, May 16, 2015 2:33 PM BST Marc Schwartz wrote: > >> On May 16, 2015, at 6:11 AM, Hin-Tak Leung >> wrote: >> >> >> >> -- >> On Sat, May 16, 2015 8:04 AM BST Uwe Ligges wrote: >> >> Not sure why this goes to R-devel. You just could have asked the >> maintainer. Terry Therneau is aware of it and promised he will fix it. >> >> >> The quickest fix is to add cmprsk to the recommended list , and that's is an >> R-devel issue. > > >Actually, the easiest solution would be for Terry to either modify relevant >code according to: > > >http://cran.r-project.org/doc/manuals/r-release/R-exts.html#Suggested-packages > >or perhaps better, as noted, to remove the need for cmprsk at all. > We shall disagree on what is "quickest" - one other option I did not mention is to back out the version upgrade to 2.38-1 in February, and ship the previous 2.37-7. I trust that Terry made the addition for good reasons. So my thoughts are geared more towards how to accommodate the new addition, rather than revert/remove it. (though I'd be happier to isolate a smaller change than a bulk 2.37-7 -> 2.38.1 upgrade and think more about that. Hence the mention of the repositories.). >The latter raises the philosophical issue as to whether or not a Recommended >package should or really needs to have any connections at all to third party >CRAN packages. It seems to me that the default R distribution of “Base” and >“Recommended” packages should be able to fully run in a stand alone manner >without any declared external connections to other non-default packages. > >The only other Recommended package that I found that has such a connection is >nlme, which has a Suggests for Frank’s Hmisc. However, based upon a grep >review of the package contents, there are only two code based references to >Hmisc that I located: > >./R/newFunc.R:264: ## e.g. Hmisc's "labelled" >./tests/augPred_lab.R:2:if(require("Hmisc")) { > >Neither of which is really needed (the first being a comment only, so benign) >and of course the latter goes against the current guidance in R-exts. The >second, within the if() code, uses the Hmisc label() function, which it seems >to me is not really needed here. > >Regards, > >Marc Schwartz > > >> >> On 16.05.2015 07:22, Hin-Tak Leung wrote: >>> 'make check-all' for current R has been showing this error in the middle >>> for a few months now - any thought on fixing this? I think cmprsk >>> should be either included in the recommended bundle, or >>> the survival vignette to not depend on it. Having 'make check-all' showing >>> glaring ERROR's for a few months seems to defeat the purpose of >>> doing any checking at all via 'make check-all'. >>> >>> FWIW, I did look at when/how the issue was introduced, but it appeared >>> that svn://svn.r-forge.r-project.org/svnroot/survival is no longer being >>> updated, and git://github.com/cran/survival.git only shows release jumps. >>> Anyway, if first appears with survival 2.38-1 in February, and as the >>> previous >>> 2.37-7 was 13 months older, this info is of no use to anybody. >>> I didn't write earlier as I thought the issue would go away at some point; >>> but obviously this isn't the case after 3 months. >>> >>> --- >>> ERROR >>> Errors in running code in vignettes: >>> when running code in ‘compete.Rnw’ >>> ... >>> temp$fstat <- as.numeric(event) >>> >>> temp$msex <- with(temp, 1 * (sex == "M")) >>> >>> fgfit1 <- with(temp, crr(etime, fstat, cov1 = cbind(age, >>> + msex, mspike), failcode = 2, cencode = 1, variance = TRUE)) >>> >>> When sourcing ‘compete.R’: >>> Error: could not find function "crr" >>> Execution halted >>> >>> * checking re-building of vignette outputs ... NOTE >>> Error in re-building vignettes: >>> ... >>> Warning in coxph(Surv(futime, death) ~ group:age2 + sex + strata(group), : >>> X matrix deemed to be singular; variable 23 24 25 >>> Loading required package: cmprsk >>> Warning in library(package, lib.loc = lib.loc, character.only = TRUE, >>> logical.return = TRUE, : >>> there is no package called ‘cmprsk’ >>> >>> Error: processing vignette 'compete.Rnw' failed with diagnostics: >>> chunk 15 (label = finegray) >>> Error in eval(expr, envir, enclos) : could not find function "crr" >>> Execution halted >>> >>> __ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> >> >> __ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] install.packages() / update.packages() sometimes outputs to stdout and sometimes to stderr [and menu() & readline()]
I've noticed that install.packages() [https://svn.r-project.org/R/trunk/src/library/utils/R/packages.R] and update.packages() [https://svn.r-project.org/R/trunk/src/library/utils/R/packages2.R] sometimes output to stdout and sometimes to stderr. It looks like stderr is used (e.g. via cat()) when the message is part of querying the user, e.g. update.packages <- function(lib.loc = NULL, repos = getOption("repos"), [...] cat(old[k, "Package"], ":\n", "Version", old[k, "Installed"], "installed in", old[k, "LibPath"], if(checkBuilt) paste("built under R", old[k, "Built"]), "\n", "Version", old[k, "ReposVer"], "available at", simplifyRepos(old[k, "Repository"], type)) cat("\n") answer <- substr(readline("Update (y/N/c)? "), 1L, 1L) if(answer == "c" | answer == "C") { cat("cancelled by user\n") return(invisible()) } but it is not consistently so, because some are sent to stderr (e.g. via message()), e.g. install.packages <- function(pkgs, lib, repos = getOption("repos"), [...] if(action == "interactive" && interactive()) { msg <- ngettext(sum(later & hasSrc), "Do you want to install from sources the package which needs compilation?", "Do you want to install from sources the packages which need compilation?") message(msg, domain = NA) res <- readline("y/n: ") if(res != "y") later <- later & !hasSrc } else if (action == "never") { cat(" Binaries will be installed\n") later <- later & !hasSrc } Also, as one see in the latter example, it is not only interactive user queries for which stdout is used. It's simply not consistent - at least I cannot see pattern. If these are not intended behaviors, I'm happy to provide patches. I'd prefer stderr for all user queries (see below) - but I can also see how this is something that needs considered thoughts and made an official design policy across the R base distribution. [Related to the above but could deserve it's own thread (feel free to move the below to its own thread)] utils::menu(..., graphics=FALSE) [https://svn.r-project.org/R/trunk/src/library/utils/R/menu.R] queries the user via standard output, which becomes an issue when running interactive report generators, which mostly captures stdout and makes it part of the produced artifact. Personally, I'd argue that querying the user via stderr would be a better choice in more cases. Also, it's a bit weird that base::readline(), which is used for the actual prompting of the user, is sent neither to R's stdout nor stderr, e.g. > zz <- file("all.Rout", open = "wt") > sink(zz); sink(zz, type = "message") > ans <- menu(letters[1:3], title="Select one:", graphics=FALSE) Selection: 1 > sink(type = "message"); sink() > ans [1] 1 > cat(readLines("all.Rout"), sep="\n") Select one: 1: a 2: b 3: c Note the only thing displayed to the user is the prompt "Selection: ", which is generated by readline(). It does however output to the system's stdout (verified on Linux and Windows), e.g. $ Rscript -e "readline('Press ENTER: ')" > stdout.log $ cat stdout.log Press ENTER: [1] "" Compare this to how it works in, for instance, bash: $ read -p "Press ENTER: " ans > stdout.log Press ENTER: $ read -p "Press ENTER: " ans > stderr.log $ cat stderr.log Press ENTER: My preference would be that menu() and readline() and other messages for querying the user would output to the same connection/stream. Again, I'd favor stderr over stdout, but possibly even a third alternative designed specifically for user queries, cf. my R-devel post '[Rd] Output to "raw console" rather than stdout/stderr?' on 2015-02-01 (https://stat.ethz.ch/pipermail/r-devel/2015-February/070578.html). Just like when using menu(..., graphics=TRUE), this would not clutter up the output to stdout (or stderr). But even without this third alternative, I argue that stderr is better than stdout. I'm happy to provide patches for this as well. Thanks, Henrik __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] The function cummax() seems to have a bug.
Below is some further troubleshooting on this: >From code inspection this bug happens for only: * for integer values * when the first element is NA_integer_ and the second is not. Examples: # Numeric/doubles works as expected > cummax(c(NA_real_, 0, 1, 2, 3)) [1] NA NA NA NA NA # It does not occur when the first value is non-NA > cummax(c(0L, NA_integer_, 1L, 2L, 3L)) [1] 0 NA NA NA NA # When first value is NA, it is not "remembered" # (because internal for loop starts with 2nd element) > cummax(c(NA_integer_, 0L, 1L, 2L, 3L)) [1] NA 0 1 2 3 The problem is not there for cummin(): > cummin(c(0L, NA_integer_, 1L, 2L, 3L)) [1] 0 NA NA NA NA > cummin(c(NA_integer_, 0L, 1L, 2L, 3L)) [1] NA NA NA NA NA but that is just "pure luck" due to the fact how NA_integer_ is internally represented as the smallest possible 4-byte signed integer, i.e. LibExtern intR_NaInt; /* NA_INTEGER:= INT_MIN currently */ #define NA_INTEGER R_NaInt Note the comment, which implies that code should not rely on the assumption that NA_integer_ == NA_INTEGER == R_NaInt == INT_MIN; it could equally well have been INT_MAX, which in case cummin()would return the wrong result whereas cummax() wouldn't. So, cummin() makes the same mistake ascummax(), where the for-loop skips the test for NA of the first element, cf. https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L145-L148 The simple solution is probably to do (cf. native icumsum): [HB-X201]{hb}: svn diff src/main/cum.c Index: src/main/cum.c === --- src/main/cum.c (revision 68378) +++ src/main/cum.c (working copy) @@ -130,7 +130,7 @@ int *ix = INTEGER(x), *is = INTEGER(s); int max = ix[0]; is[0] = max; -for (R_xlen_t i = 1 ; i < xlength(x) ; i++) { +for (R_xlen_t i = 0 ; i < xlength(x) ; i++) { if(ix[i] == NA_INTEGER) break; is[i] = max = (max > ix[i]) ? max : ix[i]; } @@ -142,7 +142,7 @@ int *ix = INTEGER(x), *is = INTEGER(s); int min = ix[0]; is[0] = min; -for (R_xlen_t i = 1 ; i < xlength(x) ; i++ ) { +for (R_xlen_t i = 0 ; i < xlength(x) ; i++ ) { if(ix[i] == NA_INTEGER) break; is[i] = min = (min < ix[i]) ? min : ix[i]; } /Henrik On Sun, May 17, 2015 at 4:13 AM, Dongcan Jiang wrote: > Hi, > > The function cummax() seems to have a bug. > >> x <- c(NA, 0) >> storage.mode(x) <- "integer" >> cummax(x) > [1] NA 0 > > The correct result of this case should be NA NA. The mistake in > [https://github.com/wch/r-source/blob/trunk/src/main/cum.c#L130-L136] may be > the reason. > > Best Regards, > Dongcan > > -- > Dongcan Jiang > Team of Search Engine & Web Mining > School of Electronic Engineering & Computer Science > Peking University, Beijing, 100871, P.R.China __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel