[Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.
The function ppois calculates the CDF of the Poisson distribution, so it should generate a non-decreasing result, but what I got is:

> any(diff(ppois(0:19,lambda=0.9))<0)
[1] TRUE

Actually,

> ppois(19,lambda=0.9)

https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.
On Tue, 4 Dec 2018 at 11:12, wrote:
>
> function ppois is a function calculate the CDF of Poisson distribution, it
> should generate a non-decreasing result, but what I got is:
>
> > any(diff(ppois(0:19,lambda=0.9))<0)
> [1] TRUE
>
> Actually,
>
> > ppois(19,lambda=0.9)
> [1] TRUE
>
> Which could not be TRUE.

This is just another manifestation of

0.1 * 3 > 0.3
#> [1] TRUE

This discussion returns to this list from time to time. TLDR; this is
not an R issue, but an unavoidable floating point issue. Solution:
work with log-probabilities instead.

any(diff(ppois(0:40, lambda=0.9, log.p=TRUE))<0)
#> [1] FALSE

Iñaki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
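The two checks from this message can be put side by side. This is only a sketch in base R; the variable names are illustrative, and no output is asserted for the linear-scale check because it may depend on the platform's arithmetic.

```r
# Floating point comparison mentioned in the thread
0.1 * 3 > 0.3

x <- 0:19
p_lin <- ppois(x, lambda = 0.9)                # CDF values, close to 1 for large x
p_log <- ppois(x, lambda = 0.9, log.p = TRUE)  # the same CDF on the log scale

# Positions where the linear-scale CDF decreases (may be empty on some platforms)
which(diff(p_lin) < 0)

# Log-scale check, reported as FALSE in the thread
any(diff(p_log) < 0)
```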
Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.
On 04/12/2018 at 11:27, Iñaki Ucar wrote:
> This is just another manifestation of
>
> 0.1 * 3 > 0.3
> #> [1] TRUE
>
> This discussion returns to this list from time to time. TLDR; this is
> not an R issue, but an unavoidable floating point issue.

Well, here the request may be interpreted not as "do it without rounding error", which is indeed unavoidable, but rather as "please cope with rounding errors in a way that returns consistent results for ppois()". You have indicated one way to do so (I have just added exp() in the row):

any(diff(exp(ppois(0:19, lambda=0.9, log.p=TRUE))) < 0)
#[1] FALSE

But maybe there is another, more economical way?

Serguei.

> Solution: work with log-probabilities instead.
>
> any(diff(ppois(0:40, lambda=0.9, log.p=TRUE))<0)
> #> [1] FALSE
>
> Iñaki

--
Serguei Sokol
Ingenieur de recherche INRA
Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04
tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr
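One cheap post-hoc option, which nobody in this thread proposed and which is offered here only as a sketch, is to take a running maximum with base R's cummax(). It forces the returned vector to be non-decreasing but does not make any individual value more accurate, and it assumes x is supplied in increasing order.

```r
x <- 0:19                      # must be sorted increasingly for this to make sense
p <- ppois(x, lambda = 0.9)

p_mono <- cummax(p)            # running maximum: never decreases
any(diff(p_mono) < 0)          # FALSE by construction
max(abs(p_mono - p))           # size of the largest adjustment made
```

Whether such an adjustment belongs inside ppois() itself is a separate design question; the sketch only shows what a caller could do.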
Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.
I do think it's plausible to expect that we could get *non-decreasing* results. I get

any(diff(exp(ppois(0:19, lambda=0.9, log.p=TRUE)))<0)

as FALSE. But I do get diff(ppois(18:19, lambda=0.9)) < 0.

Looking at the code of ppois, it's just (within C code) calling pgamma with pgamma(lambda, shape=1+x, scale=1, lower.tail=FALSE):

identical(
    ppois(18:19,0.9),
    pgamma(0.9,shape=1+(18:19),scale=1,lower.tail=FALSE)
)

is TRUE. So the problem (if we choose to define it as such) would be in pgamma (upper tail should be a non-decreasing function of the shape parameter) ... the code is here

https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/nmath/pgamma.c

if anyone wants to dig in ...
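For anyone who wants to dig in, the equivalence described above can be used to look at pgamma directly. This is a sketch with illustrative variable names; the location of any decrease is printed rather than asserted, since it may vary by platform and R version.

```r
lambda <- 0.9
x <- 0:19

# Upper tail of the gamma distribution as a function of the shape parameter
up <- pgamma(lambda, shape = 1 + x, scale = 1, lower.tail = FALSE)

# The identity reported in the message for x = 18:19
identical(ppois(18:19, lambda), up[19:20])

# Shape values after which the upper tail decreases, if any
(1 + x)[which(diff(up) < 0)]
```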
Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.
> Serguei Sokol
> on Tue, 4 Dec 2018 11:46:32 +0100 writes:

> Well, here the request may be interpreted not as "do it without round
> error" which is indeed unavoidable but rather "please cope with rounding
> errors in a way that return consistent result for ppois()". You have
> indicated one way to do so (I have just added exp() in the row):

> any(diff(exp(ppois(0:19, lambda=0.9, log.p=TRUE))) < 0)
> #[1] FALSE

> But may be there is another, more economic way?

Well, log probabilities *are* very economic for many such p*() functions.

OTOH, I'm a bit surprised that nobody mentioned the 'lower.tail=FALSE' option which here makes so very much sense, and is I think slightly more intuitive than the log-probabilities:

It gives much much much more accurate results for such outermost right tail probabilities where p*() ~= 1 :

    > ppois(15:19, lambda=0.9)
    [1] 1 1 1 1 1
    > ppois(15:19, lambda=0.9, lower.tail=FALSE)
    [1] 3.801404e-15 2.006332e-16 1.000417e-17 4.727147e-19 2.122484e-20
    >

... and if you compare with

    > ppois(15:19, lambda=0.9, log.p=TRUE)

and notice how similar the numbers are, you may remember that indeed

    log(1-p) ~= -p   when |p| ≪ 1

Martin
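The comparison suggested above can be laid out in one table. This is a sketch in base R with illustrative variable names; it prints the values side by side rather than asserting how closely they agree.

```r
x <- 15:19
upper <- ppois(x, lambda = 0.9, lower.tail = FALSE)  # P(X > x), tiny but nonzero
logp  <- ppois(x, lambda = 0.9, log.p = TRUE)        # log P(X <= x), just below 0

# Side by side: upper tail versus minus the log of the lower tail
cbind(x, upper, minus_logp = -logp)

# The upper tail should never increase with x
any(diff(upper) > 0)

# log(1 - upper) evaluated stably with log1p()
log1p(-upper)
```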
[Rd] order(decreasing=c(TRUE,FALSE),...)
The NEWS file for R-devel (as of 2018-11-28 r75702) says

  • order(, decreasing=c(TRUE,FALSE)) could fail in some cases.
    Reported from StackOverflow via Karl Nordström.

However, either I don't understand the meaning of decreasing=c(TRUE,FALSE) or there are still problems. I thought order(x,y,decreasing=c(TRUE,FALSE)) meant to return indices, i, such that x[i] was non-increasing and that ties among the x's would be broken by y in non-decreasing order. E.g., that interpretation works for numeric vectors:

> d <- data.frame(X=c(2,1,2,1,2,2,1), N=c(4:7,1:3))
> d[order(d$X, d$N, decreasing=c(TRUE, FALSE)), ] # expect decreasing X and, within group of tied Xes, increasing N
  X N
5 2 1
6 2 2
1 2 4
3 2 6
7 1 3
2 1 5
4 1 7

But it fails for character vectors. E.g., add some of those that have the same sort order as 'N':

> d$Char <- LETTERS[d$N]
> identical(order(d$N), order(d$Char)) # expect TRUE
[1] TRUE

I expected the new columns to give the same sort order when they replace 'd$N' in the first call to order, but they do not: it acts like it would with decreasing=c(TRUE,TRUE).

> order(d$X, d$Char, decreasing=c(TRUE, FALSE))
[1] 3 1 6 5 4 2 7
> d[order(d$X, d$Char, decreasing=c(TRUE, FALSE)), ]
  X N Char
3 2 6    F
1 2 4    D
6 2 2    B
5 2 1    A
4 1 7    G
2 1 5    E
7 1 3    C

Bill Dunlap
TIBCO Software
wdunlap tibco.com
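Two possible workarounds, neither taken from the post itself, are sketched below. The first assumes the documented behaviour of method = "radix", which accepts one decreasing value per sort key but sorts character vectors in the C locale (harmless for upper-case letters, possibly not for other data). The second negates the numeric key so that a single decreasing = FALSE suffices.

```r
d <- data.frame(X = c(2, 1, 2, 1, 2, 2, 1), N = c(4:7, 1:3))
d$Char <- LETTERS[d$N]

# Workaround 1: request the radix method explicitly
idx_radix <- order(d$X, d$Char, decreasing = c(TRUE, FALSE), method = "radix")

# Workaround 2: negate the numeric key instead of using a decreasing vector
idx_neg <- order(-d$X, d$Char)

idx_radix           # 5 6 1 3 7 2 4, the same ordering as with the numeric N
d[idx_radix, ]
```

The negation trick only works for numeric keys; for a character key that should be decreasing, something like -xtfrm(key) would be needed instead.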
[Rd] patch to support custom HTTP headers in download.file() and url()
The patch below adds support for custom HTTP headers in download.file() and url().

My main motivation for this is performing basic HTTP authentication. Some web sites do not support embedding the credentials into the URI itself; they only work if the username and password are sent in the HTTP headers. In fact, specifying the username and password in the URI has been deprecated (https://en.wikipedia.org/wiki/Basic_access_authentication#URL_encoding).

Unfortunately this means that download.file() and url() cannot access these password protected URLs. This patch fixes that.

I am happy to update the patch as needed.

Details:

* This patch adds support for custom HTTP headers in download.file() and url().
* They both get a headers = NULL argument.
* This is implemented for the internal, wininet and libcurl methods.
* For other methods headers is silently ignored.
* For non-HTTP URLs headers is silently ignored.
* The headers argument must be a named character vector without NAs, or NULL.
* If headers is not named or it contains NAs, or the names contain NAs, an error is thrown.
* For download.file() the method is chosen in R, and we pass a character vector to C for libcurl and a collapsed string constant for internal and wininet.
* For url() the method is only chosen in C, so we pass both a string vector and the collapsed string vector to C. This is simpler than collapsing in C.
* It is not possible to specify headers for file(), even though it handles URLs.
* The user agent (coming from the HTTPUserAgent option) will be the first header, for the methods that need it together with the other headers.
* We don't check for duplicate headers, just pass them to the methods as the user specified them.
* We test all methods.
* We have run the tests on macOS, Debian Linux and Windows 2016 Server.
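The rules in the list above can be illustrated with a small sketch. The helper format_headers() is hypothetical and not part of the patch; it only mirrors the described validation and the "name: value" formatting. The URL, the file name and the token value are placeholders, and the network calls are wrapped in if (FALSE) so that nothing is actually downloaded.

```r
# Hypothetical helper (not part of the patch): validate and format headers
format_headers <- function(headers) {
  if (is.null(headers)) return(NULL)
  nms <- names(headers)
  if (!is.character(headers) || length(nms) != length(headers) ||
      any(nms == "") || anyNA(headers) || anyNA(nms))
    stop("'headers' must be a named character vector without NAs")
  paste0(nms, ": ", headers)   # one "name: value" string per header
}

# Placeholder credentials; a real call would carry an actual token
hdrs <- c(Authorization = "Bearer <token>", Accept = "application/json")
format_headers(hdrs)

# Intended usage with the proposed 'headers' argument (not run here)
if (FALSE) {
  download.file("https://example.com/protected/file.csv", "file.csv",
                headers = hdrs)
  con <- url("https://example.com/protected/file.csv", headers = hdrs)
}
```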
You can also browse the changes here:
https://github.com/gaborcsardi/r-source/pull/3/files
You can also download the diff below from
https://github.com/gaborcsardi/r-source/pull/3.diff

Best,
Gabor

diff --git a/src/include/Rconnections.h b/src/include/Rconnections.h
index a2c53f058f..32bb35e31f 100644
--- a/src/include/Rconnections.h
+++ b/src/include/Rconnections.h
@@ -36,6 +36,7 @@ typedef enum {HTTPsh, FTPsh, HTTPSsh, FTPSsh} UrlScheme;
 typedef struct urlconn {
     void *ctxt;
     UrlScheme type;
+    char *headers;
 } *Rurlconn;
 /* used in internet module */
@@ -67,7 +68,7 @@ Rconnection getConnection_no_err(int n);
 Rboolean switch_stdout(int icon, int closeOnExit);
 void init_con(Rconnection new, const char *description, int enc, const char * const mode);
-Rconnection R_newurl(const char *description, const char * const mode, int type);
+Rconnection R_newurl(const char *description, const char * const mode, SEXP headers, int type);
 Rconnection R_newsock(const char *host, int port, int server, const char * const mode, int timeout);
 Rconnection in_R_newsock(const char *host, int port, int server, const char *const mode, int timeout);
 Rconnection R_newunz(const char *description, const char * const mode);
diff --git a/src/include/Rmodules/Rinternet.h b/src/include/Rmodules/Rinternet.h
index 619992eeda..5f02b78514 100644
--- a/src/include/Rmodules/Rinternet.h
+++ b/src/include/Rmodules/Rinternet.h
@@ -25,10 +25,10 @@
 typedef SEXP (*R_DownloadRoutine)(SEXP args);
-typedef Rconnection (*R_NewUrlRoutine)(const char *description, const char * const mode, int method);
+typedef Rconnection (*R_NewUrlRoutine)(const char *description, const char * const mode, SEXP headers, int method);
 typedef Rconnection (*R_NewSockRoutine)(const char *host, int port, int server, const char *const mode, int timeout);
-typedef void * (*R_HTTPOpenRoutine)(const char *url, const char *headers, const int cacheOK);
+typedef void * (*R_HTTPOpenRoutine)(const char *url, const char *agent, const char *headers, const int cacheOK);
 typedef int(*R_HTTPReadRoutine)(void *ctx, char *dest, int len);
 typedef void (*R_HTTPCloseRoutine)(void *ctx);
diff --git a/src/library/base/R/connections.R b/src/library/base/R/connections.R
index 7445d2327b..50c0ea0a1c 100644
--- a/src/library/base/R/connections.R
+++ b/src/library/base/R/connections.R
@@ -91,10 +91,18 @@ fifo <- function(description, open = "", blocking = FALSE,
 url <- function(description, open = "", blocking = TRUE,
                 encoding = getOption("encoding"),
-                method = getOption("url.method", "default"))
+                method = getOption("url.method", "default"),
+                headers = NULL)
 {
     method <- match.arg(method, c("default", "internal", "libcurl", "wininet"))
-    .Internal(url(description, open, blocking, encoding, method))
+    if (!is.null(headers)) {
+        if (length(names(headers)) != length(headers) ||
+            any(names(headers) == "") || anyNA(headers) || anyNA(names(headers)))
+            stop("'headers' must must have names and must not be NA")
+        headers <- paste0(names(headers), ": ", headers)
+        headers <- list(hea