Re: [Rd] bias issue in sample() (PR 17494)

2019-02-19 Thread Gabriel Becker
Luke, I'm happy to help with this. It's great to see this get tackled (I've cc'ed Kelli Ottoboni who helped flag this issue). I can prepare a patch for the RNGkind related stuff and the doc update. As for ???, what are your (and others') thoughts about the possibility of a) a reproducibility API

Re: [Rd] Documentation for sd (stats) + suggestion

2019-02-19 Thread Dario Strbenac
Good day, It is implemented by the CRAN package multicon. The function is named popsd. But it does seem like something R should provide without creating a package dependency. -- Dario Strbenac University of Sydney Camperdown NSW 2050 Australia

Re: [Rd] mle (stat4) crashing due to singular Hessian in covariance matrix calculation

2019-02-19 Thread Ben Bolker
I don't know if this will get much response from the R developers; they might just recommend that you protect your mle() call in a try() or tryCatch() to stop it from breaking your loop. Alternatively, you could try the mle2() function in the bbmle package, which started out long ago as a slightly
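The try()/tryCatch() suggestion above can be sketched as follows. This is a minimal illustration, not code from the thread: `invertHessian` and the list `hessians` are hypothetical stand-ins for the `solve(oout$hessian)` step that fails inside mle() when the Hessian is singular.

```r
# Hypothetical stand-in for the step that fails: inverting a Hessian.
invertHessian <- function(h) solve(h)

# Three made-up "Hessians"; the second one is exactly singular.
hessians <- list(diag(2), matrix(0, 2, 2), diag(c(2, 4)))

# Wrap each call in tryCatch() so that one failure does not stop the loop.
# A failed inversion is recorded as NULL instead of raising an error.
results <- lapply(hessians, function(h) {
  tryCatch(invertHessian(h), error = function(e) NULL)
})

# Indices of the iterations that failed.
failed <- which(vapply(results, is.null, logical(1)))
print(failed)
```

Returning NULL is only one possible convention; the handler could equally return NA or the error message, depending on what the surrounding loop needs.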

Re: [Rd] code for sum function

2019-02-19 Thread Ben Bolker
This SO question may be of interest: https://stackoverflow.com/questions/38589705/difference-between-rs-sum-and-armadillos-accu/ which points out that sum() isn't doing anything fancy *except* using extended-precision registers when available. (Using Kahan's algorithm does come at a computat

[Rd] patch for gregexpr(perl=TRUE)

2019-02-19 Thread Toby Hocking
Hi all, Several people have noticed that gregexpr is very slow for large subject strings when perl=TRUE is specified. - https://stackoverflow.com/questions/31216299/r-faster-gregexpr-for-very-large-strings - http://r.789695.n4.nabble.com/strsplit-perl-TRUE-gregexpr-perl-TRUE-very-slow-for-long-str

[Rd] bias issue in sample() (PR 17494)

2019-02-19 Thread Tierney, Luke
Before the next release we really should sort out the bias issue in sample() reported by Ottoboni and Stark in https://www.stat.berkeley.edu/~stark/Preprints/r-random-issues.pdf and filed as a bug report by Duncan Murdoch at https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17494. Here are

Re: [Rd] code for sum function

2019-02-19 Thread William Dunlap via R-devel
The algorithm does make a difference. You can use Kahan's summation algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm) to reduce the error compared to the naive summation algorithm. E.g., in R code: naiveSum <- function(x) { s <- 0.0 for(xi in x) s <- s + xi s } kahanSum
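Since the message above is cut off, here is a minimal sketch of the comparison it describes. `naiveSum` follows the definition quoted in the message; `kahanSumSketch` is an assumed textbook version of Kahan's compensated summation, not necessarily the code from the original post, and the test vector `x` is made up for illustration.

```r
# Naive left-to-right summation, as quoted in the message.
naiveSum <- function(x) {
  s <- 0.0
  for (xi in x) s <- s + xi
  s
}

# Kahan (compensated) summation: a textbook sketch, assumed here,
# not recovered from the truncated message.
kahanSumSketch <- function(x) {
  s <- 0.0     # running sum
  comp <- 0.0  # running estimate of the low-order bits lost so far
  for (xi in x) {
    y <- xi - comp       # apply the correction to the next term
    t <- s + y           # low-order bits of y may be lost here
    comp <- (t - s) - y  # recover what was lost
    s <- t
  }
  s
}

# Made-up test case: one large term followed by many tiny ones.
x <- c(1, rep(1e-16, 1e6))
print(naiveSum(x), digits = 17)
print(kahanSumSketch(x), digits = 17)
```

With this input each tiny term is smaller than half the spacing between doubles near 1, so the naive loop never moves away from 1, while the compensated loop accumulates the small terms and should land close to 1 + 1e-10.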

Re: [Rd] code for sum function

2019-02-19 Thread Paul Gilbert
(I didn't see anyone else answer this, so ...) You can probably find the R code in src/main/ but I'm not sure. You are talking about a very simple calculation, so it seems unlikely that the algorithm is the cause of the difference. I have done much more complicated things and usually get machine

[Rd] mle (stat4) crashing due to singular Hessian in covariance matrix calculation

2019-02-19 Thread Francisco Matorras
Hi, R developers. When running mle inside a loop I found a nasty behavior. From time to time, my model had a degenerate minimum and the loop just crashed. I tracked it down to the "vcov <- if (length(coef)) solve(oout$hessian)" line, where the Hessian is singular. Note that the minimum reached was goo

Re: [Rd] Documentation for sd (stats) + suggestion

2019-02-19 Thread S Ellison
> As far as I can tell, the manual help page for ``sd`` > > ?sd > > does not explicitly mention that the formula for the standard deviation is > the so-called "Bessel-corrected" formula (divide by n-1 rather than n). See Details, where it says "Details: Like 'var' this uses denominator n

[Rd] Documentation for sd (stats) + suggestion

2019-02-19 Thread PatrickT
I cannot file suggestions on bugzilla, so writing here. As far as I can tell, the manual help page for ``sd`` ?sd does not explicitly mention that the formula for the standard deviation is the so-called "Bessel-corrected" formula (divide by n-1 rather than n). I suggest it should be stated near
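The distinction raised in this thread can be illustrated with a short sketch. `popSdSketch` is a hypothetical helper written for this example (it is not the multicon function mentioned in the reply), and the data vector is made up.

```r
# Made-up data: mean 5, sum of squared deviations 32.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# sd() in stats uses the denominator n - 1 (Bessel's correction).
sampleSd <- sd(x)

# Hypothetical helper: population standard deviation, denominator n.
popSdSketch <- function(x) sqrt(mean((x - mean(x))^2))
popSd <- popSdSketch(x)

# The two differ by the factor sqrt((n - 1) / n).
print(c(sample = sampleSd, population = popSd))
print(all.equal(popSd, sampleSd * sqrt((n - 1) / n)))
```

For these data the population value is 2 and the sample value is sqrt(32/7), so the gap is visible even at n = 8.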