On 05/06/2019 4:34 a.m., le Gleut, Ronan wrote:
Dear R-help mailing list,
First of all, many many thanks for your great work on the R project!
I have a very small issue regarding the sample function. Depending on whether
we specify values for the prob argument, we don't get the same result for
random sampling with replacement and equal probabilities. See the
attached R code for a minimal example with R version 3.6.0.
With a previous R version (3.5.x), the result was just a permutation
between the possible realizations. They are now totally different with the
latest R version.
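
The kind of comparison described above can be sketched as follows. This is an illustrative reconstruction, not the attachment from the original message; the seed, the vector x and the variable names are assumptions chosen for the example.

```r
# Illustrative only; not the code attached to the original post.
x <- 1:5

set.seed(123)
without_prob <- sample(x, size = 10, replace = TRUE)

set.seed(123)
with_equal_prob <- sample(x, size = 10, replace = TRUE,
                          prob = rep(1, length(x)))

# Both vectors are valid draws from the same uniform distribution, but
# R does not promise that the two calls return the same sequence.
print(without_prob)
print(with_equal_prob)
print(identical(without_prob, with_equal_prob))
```

Resetting the seed before each call means any difference between the two vectors comes from how sample() handles the prob argument rather than from the state of the random number generator.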
I understand that, depending on whether the prob argument is specified, two
different internal functions are used: .Internal(sample()) or .Internal(sample2()).
Indeed, the algorithm used to draw a sample may not be the same when equal
probabilities are assumed by default (no prob argument) and when the user
supplies the probabilities explicitly (even if they are equal).
I found this post on Stack Overflow which explains the reasons for this
difference (answer by Matthew Lundberg):
https://stackoverflow.com/questions/23316729/r-sample-probabilities-default-is-equal-weight-why-does-specifying-equal-weigh
I was wondering whether the solution proposed by PatrickT could solve this
issue? He proposed adding something like if (all.equal(prob, prob,
tolerance = .Machine$double.eps)) prob = NULL inside the sample.int routine,
so that prob = rep(1, length(x)) reproduces the result of prob = NULL.
R has never promised that these will be the same, so I doubt if R will
change the sample() function. However, it's very easy for you to adopt
something like PatrickT's solution for yourself. Just use this function:
PatrickTsample <- function(x, size, replace = FALSE, prob = NULL) {
  # Treat a vector of exactly equal probabilities the same as prob = NULL
  if (!is.null(prob) && max(prob) == min(prob))
    prob <- NULL
  sample(x = x, size = size, replace = replace, prob = prob)
}
This tests for exact equality. Depending on your context, you might want
to compare the probabilities with a looser tolerance.
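
One way a looser tolerance could look is sketched below. The function name sample_equal_as_null and the tol argument are hypothetical and not part of R; the relative-tolerance test is just one possible choice, and the default of sqrt(.Machine$double.eps) is an assumption, not a recommendation.

```r
# Hypothetical helper: the name and the `tol` argument are illustrative only.
sample_equal_as_null <- function(x, size, replace = FALSE, prob = NULL,
                                 tol = sqrt(.Machine$double.eps)) {
  # Drop prob when all weights are positive and agree within a
  # relative tolerance, so the call behaves like prob = NULL.
  if (!is.null(prob) && length(prob) > 0 && !anyNA(prob) &&
      min(prob) > 0 &&
      diff(range(prob)) <= tol * max(prob)) {
    prob <- NULL
  }
  sample(x = x, size = size, replace = replace, prob = prob)
}

# With the same seed, equal weights now give the same draw as no weights.
set.seed(42)
a <- sample(1:5, 10, replace = TRUE)
set.seed(42)
b <- sample_equal_as_null(1:5, 10, replace = TRUE, prob = rep(0.2, 5))
print(identical(a, b))
```

Unequal weights are passed through to sample() unchanged, so the wrapper only alters behaviour in the equal-probability case.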
Duncan Murdoch