Lo, Ken <ken.lo <at> roche.com> writes:
Hi all,
I am running into a snag using quantile function in stats. Basically, I
don't understand why the loop below throws the error that it does.
test.data <- rnorm(1000, 0, 1)
for (i in seq(0.00001, 0.001, 0.00001)){
test <- quantile(test.data, probs=seq(0,1,i));
print(i);
}
It runs fine from 1e-05 to 0.00024, but then throws the error
Error in quantile.default(test.data, probs=seq(0,1,i)):
'probs' outside [0,1]
I tested it some more by using
test <- quantile(test.data, probs=seq(0,1,0.00024));
test <- quantile(test.data, probs=seq(0,1,0.00025));
both ran fine. So, I'm baffled as to what the error actually is.
This is a result of numerical imprecision
(see FAQ 7.31), but I think it's not your fault.
set.seed(1001) ## for reproducibility, although not
## really necessary here
test.data <- rnorm(1000,0,1)
for (i in seq(0.00001, 0.001, 0.00001)){
test <- quantile(test.data, probs=seq(0,1,i))
print(i)
}
fails as reported
let's see what's going on --
zz <- seq(1e-5,1e-3,1e-5)
test <- quantile(test.data, probs=seq(0,1,2.5e-4)) ## OK
test <- quantile(test.data, probs=seq(0,1,zz[25])) ## not OK
zz[25]-2.5e-4
[1] 5.421011e-20
so let's look closer ...
zzz <- range(seq(0,1,2.5e-4))
zzz
zzz-c(0,1)
zzz2 <- range(seq(0,1,zz[25]))
zzz2
zzz2-c(0,1)
[1] 0.000000e+00 2.220446e-16
so the results of the seq() call exceed 1 by a little bit.
I think this may actually count as a bug in R (congratulations!)
the documentation for ?seq says.
## The second form generates 'from, from+by', ..., up to the sequence
## value LESS THAN OR EQUAL TO 'TO'. Specifying 'to - from' and 'by'
## of opposite signs is an error.
[emphasis added]
1 + 2.220446e-16 (which is the maximum of the
sequence that 'seq' generates) IS NOT <= 1 ...
In the meantime, here's a workaround ...
for (i in seq(0.00001, 0.001, 0.00001)){
test <- quantile(test.data, probs=pmin(seq(0,1,i),1))
print(i)
}
If I don't hear otherwise I will submit this as
a bug to r-devel ...