If there's been an answer to this, I've missed it.
Here's my take.
Antje wrote:
Hi there,
I was wondering if anybody can explain to me why the boxplot ends up
with different results in the following case:
I have some integer data as a vector and I compare the stats of boxplot
with the same data divided by a factor.
I've attached a CSV file containing both data sets (d1, d2). The factor is
34.16667.
If I run the boxplot function on d1 I get the following stats:
0.848...
0.907...
0.936...
0.965...
1.024...
For d2 I get these stats:
29
31
32
33
36
If I convert the stats of d1 with the factor, I get
29
31
32
33
35
Obviously different for the upper whisker. But why???
Antje
Antje:
Three comments:
1. I think your 'factor' is actually 205/6, not 34.16667.
2. This looks like another case of R FAQ 7.31 ("Why doesn't R think these
numbers are equal?"):
# Let's take your d2 and create d1; I'll call them x and y:
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205
# x is your d2, sorted
# y is your d1, sorted
# The critical values are x[202:203] and y[202:203]:
x[201:204]
#[1] 35 35 36 36
# The boxplot stats are:
sx <- boxplot.stats(x)$stats
sy <- boxplot.stats(y)$stats
# Calculate potential extent of upper whisker:
ux <- sx[4] + (sx[4] - sx[2]) * 1.5  # 36
uy <- sy[4] + (sy[4] - sy[2]) * 1.5  # 1.053658536585366
# Is y[203] <= uy?
y[203] <= uy
#[1] FALSE #!!!
y[202] <= uy
#[1] TRUE
# For x:
x[203] <= ux
#[1] TRUE
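And there's your answer: for y the whisker
goes to y[202], not y[203]. On paper uy and y[203]
are both exactly 216/205, but the computed uy comes
out just below y[203] due to the inevitable
imprecision in machine calculation, so y[203] is
treated as an outlier.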
3. Last comment: I would not use boxplots for data like this.
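
To make comment 2 concrete, here is a minimal sketch that shows how close the
fence and the data point really are. It also shows one possible workaround.
The names fence and stats_rescaled are mine, just for illustration, and the
workaround assumes (as seems to be the case here) that your d1 is nothing
more than integer data times a constant:

```r
# Rebuild the data as above: x is d2 (integers), y is d1 (x scaled by 6/205).
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205

# Upper fence for y, computed the same way as uy above.
sy <- boxplot.stats(y)$stats
fence <- sy[4] + (sy[4] - sy[2]) * 1.5

# Print both values with more digits than the default 7.
print(c(fence = fence, y203 = y[203]), digits = 17)

# Exact comparison fails, comparison with a tolerance succeeds.
fence >= y[203]                    # FALSE
isTRUE(all.equal(fence, y[203]))   # TRUE

# Possible workaround (my suggestion, an assumption about what you want):
# compute the stats on the integer scale, where the arithmetic is exact,
# and rescale afterwards.
stats_rescaled <- boxplot.stats(x)$stats * 6 / 205
stats_rescaled[5] == y[203]        # TRUE: the whisker now reaches y[203]
```

Rescaling after the fact only makes sense because the scaling is a positive
constant, which does not change the order of the data or which points lie
inside the fences in exact arithmetic.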
-Peter Ehlers
------------------------------------------------------------------------
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.