If there's been an answer to this, I've missed it.
Here's my take.

Antje wrote:
Hi there,

I was wondering if anybody can explain to me why the boxplot ends up with different results in the following case:

I have some integer data as a vector and I compare the stats of boxplot with the same data divided by a factor.

I've attached a csv file with both data present (d1, d2). The factor is 34.16667.

If I run the boxplot function on d1 I get the following stats:

0.848...
0.907...
0.936...
0.965...
1.024...

For d2 I get these stats:

29
31
32
33
36


If I convert the stats of d1 with the factor, I get

29
31
32
33
35

Obviously different for the upper whisker. But why???

Antje

Antje:

Three comments:
1. I think your 'factor' is actually 205/6, not 34.16667.

2. This looks like another case of R FAQ 7.31
("Why doesn't R think these numbers are equal?"):

# Let's take your d2 and create d1; I'll call them x and y:
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205

# x is your d2, sorted
# y is your d1, sorted

# The critical values are x[202:203] and y[202:203];
x[201:204]
#[1] 35 35 36 36

# The boxplot stats are:
sx <- boxplot.stats(x)$stats
sy <- boxplot.stats(y)$stats

# Calculate potential extent of upper whisker:
ux <- sx[4] + (sx[4] - sx[2]) * 1.5  #36
uy <- sy[4] + (sy[4] - sy[2]) * 1.5  #1.053658536585366

# Is y[203] <= uy?
y[203] <= uy
#[1] FALSE  #!!!

y[202] <= uy
#[1] TRUE

# For x:
x[203] <= ux
#[1] TRUE

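And there's your answer: mathematically, y[203] and uy
are both 36 * 6/205, but in floating point y[203] comes
out marginally larger than uy. So y[203] is treated as
an outlier and the whisker for y goes to y[202], not
y[203], due to the inevitable imprecision in machine
calculation.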

3. Last comment: I would not use boxplots for data like this.
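
To put a number on the imprecision in comment 2, here is a rough
sketch that reuses the x, y and uy from above. It prints the gap
between y[203] and the fence, compares the two with a tolerance via
all.equal(), and shows one possible workaround (my suggestion, not
something boxplot() does for you): compute the stats on the unscaled
counts and rescale afterwards.

```r
# Rebuild the data from the post: x is d2 (counts), y is d1 (scaled).
x <- rep(c(29:38, 40), c(7, 24, 50, 71, 24, 12, 14, 7, 13, 5, 1))
y <- x * 6 / 205

# Upper fence for y, computed the same way as in the post.
sy <- boxplot.stats(y)$stats
uy <- sy[4] + (sy[4] - sy[2]) * 1.5

# The gap is zero on paper; any nonzero value here is rounding error.
print(y[203] - uy, digits = 17)
print(.Machine$double.eps)   # machine epsilon, for scale

# Exact comparison (reported as FALSE in the post) ...
print(y[203] <= uy)
# ... versus comparison within all.equal()'s default tolerance.
print(isTRUE(all.equal(y[203], uy)))

# Workaround sketch: take the stats from the counts, then rescale.
sx <- boxplot.stats(x)$stats
print(sx * 6 / 205)
```

With the last approach the whisker decision is made on the
integer-valued data, so it does not depend on how the scaled values
happen to round.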

 -Peter Ehlers




------------------------------------------------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
