Hello,

I have been using the QQ-plot functions in R for a while now, and I noticed
what I believe is a major inconsistency in the way R estimates and plots
the quantiles of a data set. To see this, let x be a vector of data (or
just simulate some), say 15 points. Then type

qqnorm(x)

By default, R places the points at probabilities p[k] = (k-0.5)/n (for
n > 10), which corresponds to qtype = 5; that is fine for now. Next, if we
want to plot a line through the first and third quartiles, we would type

qqline(x, datax = FALSE, distribution = qnorm, probs = c(0.25,0.75))

which by default uses qtype = 7 when estimating the quantiles for the
sample!!!

To convince yourself that it uses qtype = 7, repeat the previous command
with qtype = 7 added explicitly, as follows

qqline(x, datax = FALSE, distribution = qnorm, probs = c(0.25,0.75),
       qtype = 7)

which is the same line that was plotted before.
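
The same comparison can be checked numerically rather than by eye. The
following is a minimal sketch, assuming simulated data from rnorm() with an
arbitrary seed; the variable names are only illustrative. It recovers the
probabilities behind the points that qqnorm() draws and compares the sample
quartiles under type 7 and type 5.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(15)                   # simulated data (an assumption)
n <- length(x)

# Probabilities behind the points drawn by qqnorm()
qq <- qqnorm(x, plot.it = FALSE)
pos_qqnorm <- pnorm(sort(qq$x))
pos_type5  <- (seq_len(n) - 0.5) / n
same_positions <- isTRUE(all.equal(pos_qqnorm, pos_type5))

# Sample quartiles that anchor the line, under type 7 and type 5
q7 <- quantile(x, c(0.25, 0.75), names = FALSE, type = 7)
q5 <- quantile(x, c(0.25, 0.75), names = FALSE, type = 5)

print(same_positions)
print(rbind(type7 = q7, type5 = q5))
```

If the two rows of the printed matrix differ, the line and the points are
based on different quantile definitions.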

The fact that it estimates the sample quantiles using two different methods
and plots them on the same graph doesn't make any sense to me. For example,
if I wanted to draw a line through the first and last quantiles in my
sample using a particular qtype, say qtype = 6, I would calculate the
corresponding probabilities using qtype = 6, which is p[k] = k/(n+1), and
enter the following command

qqline(x, datax = FALSE, distribution = qnorm, probs = c(1/16,15/16),
       qtype = 6)

which doesn't pass through the first and last points, since the points
were originally plotted using qtype = 5. To see that R plotted them using
qtype = 5, write the above command using qtype = 5. This is
p[k] = (k-0.5)/n, and therefore the probabilities for the first and last
quantiles are 0.5/15 and 14.5/15 respectively. The command is the following

qqline(x, datax = FALSE, distribution = qnorm, probs = c(0.5/15,14.5/15),
qtype = 5)

and this line should run directly through the first and last sample
quantiles as originally desired.
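
A rough way to confirm this without looking at the plot is to compute the
line by hand and measure how far it is from the first and last plotted
points. This is only a sketch under my own assumptions: the data are
simulated, and line_through() is a hypothetical helper that mimics what I
understand qqline() to do (a line through the sample quantiles at probs,
plotted against qnorm(probs)).

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(15)                   # simulated data (an assumption)
n <- length(x)

# Hypothetical helper: intercept and slope of the line through the
# sample quantiles at 'probs', using quantile type 'qtype'
line_through <- function(x, probs, qtype) {
  y  <- quantile(x, probs, names = FALSE, type = qtype)
  tq <- qnorm(probs)
  slope <- diff(y) / diff(tq)
  c(intercept = y[1] - slope * tq[1], slope = slope)
}

# Coordinates of the first and last points as drawn by qqnorm()
qq   <- qqnorm(x, plot.it = FALSE)
ends <- order(qq$x)[c(1, n)]
px   <- qq$x[ends]
py   <- qq$y[ends]

# Vertical distance from those two points to each candidate line
l5 <- line_through(x, c(0.5 / n, (n - 0.5) / n), 5)
l6 <- line_through(x, c(1 / (n + 1), n / (n + 1)), 6)
resid5 <- py - (l5[["intercept"]] + l5[["slope"]] * px)
resid6 <- py - (l6[["intercept"]] + l6[["slope"]] * px)

print(resid5)                    # distances for the qtype = 5 line
print(resid6)                    # distances for the qtype = 6 line
```

Distances that are zero up to rounding mean the line runs through the two
end points; anything clearly nonzero means it misses them.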

Why doesn't R re-plot the data using the specified qtype when drawing the
qqline?

or

Why would you want to view a plot that used different methods for
estimating quantiles?

I completely understand why we would want to use different quantile
estimators, especially if we want an unbiased (or approximately unbiased)
estimator in some sense (qtype 8 and 9). However, using two different
estimators on the same plot could be dangerous.
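
For what it is worth, a plot that uses one definition throughout can be
built by hand. The sketch below assumes simulated data and uses ppoints()
with a = 0 to get the qtype = 6 positions k/(n+1); it is one possible
workaround, not a claim about how qqnorm() is meant to be used.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(15)                   # simulated data (an assumption)
n <- length(x)

# Plotting positions matching qtype = 6: p[k] = k/(n+1)
p6   <- ppoints(n, a = 0)
theo <- qnorm(p6)

# Points and line now share the same quantile definition
plot(theo, sort(x),
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
qqline(x, datax = FALSE, distribution = qnorm,
       probs = range(p6), qtype = 6)

# Sample quantiles at the two end probabilities under type 6
end_q <- quantile(x, range(p6), names = FALSE, type = 6)
print(end_q)
print(range(x))
```

Here the line is drawn through the smallest and largest observations, which
are also the first and last plotted points.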


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
