Hello all,

I think I probably did something stupid, and R's part was to allow me to do it. 
 My goal was to control the order of factor levels appearing horizontally on a 
boxplot.  Enter search engines and perhaps some creative stupidity on my part, 
and I came up with the following:

        v=read.table("factor-order.txt",header=TRUE);
        levels(v$doseGroup) = c("L", "M", "H");
        boxplot(v$dose~v$doseGroup);


A good way to see the trap is to evaluate:

        v=read.table("factor-order.txt",header=TRUE);
        par(mfrow=c(2,1));
        boxplot(v$dose~v$doseGroup);
        levels(v$doseGroup) = c("L", "M", "H");
        boxplot(v$dose~v$doseGroup);
        par(mfrow=c(1,1));

The above creates two plots, one correct with the factor levels in an 
inconvenient order, and one that is WRONG.  In the latter, the labels appear in the desired 
order, but the data does not "move with them."  I did not discover the problem 
until I repeated the same type of plot with something that had a known 
relationship with the levels, and the result was clearly not correct.
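
The same effect can be reproduced without the data file. The sketch below uses a made-up data frame (`d` and its values are invented for illustration, not taken from factor-order.txt), with doses chosen so the true group is obvious. It suggests that assigning to levels() relabels the existing levels in place rather than reordering them, whereas factor() with a levels argument changes only the order:

```r
# Hypothetical data: each dose value makes its true group obvious.
d <- data.frame(dose = c(1, 2, 10, 11, 100, 101),
                doseGroup = factor(c("L", "L", "M", "M", "H", "H")))

levels(d$doseGroup)                    # default order is sorted: "H" "L" "M"

# Assigning to levels() renames by position: H -> L, L -> M, M -> H.
relabeled <- d$doseGroup
levels(relabeled) <- c("L", "M", "H")

# Cross-tabulate old against new labels to see the mismatch.
table(original = d$doseGroup, relabeled = relabeled)

# factor() with an explicit levels argument reorders without relabeling.
reordered <- factor(d$doseGroup, levels = c("L", "M", "H"))
stopifnot(all(as.character(reordered) == as.character(d$doseGroup)))
```

In the table, every row that was originally "H" ends up labeled "L", which matches the symptom of labels in the desired order with data that did not move with them.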

I *think* the problem is assigning to the return value of levels().  How did I 
think to do that?  I'm not really sure, but please look at

  https://stat.ethz.ch/pipermail/r-help/2008-August/171884.html


Perhaps it does not say to do exactly what I did, but it sure was easy to 
follow to the mistake, it appeared to do what I wanted, and the consequences of 
the mistake are ugly.  Perhaps levels() should return something that is 
immutable??  If I am looking at this correctly, levels() is an accident waiting 
to happen.

What should I have done?  It seems:

        # read data and order factor levels
        v=read.table("factor-order.txt",header=TRUE);
        group = factor(v$doseGroup,levels = c("L", "M", "H") );
        boxplot(v$dose~group);


One disappointment is that the above factor() call apparently needs to be 
repeated for any subset of v - I'm still trying to get my mind around that one.
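
One possible way around the repetition, sketched with made-up data (`d` and `low_mid` are hypothetical names, not from my file): store the reordered factor back in the data frame instead of in a separate variable, so that subsets carry the level order with them. Subsetting keeps unused levels until droplevels() is called.

```r
# Hypothetical data standing in for the contents of the real file.
d <- data.frame(dose = c(1, 2, 10, 11, 100, 101),
                doseGroup = c("L", "L", "M", "M", "H", "H"))

# Reorder once, in place, so the order travels with the data frame.
d$doseGroup <- factor(d$doseGroup, levels = c("L", "M", "H"))

# A subset keeps the level order; "H" is still a level, but now empty.
low_mid <- d[d$dose < 50, ]
levels(low_mid$doseGroup)              # "L" "M" "H"

# Drop the empty level if it should not appear on a plot.
low_mid$doseGroup <- droplevels(low_mid$doseGroup)
levels(low_mid$doseGroup)              # "L" "M"

# The formula interface with data= avoids repeating the d$ prefix.
boxplot(dose ~ doseGroup, data = d)
```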

Can anyone confirm this?  It strikes me as a trap that should be addressed so 
that an error results rather than a garbage graph.

Bill


---
Wilhelm K. Schwab, Ph.D.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
