On 25.02.2012 19:16, Paul Johnson wrote:
Hello, Everybody:

This may not be a "bug", but for me it is an unexpected outcome. A factor variable's levels do not retain their ordering after the levels function is used. I supply an example in which a factor with values "BC" "AD" (in that order) is unintentionally re-alphabetized by the levels function.

To me, this is very bad behavior. Would you agree?

    # Paul Johnson 2012-02-05
    x<- c("AD","BC","AD","BC","AD","BC")
    xf<- factor(x, levels=c("BC", "AD"), labels=c("Before Christ","After Christ"))
    y<- rnorm(6)
    m1<- lm (y ~ xf )
    plot(y ~ xf)
    abline (m1)
    ## Just a little problem the line does not "go through" the box
    ## plot in the right spot because contrasts(xf) is 0,1 but
    ## the plot uses xf in 1,2.
    xlevels<- levels(xf)
    newdf<- data.frame(xf=xlevels)
    ypred<- predict(m1, newdata=newdf)
    ##Watch now: the plot comes out "reversed", AC before BC
    plot(ypred ~ newdf$xf)
    ## Ah. Now I see:
    levels(newdf$xf)
    ## Why doesnt newdf$xf respect the ordering of the levels?
Because xlevels is a character vector: levels(xf) returns only the labels, not a factor. Calling data.frame(xf=xlevels) then coerced it to a new factor without being told anything about the ordering, hence the levels got sorted lexicographically.
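
A minimal sketch of one way to keep the ordering, assuming the goal is a prediction data frame whose factor has the same level order as xf. The names sorted and newdf are only illustrative. The idea is to build the factor yourself and pass levels= explicitly, rather than relying on the coercion inside data.frame():

```r
# Sketch only: rebuild the factor with an explicit level order.
x  <- c("AD", "BC", "AD", "BC", "AD", "BC")
xf <- factor(x, levels = c("BC", "AD"),
             labels = c("Before Christ", "After Christ"))

xlevels <- levels(xf)   # a plain character vector, no ordering attached

# Without levels=, factor() sorts the unique values lexicographically.
sorted <- factor(xlevels)
levels(sorted)          # "After Christ" "Before Christ"

# With levels=, the order found in xf is kept.
newdf <- data.frame(xf = factor(xlevels, levels = xlevels))
levels(newdf$xf)        # "Before Christ" "After Christ"
```

Note that since R 4.0.0 data.frame() no longer converts character vectors to factors by default, so there data.frame(xf=xlevels) gives a character column; building the factor explicitly as above behaves the same way in either case.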
Uwe Ligges
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel