"Liaw, Andy" <[EMAIL PROTECTED]> writes: > The issue is not with boxplot, but with split. boxplot.formula() > calls boxplot(split(split(mf[[response]], mf[-response]), ...), > but look at what split() returns when there are empty levels in > the factor: > > > f <- factor(gl(3, 6), levels=1:5) > > y <- rnorm(f) > > split(y, f) > $"1" > [1] 0.4832124 1.1924811 0.3657797 1.7400198 0.5577356 0.9889520 > > $"2" > [1] -1.1296642 -0.4808355 -0.2789933 0.1220718 0.1287742 -0.7573801 > > $"3" > [1] 1.2320902 0.5090700 -1.5508074 2.1373780 1.1681297 -0.7151561 > > The "culprit" is the following in split.default(): > > f <- factor(f) > > which drops empty levels in f, if there are any. BTW, ?split doesn't > mention what it does in such situation. Perhaps it should? > > If this is to be "fixed", I suppose an additional argument, e.g., > drop=TRUE, can be added, and the corresponding line mentioned > above changed to something like: > > if (drop || !is.factor(f)) f <- factor(f) > > Then this additional argument can be pass on from boxplot.formula() to > split().

Alternatively, I suspect that the intention was as.factor() rather
than factor().  It does require a bit of care to fix it that way,
though.  There could be problems with empty levels popping up in
unexpected places.

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel