Suppressing the intercept while using contr.sum coding is not working quite as I expect:

> mf <- data.frame(A=C(factor(c("a", "b", "c")), contr.sum))
> mm <- model.matrix(~0+A, data=mf)
> mm
  Aa Ab Ac
1  1  0  0
2  0  1  0
3  0  0  1

What I expect (and want) is

  A1 A2
1  1  0
2  0  1
3 -1 -1

When I fit more complicated models, every term except the first one is coded as expected. That includes A itself when it is interacted with other variables. It seems R has decided the model really needs an intercept and is throwing in an extra level for the first factor to ensure that I get one, even though I said with the "0" that I didn't want it.

BTW, ~A produces an intercept and the two columns expected above. But I don't want the intercept; the model matrix is going into a multinomial model for which the intercept is not identified (since all intercepts produce the same predicted probabilities).

What's going on here?

R 2.15.1

P.S. I think the stripped-down example above illustrates the problem, but here's a more expanded model:

> mf <- expand.grid(C(factor(c("a", "b", "c")), contr.sum),
+                   C(factor(c("f", "t")), contr.sum))
> colnames(mf) <- c("A", "H")
> mf$x <- seq(6)
> mf
  A H x
1 a f 1
2 b f 2
3 c f 3
4 a t 4
5 b t 5
6 c t 6
> myformula <- ~0+A*H*x
> mm <- model.matrix(myformula, data=mf)
> mm
  Aa Ab Ac H1 x A1:H1 A2:H1 A1:x A2:x H1:x A1:H1:x A2:H1:x
1  1  0  0  1 1     1     0    1    0    1       1       0
2  0  1  0  1 2     0     1    0    2    2       0       2
3  0  0  1  1 3    -1    -1   -3   -3    3      -3      -3
4  1  0  0 -1 4    -1     0    4    0   -4      -4       0
5  0  1  0 -1 5     0    -1    0    5   -5       0      -5
6  0  0  1 -1 6     1     1   -6   -6   -6       6       6
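
One possible workaround, sketched here rather than offered as the definitive fix: since ~A already yields the intercept plus the two sum-coded columns, the matrix can be built with the intercept and the "(Intercept)" column dropped afterwards. The names mm_full and mm_noint are only illustrative. Note that subsetting a matrix this way discards the "assign" and "contrasts" attributes that model.matrix attaches, which may matter if later code relies on them.

```r
# Stripped-down example from above: a three-level factor with sum contrasts.
mf <- data.frame(A = C(factor(c("a", "b", "c")), contr.sum))

# Build the model matrix WITH the intercept so that A keeps its contr.sum coding.
mm_full <- model.matrix(~ A, data = mf)

# Drop the intercept column by name; drop = FALSE keeps the result a matrix
# even if only a single column were to remain.
mm_noint <- mm_full[, colnames(mm_full) != "(Intercept)", drop = FALSE]
mm_noint
```

The same approach should carry over to the expanded model by building the matrix from ~A*H*x and removing the "(Intercept)" column in the same way.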