Hello Tyler, I want to bring to your attention the following document: "What happens if you omit the main effect in a regression model with an interaction?" (https://stats.idre.ucla.edu/stata/faq/what-happens-if-you-omit-the-main-effect-in-a-regression-model-with-an-interaction). This gives a useful review of the problem. Your example is Case 2: a continuous and a categorical regressor.
The numerical examples are coded in Stata, and they give the same result as in R. Hence, if this is a bug in R then it is also a bug in Stata. That seems very unlikely. Here is a simulation in R of the above mentioned Case 2 in Stata: df <- expand.grid(socst=c(-1:1),grp=c("1","2","3","4")) print("Full model") print(model.matrix(~(socst+grp)^2 ,data=df)) print("Example 2.1: drop socst") print(model.matrix(~(socst+grp)^2 -socst ,data=df)) print("Example 2.2: drop grp") print(model.matrix(~(socst+grp)^2 -grp ,data=df)) This gives indeed the following regressors: "Full model" (Intercept) socst grp2 grp3 grp4 socst:grp2 socst:grp3 socst:grp4 "Example 2.1: drop socst" (Intercept) grp2 grp3 grp4 socst:grp1 socst:grp2 socst:grp3 socst:grp4 "Example 2.2: drop grp" (Intercept) socst socst:grp2 socst:grp3 socst:grp4 There is a little bit of R documentation about this, based on the concept of marginality, which typically forbids a model having an interaction but not the corresponding main effects. (You might see the references in https://en.wikipedia.org/wiki/Principle_of_marginality ) See "An Introduction to R", by Venables and Smith and the R Core Team. At the bottom of page 52 (PDF: 57) it says: "Although the details are complicated, model formulae in R will normally generate the models that an expert statistician would expect, provided that marginality is preserved. Fitting, for [a contrary] example, a model with an interaction but not the corresponding main effects will in general lead to surprising results ....". The Reference Manual states that the R functions dropterm() and addterm() resp. drop or add only terms such that marginality is preserved. Finally, about your singular matrix t(mm)%*%mm. This is in fact Example 2.1 in Case 2 discussed above. As discussed there, in Stata and in R the drop of the continuous variable has no effect on the degrees of freedom here: it is just a reparameterisation of the full model, protecting you against losing marginality... Hence the model.matrix 'mm' is still square and nonsingular after the drop of X1, unless of course when a row is removed from the matrix 'design' when before creating 'mm'. Arie On Sun, Oct 15, 2017 at 7:05 PM, Tyler <tyle...@gmail.com> wrote: > You could possibly try to explain away the behavior for a missing main > effects term, since without the main effects term we don't have main effect > columns in the model matrix used to compute the interaction columns (At > best this is undocumented behavior--I still think it's a bug, as we know > how we would encode the categorical factors if they were in fact present. > It's either specified in contrasts.arg or using the default set in > options). However, when all the main effects are present, why would the > three-factor interaction column not simply be the product of the main > effect columns? In my example: we know X1, we know X2, and we know X3. Why > does the encoding of X1:X2:X3 depend on whether we specified a two-factor > interaction, AND only changes for specific missing interactions? > > In addition, I can use a two-term example similar to yours to show how this > behavior results in a singular covariance matrix when, given the desired > factor encoding, it should not be singular. > > We start with a full factorial design for a two-level continuous factor and > a three-level categorical factor, and remove a single row. This design > matrix does not leave enough degrees of freedom to determine > goodness-of-fit, but should allow us to obtain parameter estimates. > >> design = expand.grid(X1=c(1,-1),X2=c("A","B","C")) >> design = design[-1,] >> design > X1 X2 > 2 -1 A > 3 1 B > 4 -1 B > 5 1 C > 6 -1 C > > Here, we first calculate the model matrix for the full model, and then > manually remove the X1 column from the model matrix. This gives us the > model matrix one would expect if X1 were removed from the model. We then > successfully calculate the covariance matrix. > >> mm = model.matrix(~(X1+X2)^2,data=design) >> mm > (Intercept) X1 X2B X2C X1:X2B X1:X2C > 2 1 -1 0 0 0 0 > 3 1 1 1 0 1 0 > 4 1 -1 1 0 -1 0 > 5 1 1 0 1 0 1 > 6 1 -1 0 1 0 -1 > >> mm = mm[,-2] >> solve(t(mm) %*% mm) > (Intercept) X2B X2C X1:X2B X1:X2C > (Intercept) 1 -1.0 -1.0 0.0 0.0 > X2B -1 1.5 1.0 0.0 0.0 > X2C -1 1.0 1.5 0.0 0.0 > X1:X2B 0 0.0 0.0 0.5 0.0 > X1:X2C 0 0.0 0.0 0.0 0.5 > > Here, we see the actual behavior for model.matrix. The undesired re-coding > of the model matrix interaction term makes the information matrix singular. > >> mm = model.matrix(~(X1+X2)^2-X1,data=design) >> mm > (Intercept) X2B X2C X1:X2A X1:X2B X1:X2C > 2 1 0 0 -1 0 0 > 3 1 1 0 0 1 0 > 4 1 1 0 0 -1 0 > 5 1 0 1 0 0 1 > 6 1 0 1 0 0 -1 > >> solve(t(mm) %*% mm) > Error in solve.default(t(mm) %*% mm) : system is computationally singular: > reciprocal condition number = 5.55112e-18 > > I still believe this is a bug. > > Best regards, > Tyler Morgan-Wall > > On Sun, Oct 15, 2017 at 1:49 AM, Arie ten Cate <arietenc...@gmail.com> > wrote: > >> I think it is not a bug. It is a general property of interactions. >> This property is best observed if all variables are factors >> (qualitative). >> >> For example, you have three variables (factors). You ask for as many >> interactions as possible, except an interaction term between two >> particular variables. When this interaction is not a constant, it is >> different for different values of the remaining variable. More >> precisely: for all values of that variable. In other words: you have a >> three-way interaction, with all values of that variable. >> >> An even smaller example is the following script with only two >> variables, each being a factor: >> >> df <- expand.grid(X1=c("p","q"), X2=c("A","B","C")) >> print(model.matrix(~(X1+X2)^2 ,data=df)) >> print(model.matrix(~(X1+X2)^2 -X1,data=df)) >> print(model.matrix(~(X1+X2)^2 -X2,data=df)) >> >> The result is: >> >> (Intercept) X1q X2B X2C X1q:X2B X1q:X2C >> 1 1 0 0 0 0 0 >> 2 1 1 0 0 0 0 >> 3 1 0 1 0 0 0 >> 4 1 1 1 0 1 0 >> 5 1 0 0 1 0 0 >> 6 1 1 0 1 0 1 >> >> (Intercept) X2B X2C X1q:X2A X1q:X2B X1q:X2C >> 1 1 0 0 0 0 0 >> 2 1 0 0 1 0 0 >> 3 1 1 0 0 0 0 >> 4 1 1 0 0 1 0 >> 5 1 0 1 0 0 0 >> 6 1 0 1 0 0 1 >> >> (Intercept) X1q X1p:X2B X1q:X2B X1p:X2C X1q:X2C >> 1 1 0 0 0 0 0 >> 2 1 1 0 0 0 0 >> 3 1 0 1 0 0 0 >> 4 1 1 0 1 0 0 >> 5 1 0 0 0 1 0 >> 6 1 1 0 0 0 1 >> >> Thus, in the second result, we have no main effect of X1. Instead, the >> effect of X1 depends on the value of X2; either A or B or C. In fact, >> this is a two-way interaction, including all three values of X2. In >> the third result, we have no main effect of X2, The effect of X2 >> depends on the value of X1; either p or q. >> >> A complicating element with your example seems to be that your X1 and >> X2 are not factors. >> >> Arie >> >> On Thu, Oct 12, 2017 at 7:12 PM, Tyler <tyle...@gmail.com> wrote: >> > Hi, >> > >> > I recently ran into an inconsistency in the way model.matrix.default >> > handles factor encoding for higher level interactions with categorical >> > variables when the full hierarchy of effects is not present. Depending on >> > which lower level interactions are specified, the factor encoding changes >> > for a higher level interaction. Consider the following minimal >> reproducible >> > example: >> > >> > -------------- >> > >> >> runmatrix = expand.grid(X1=c(1,-1),X2=c(1,-1),X3=c("A","B","C"))> >> model.matrix(~(X1+X2+X3)^3,data=runmatrix) (Intercept) X1 X2 X3B X3C >> X1:X2 X1:X3B X1:X3C X2:X3B X2:X3C X1:X2:X3B X1:X2:X3C >> > 1 1 1 1 0 0 1 0 0 0 0 >> > 0 0 >> > 2 1 -1 1 0 0 -1 0 0 0 0 >> > 0 0 >> > 3 1 1 -1 0 0 -1 0 0 0 0 >> > 0 0 >> > 4 1 -1 -1 0 0 1 0 0 0 0 >> > 0 0 >> > 5 1 1 1 1 0 1 1 0 1 0 >> > 1 0 >> > 6 1 -1 1 1 0 -1 -1 0 1 0 >> > -1 0 >> > 7 1 1 -1 1 0 -1 1 0 -1 0 >> > -1 0 >> > 8 1 -1 -1 1 0 1 -1 0 -1 0 >> > 1 0 >> > 9 1 1 1 0 1 1 0 1 0 1 >> > 0 1 >> > 10 1 -1 1 0 1 -1 0 -1 0 1 >> > 0 -1 >> > 11 1 1 -1 0 1 -1 0 1 0 -1 >> > 0 -1 >> > 12 1 -1 -1 0 1 1 0 -1 0 -1 >> > 0 1 >> > attr(,"assign") >> > [1] 0 1 2 3 3 4 5 5 6 6 7 7 >> > attr(,"contrasts") >> > attr(,"contrasts")$X3 >> > [1] "contr.treatment" >> > >> > -------------- >> > >> > Specifying the full hierarchy gives us what we expect: the interaction >> > columns are simply calculated from the product of the main effect >> columns. >> > The interaction term X1:X2:X3 gives us two columns in the model matrix, >> > X1:X2:X3B and X1:X2:X3C, matching the products of the main effects. >> > >> > If we remove either the X2:X3 interaction or the X1:X3 interaction, we >> get >> > what we would expect for the X1:X2:X3 interaction, but when we remove the >> > X1:X2 interaction the encoding for X1:X2:X3 changes completely: >> > >> > -------------- >> > >> >> model.matrix(~(X1+X2+X3)^3-X1:X3,data=runmatrix) (Intercept) X1 X2 >> X3B X3C X1:X2 X2:X3B X2:X3C X1:X2:X3B X1:X2:X3C >> > 1 1 1 1 0 0 1 0 0 0 0 >> > 2 1 -1 1 0 0 -1 0 0 0 0 >> > 3 1 1 -1 0 0 -1 0 0 0 0 >> > 4 1 -1 -1 0 0 1 0 0 0 0 >> > 5 1 1 1 1 0 1 1 0 1 0 >> > 6 1 -1 1 1 0 -1 1 0 -1 0 >> > 7 1 1 -1 1 0 -1 -1 0 -1 0 >> > 8 1 -1 -1 1 0 1 -1 0 1 0 >> > 9 1 1 1 0 1 1 0 1 0 1 >> > 10 1 -1 1 0 1 -1 0 1 0 -1 >> > 11 1 1 -1 0 1 -1 0 -1 0 -1 >> > 12 1 -1 -1 0 1 1 0 -1 0 1 >> > attr(,"assign") >> > [1] 0 1 2 3 3 4 5 5 6 6 >> > attr(,"contrasts") >> > attr(,"contrasts")$X3 >> > [1] "contr.treatment" >> > >> > >> > >> >> model.matrix(~(X1+X2+X3)^3-X2:X3,data=runmatrix) (Intercept) X1 X2 >> X3B X3C X1:X2 X1:X3B X1:X3C X1:X2:X3B X1:X2:X3C >> > 1 1 1 1 0 0 1 0 0 0 0 >> > 2 1 -1 1 0 0 -1 0 0 0 0 >> > 3 1 1 -1 0 0 -1 0 0 0 0 >> > 4 1 -1 -1 0 0 1 0 0 0 0 >> > 5 1 1 1 1 0 1 1 0 1 0 >> > 6 1 -1 1 1 0 -1 -1 0 -1 0 >> > 7 1 1 -1 1 0 -1 1 0 -1 0 >> > 8 1 -1 -1 1 0 1 -1 0 1 0 >> > 9 1 1 1 0 1 1 0 1 0 1 >> > 10 1 -1 1 0 1 -1 0 -1 0 -1 >> > 11 1 1 -1 0 1 -1 0 1 0 -1 >> > 12 1 -1 -1 0 1 1 0 -1 0 1 >> > attr(,"assign") >> > [1] 0 1 2 3 3 4 5 5 6 6 >> > attr(,"contrasts") >> > attr(,"contrasts")$X3 >> > [1] "contr.treatment" >> > >> > >> >> model.matrix(~(X1+X2+X3)^3-X1:X2,data=runmatrix) (Intercept) X1 X2 >> X3B X3C X1:X3B X1:X3C X2:X3B X2:X3C X1:X2:X3A X1:X2:X3B X1:X2:X3C >> > 1 1 1 1 0 0 0 0 0 0 1 >> > 0 0 >> > 2 1 -1 1 0 0 0 0 0 0 -1 >> > 0 0 >> > 3 1 1 -1 0 0 0 0 0 0 -1 >> > 0 0 >> > 4 1 -1 -1 0 0 0 0 0 0 1 >> > 0 0 >> > 5 1 1 1 1 0 1 0 1 0 0 >> > 1 0 >> > 6 1 -1 1 1 0 -1 0 1 0 0 >> > -1 0 >> > 7 1 1 -1 1 0 1 0 -1 0 0 >> > -1 0 >> > 8 1 -1 -1 1 0 -1 0 -1 0 0 >> > 1 0 >> > 9 1 1 1 0 1 0 1 0 1 0 >> > 0 1 >> > 10 1 -1 1 0 1 0 -1 0 1 0 >> > 0 -1 >> > 11 1 1 -1 0 1 0 1 0 -1 0 >> > 0 -1 >> > 12 1 -1 -1 0 1 0 -1 0 -1 0 >> > 0 1 >> > attr(,"assign") >> > [1] 0 1 2 3 3 4 4 5 5 6 6 6 >> > attr(,"contrasts") >> > attr(,"contrasts")$X3 >> > [1] "contr.treatment" >> > >> > -------------- >> > >> > Here, we now see the encoding for the interaction X1:X2:X3 is now the >> > interaction of X1 and X2 with a new encoding for X3 where each factor >> level >> > is represented by its own column. I would expect, given the two column >> > dummy variable encoding for X3, that the X1:X2:X3 column would also be >> two >> > columns regardless of what two-factor interactions we also specified, but >> > in this case it switches to three. If other two factor interactions are >> > missing in addition to X1:X2, this issue still occurs. This also happens >> > regardless of the contrast specified in contrasts.arg for X3. I don't see >> > any reasoning for this behavior given in the documentation, so I suspect >> it >> > is a bug. >> > >> > Best regards, >> > Tyler Morgan-Wall >> > >> > [[alternative HTML version deleted]] >> > >> > ______________________________________________ >> > R-devel@r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-devel >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel