Re: [R] Interaction term in multiple regression

David Winsemius Mon, 13 Jul 2009 21:04:09 -0700


On Jul 13, 2009, at 9:31 PM, kfort...@email.unc.edu wrote:

Hello All, Thank you for taking my question. I am looking for information on how R handles interaction terms in a multiple regression using the “lm” command. I originally noticed something was unusual when my R output did not match the output from JMP for an identical test run previously. Both programs give identical results for the main test and if the models do not contain the interaction term then the output is identical. However the results of the partial F tests differ dramatically when the interaction term is included.

The interpretation of the coefficients and partial F-tests for individual terms of a model involving interactions is at the very least difficult, and I have been advised by my statistical betters simply not to attempt it. Instead, compare the differences between overall model statistics and, while paying careful attention to the coding of terms, create predictions for combinations of variables.
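
A minimal sketch of that advice, run on simulated data rather than the poster's boon_tot (the data frame `d`, its column names, and the coefficients used to generate it are assumptions made up for illustration): fit the model with and without the interaction, compare the two with a single overall F test, then predict over a grid of covariate combinations.

```r
# Simulated stand-in for the 2007 subset; values are illustrative only.
set.seed(1)
d <- data.frame(Kd = runif(12, 0.2, 0.7), area = runif(12, 1, 12))
d$TD <- 5.5 + 0.3 * d$Kd + 0.8 * d$area - 1.8 * d$Kd * d$area +
  rnorm(12, sd = 0.5)

fit_add <- lm(TD ~ Kd + area, data = d)   # main effects only
fit_int <- lm(TD ~ Kd * area, data = d)   # adds the Kd:area interaction

# Overall comparison of the nested models (one F test for the added term)
cmp <- anova(fit_add, fit_int)
print(cmp)

# Predictions for chosen combinations of the two covariates
grid <- expand.grid(Kd = c(0.3, 0.5), area = c(2, 6, 10))
grid$pred <- predict(fit_int, newdata = grid)
print(grid)
```

The predictions depend only on the fitted surface, not on how the interaction column was coded, which is why they are a safer basis for comparing programs than the individual main-effect rows.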

Here are the results from R of the test with the interaction:
summary(lm(TD[Year==2007]~Kd[Year==2007]*area[Year==2007],data=boon_tot))
Call:
lm(formula = TD[Year == 2007] ~ Kd[Year == 2007] * area[Year ==2007], data = boon_tot)
Residuals:
     Min       1Q   Median       3Q      Max
-0.42696 -0.25648 -0.11960  0.03151  1.27957

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)
(Intercept)                           5.5714     1.7995   3.096   0.0148 *
Kd[Year == 2007]                      0.2867     4.0696   0.070   0.9456
area[Year == 2007]                    0.8192     0.2874   2.851   0.0215 *
Kd[Year == 2007]:area[Year == 2007]  -1.8074     0.6320  -2.860   0.0211 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5238 on 8 degrees of freedom
Multiple R-squared: 0.6826, Adjusted R-squared: 0.5636
F-statistic: 5.736 on 3 and 8 DF, p-value: 0.02155

Here are the results from JMP for the same model:

Source          df      SS              MS              F               p
Model           3       4.72157318      1.57385773      5.73591141  0.02155127
Error           8       2.19509349      0.27438669
C. Total        11      6.91666667

Source                           Est.          Std Error     t value       p > t
Intercept                        10.4933505    1.24016642    8.46124381    0.00002911
Kd                               -11.213166    2.95096414    -3.7998315    0.00523792
area (ha)                        0.04560254    0.03069489    1.48567197    0.17567049
(Kd-0.428)*(area (ha)-6.3625)    -1.8074455    0.63195669    -2.860078     0.02114887
    ^^^^^             ^^^^^^

This suggests that JMP has automatically centered the variables prior to forming the interaction term. What's not so clear is whether the other terms may have been centered as well.
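
One way to probe that question uses only the numbers already posted. The sketch below assumes (this is a guess at JMP's parameterization, not something taken from its manual) that JMP fitted TD ~ Kd + area + (Kd - 0.428)*(area - 6.3625) with the main-effect columns left uncentered. Expanding the product and collecting terms then gives the coefficients on the raw-product scale that R uses; the variable names are made up for the illustration.

```r
# JMP estimates as posted above
jmp <- c(intercept = 10.4933505, Kd = -11.213166,
         area = 0.04560254, inter = -1.8074455)
a <- 0.428    # centering constant JMP shows for Kd
b <- 6.3625   # centering constant JMP shows for area (ha)

# (Kd - a) * (area - b) = Kd*area - b*Kd - a*area + a*b, so collect terms:
uncentered <- c(
  intercept = jmp[["intercept"]] + jmp[["inter"]] * a * b,
  Kd        = jmp[["Kd"]]        - jmp[["inter"]] * b,
  area      = jmp[["area"]]      - jmp[["inter"]] * a,
  inter     = jmp[["inter"]]
)
print(round(uncentered, 4))
```

If the assumption holds, the converted values should agree with the R estimates (5.5714, 0.2867, 0.8192, -1.8074) up to the rounding of the displayed centering constants, which would indicate that only the product term was centered.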

As you can see, although the results of the main test and the interaction term are identical, the estimate and std error of the other factors are very different.

The real question would be whether they give identical predictions and what the differences between model statistics show when the simpler models are compared with the more complex. You have not yet looked at this question in detail, although the information is available in the outputs alluded to below.

Additionally if I remove the interaction term from the model, the two programs then give identical results.


Then JMP must be giving acceptable computations, I suppose.


Any thoughts as to why they differ would be appreciated.

Different codings of the variables in the interaction models. Perhaps you could create a variable that resembles the JMP interaction term and see if that is confirmed, or you could review the respective manuals regarding interactions.
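
A rough sketch of such a check, again on simulated data (the data frame `d`, the column `KdxArea_c`, and the choice to center at the sample means are all assumptions; with the real data one would use the constants JMP prints, 0.428 and 6.3625):

```r
# Simulated stand-in data; values are illustrative only.
set.seed(2)
d <- data.frame(Kd = runif(12, 0.2, 0.7), area = runif(12, 1, 12))
d$TD <- 5.5 + 0.3 * d$Kd + 0.8 * d$area - 1.8 * d$Kd * d$area +
  rnorm(12, sd = 0.5)

# R's default coding: the interaction column is the raw product Kd * area
fit_raw <- lm(TD ~ Kd * area, data = d)

# JMP-like coding: main effects left raw, product formed from centered copies
d$KdxArea_c <- (d$Kd - mean(d$Kd)) * (d$area - mean(d$area))
fit_ctr <- lm(TD ~ Kd + area + KdxArea_c, data = d)

print(coef(summary(fit_raw)))
print(coef(summary(fit_ctr)))

# Same fitted surface and same interaction estimate; main-effect rows differ
same_fit <- isTRUE(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_ctr))))
same_int <- isTRUE(all.equal(unname(coef(fit_raw)["Kd:area"]),
                             unname(coef(fit_ctr)["KdxArea_c"])))
print(c(same_fit = same_fit, same_int = same_int))
```

If the two codings are just reparameterizations of one model, the fitted values and the interaction row should match while the intercept and main-effect rows change, which is the pattern seen between the R and JMP outputs above.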


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
