Hello dear R users! I know this is not strictly an R-help question, but maybe some of the statistics gurus can help me out.

I have a sample of data, all from the same "population". Say my regression equation is

    m1 <- lm(y ~ x1 + x2 + x3)

and I also fit

    m2 <- lm(y ~ x1 + x2 + x3 + x4)

I want to study the effect of the "information" x4. My hypothesis is that the coefficient estimate for x1 goes down when I introduce x4, because x4 carries some of the information carried by x1 (though not only that). Of course x1 and x4 are correlated, but multicollinearity does not appear to be a problem: the variance inflation factors are rather low (around 1.5).

Basically, I want to study the interplay between x1 and x4 when x4 is introduced into the regression, and whether my hypothesis is correct, i.e. that once the information in x4 is taken into account, less of the variation in y is explained by x1.

What I observe is that after introducing x4, the coefficient estimate for x1 goes down and its p-value becomes larger, i.e. x1 becomes comparatively less significant. However, x4 itself is not significant. Still, the observation is in line with my theoretical argument.

The question is now simple: how can I work this out? I know this is likely a dumb question, but I would really appreciate some links or help.

Regards,
Thiemo
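
P.S. In case it helps, here is a minimal sketch of the setup on simulated data. This is not my real data: the sample size, the coefficients and the link between x1 and x4 are made up for illustration, with the link chosen so that the variance inflation factor should land in the neighbourhood of 1.5. The data frame name d and the helper objects x1_compare and vif_x1 are just names I picked. The sketch fits both models, puts the x1 row of the two coefficient tables side by side, computes the variance inflation factor for x1 by hand, and compares the nested models with anova().

```r
# Simulated data only: sample size, coefficients and correlation are made up.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- 0.6 * x1 + 0.8 * rnorm(n)   # x4 shares some information with x1
y  <- 1 + 0.5 * x1 + 0.3 * x2 - 0.2 * x3 + 0.4 * x4 + rnorm(n)
d  <- data.frame(y, x1, x2, x3, x4)

# The two nested models from the question
m1 <- lm(y ~ x1 + x2 + x3, data = d)
m2 <- lm(y ~ x1 + x2 + x3 + x4, data = d)

# Estimate, standard error, t value and p-value of x1 in both models
x1_compare <- rbind(m1 = coef(summary(m1))["x1", ],
                    m2 = coef(summary(m2))["x1", ])
print(x1_compare)

# Variance inflation factor of x1 in m2, computed by hand:
# 1 / (1 - R^2) from regressing x1 on the other predictors
vif_x1 <- 1 / (1 - summary(lm(x1 ~ x2 + x3 + x4, data = d))$r.squared)
print(vif_x1)

# F-test of the nested models (does adding x4 improve the fit?)
print(anova(m1, m2))
```

With the real data, d would simply be replaced by the actual data frame.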