Hello dear R users,
I know this question is not strictly about R, but maybe some of the
statistics gurus here can help me out.
I have a sample of data, all from the same "population". Say my regression
equation is:
m1 <- lm(y ~ x1 + x2 + x3)
I also fit an extended model:
m2 <- lm(y ~ x1 + x2 + x3 + x4)
I want to study the effect of the "information" in x4.
My hypothesis is that the coefficient estimate for x1 goes down when I
introduce x4, because x4 conveys some of the information conveyed by x1 (but
also other information). Of course x1 and x4 are correlated; however,
multicollinearity does not appear to be a problem, since the variance
inflation factors are rather low (around 1.5).
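For concreteness, here is a minimal sketch of how such a variance inflation
factor can be computed by hand. It uses simulated stand-in data, not my real
data; the sample size and the 0.5 loading of x4 on x1 are arbitrary
assumptions made only for illustration.

```r
# Simulated stand-in data (hypothetical values, not the real sample).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- 0.5 * x1 + rnorm(n)   # x4 shares some information with x1

# VIF of x1 in the extended model: regress x1 on the other regressors
# and take 1 / (1 - R^2) of that auxiliary regression.
aux    <- lm(x1 ~ x2 + x3 + x4)
vif_x1 <- 1 / (1 - summary(aux)$r.squared)
vif_x1
```

A value close to 1 means x1 is almost unrelated to the other regressors;
larger values mean more of x1 is predictable from them.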
Basically, I want to study the interplay between x1 and x4 when x4 enters
the regression equation, and whether my hypothesis is correct, i.e. that
once the information in x4 is taken into account, less of the variation is
explained by x1.
I observe that when x4 is introduced into the regression, the coefficient
estimate for x1 goes down and its p-value increases, i.e. x1 becomes
comparatively less significant. However, x4 itself is not significant.
Still, the observation is in line with my theoretical argument.
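To make the comparison reproducible, here is a minimal sketch on simulated
data (again not my real data; all coefficients in the data-generating step
are arbitrary assumptions). It relies on the standard OLS identity that the
change in the x1 coefficient between the two models equals the x4
coefficient in m2 times the x1 coefficient from an auxiliary regression of
x4 on x1, x2 and x3. Whether this decomposition is the right way to frame my
hypothesis is exactly what I am unsure about.

```r
# Simulated stand-in data (hypothetical values, not the real sample).
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
x4 <- 0.6 * x1 + rnorm(n)   # x4 partly overlaps with x1
y  <- 1 + 0.8 * x1 + 0.5 * x2 - 0.3 * x3 + 0.4 * x4 + rnorm(n)

m1  <- lm(y ~ x1 + x2 + x3)        # model without x4
m2  <- lm(y ~ x1 + x2 + x3 + x4)   # model with x4
aux <- lm(x4 ~ x1 + x2 + x3)       # how much of x4 is predicted by x1, x2, x3

# Change in the x1 coefficient when x4 is added ...
change   <- coef(m1)["x1"] - coef(m2)["x1"]
# ... equals (x4 coefficient in m2) * (x1 coefficient in aux).
absorbed <- coef(m2)["x4"] * coef(aux)["x1"]
all.equal(unname(change), unname(absorbed))

# Partial F-test for the contribution of x4 (nested model comparison).
anova(m1, m2)
```

The identity holds exactly in-sample, so the drop in the x1 coefficient is
driven entirely by the estimated x4 coefficient and the overlap between x4
and x1.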
The question is now simple: how can I work this out, i.e. show formally that
x4 takes over part of the effect of x1?
I know this is likely a dumb question, but I would really appreciate some
links or help.
Regards
Thiemo
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.