Michael Friendly wrote:
> Barrett & Ling, JASA, 1992, v.87(417), pp184-191 define general classes
> of influence measures for multivariate regression models, including
> analogs of Cook's D, Andrews & Pregibon COVRATIO, etc. As in univariate
> response models, these are based on leverage and residuals based on
> omitting one (or more) observations at a time and refitting, although,
> in the univariate case, the computations can be optimized, as they are
> in stats::influence() and related methods.
>
> I'm interested in exploring the multivariate extension in R. I tried
> the following, and was surprised to find that R returned a result
> rather than an error -- presumably because mlm objects are not trapped
> before they get to lm.influence()
>
> > # multivariate model
> > data(Rohwer, package="heplots")
> > rohwer.mod <- lm(cbind(SAT, PPVT, Raven) ~ n + s + ns + na + ss, data=Rohwer)
>
> > names(influence(rohwer.mod))
> [1] "hat"          "coefficients" "sigma"        "wt.res"
> > head(influence(rohwer.mod)$coefficients, 2)
>         [,1]       [,2]      [,3]     [,4]      [,5]      [,6]
> [1,] 2.25039  0.0254739 -0.025252 -0.06297 -0.121507  0.094355
> [2,] 0.84649 -0.0062656 -0.077430  0.08345 -0.022579 -0.059480
>
> Of course, the correct calculations would result from refitting,
> omitting each observation in turn, though doing this directly would be
> horribly inefficient.
> e.g, calculating B(i), deleting case i:
>
> > coef(update(rohwer.mod, subset=1:69 !=1, data=Rohwer))
>                   SAT     PPVT     Raven
> (Intercept) -2.466079 35.68664 11.510068
> n            1.888286  0.60949  0.075931
> s           -0.034524 -0.53040  0.160328
> ns          -2.739834 -0.67355  0.066392
> na           2.219340  1.20481 -0.037272
> ss           1.072300  0.99033  0.058509
> > coef(update(rohwer.mod, subset=1:69 !=2, data=Rohwer))
>                   SAT     PPVT      Raven
> (Intercept) -1.062178 33.88199 10.8988006
> n            1.920026  0.59735  0.0713976
> s            0.017654 -0.47464  0.1774135
> ns          -2.886254 -0.67905  0.0673686
> na           2.120411  1.29016 -0.0077484
> ss           1.226135  0.96430  0.0471764
>
> Is there anything existing for this case that I've missed, or does
> anyone have an interest in pursuing this topic?

Hmm, the fitted coefficients in this sort of multivariate model are the
same as those in the corresponding univariate ones, so as long as you do
whole-case deletions, I would think that you should be able to reuse the
1D code. I would conjecture that the main problem with what you
currently get is that it only pertains to the 1st column -- it looks
like the differences between the two rows from lm.influence match the
differences between the first (SAT) columns of the two
coef(update(...)) results.

Since lm() only handles complete cases, casewise deletion diagnostics
are probably the best you can get; otherwise it would be interesting to
see the effect of deleting each coordinate separately.

(As you know, these matters are within my general sphere of interest,
but I'm afraid my time is too constrained at the moment for more than a
sideline view.)

> -Michael

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
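
A minimal sketch of the idea above, reusing the univariate code one
response at a time under whole-case deletion and checking it against a
brute-force refit. It covers only the coefficient changes, not the
Barrett & Ling measures. Assumptions: the built-in mtcars data stand in
for Rohwer (which needs the heplots package), and the names dfbeta.list,
fast, slow and max.diff are made up for the illustration.

```r
# Stand-in data and model (assumption: mtcars instead of Rohwer).
dat <- mtcars
responses  <- c("mpg", "disp", "qsec")
predictors <- c("wt", "hp")

# Multivariate fit: one coefficient column per response.
mlm.fit <- lm(cbind(mpg, disp, qsec) ~ wt + hp, data = dat)

# One univariate fit per response. For each, lm.influence()$coefficients
# has one row per case: the change in the coefficients, B - B(i),
# when case i is dropped.
dfbeta.list <- lapply(responses, function(y) {
  fit <- lm(reformulate(predictors, response = y), data = dat)
  lm.influence(fit)$coefficients
})
names(dfbeta.list) <- responses

# Brute-force check for a single case: refit without case i.
i <- 1
refit <- lm(cbind(mpg, disp, qsec) ~ wt + hp, data = dat[-i, ])

fast <- sapply(dfbeta.list, function(d) d[i, ])  # coefficients x responses
slow <- coef(mlm.fit) - coef(refit)              # same layout, by refitting

# Largest absolute discrepancy between the two routes.
max.diff <- max(abs(fast - slow))
```

If the two routes agree, max.diff should be zero up to rounding error.
Looping i over all cases gives the full set of deleted-case coefficient
matrices without refitting the multivariate model each time.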