On 26-Aug-08 23:49:37, hadley wickham wrote:
> On Tue, Aug 26, 2008 at 6:45 PM, Ted Harding
> <[EMAIL PROTECTED]> wrote:
>> Hi Folks,
>> This tip is probably lurking somewhere already, but I've just
>> discovered it the hard way, so it is probably worth passing
>> on for the benefit of those who might otherwise hack their
>> way along the same path.
>>
>> Say (for example) you want to do a logistic regression of a
>> binary response Y on variables X1, X2, X3, X4:
>>
>>   GLM <- glm(Y ~ X1 + X2 + X3 + X4)
>>
>> Say there are 1000 cases in the data. Because of missing values
>> (NAs) in the variables, the number of complete cases retained
>> for the regression is, say, 600. glm() does this automatically.
>>
>> QUESTION: Which cases are they?
>>
>> You can of course find out "by hand" on the lines of
>>
>>   ix <- which( (!is.na(Y))&(!is.na(X1))&...&(!is.na(X4)) )
>>
>> but one feels that GLM already knows -- so how to get it to talk?
>>
>> ANSWER: (e.g.)
>>
>>   ix <- as.integer(names(GLM$fit))
>
> Alternatively, you can use:
>
>   attr(GLM$model, "na.action")
>
> Hadley
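
The two approaches quoted above can be tried on made-up data. This is only a sketch: the data frame `dat` and its simulated values are hypothetical, and `family = binomial` is added because glm() otherwise defaults to a gaussian fit rather than a logistic regression. It also uses fitted(GLM) in place of GLM$fit, which relies on partial matching of `fitted.values`. Converting the names to integers assumes the data has the default row names 1..n. Note that the "na.action" attribute lists the cases that were dropped, not the ones retained.

```r
# Simulated data (hypothetical): 1000 cases with NAs scattered in the predictors
set.seed(1)
n <- 1000
dat <- data.frame(
  Y  = rbinom(n, 1, 0.5),
  X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n), X4 = rnorm(n)
)
for (v in c("X1", "X2", "X3", "X4")) dat[[v]][sample(n, 100)] <- NA

# family = binomial requests a logistic regression (the default is gaussian)
GLM <- glm(Y ~ X1 + X2 + X3 + X4, family = binomial, data = dat)

# Retained cases, read off the names of the fitted values
ix <- as.integer(names(fitted(GLM)))

# The same set found "by hand"
ix_hand <- which(complete.cases(dat[, c("Y", "X1", "X2", "X3", "X4")]))
stopifnot(identical(ix, ix_hand))

# Dropped cases, as recorded by the default na.action (na.omit)
dropped <- as.integer(attr(GLM$model, "na.action"))
stopifnot(setequal(c(ix, dropped), seq_len(n)))
```
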
Thanks! I can see that it works -- though understanding how
requires a deeper knowledge of "R internals".

However, since you've approached it from that direction, simply

  GLM$model

is a dataframe of the retained cases (with corresponding
row-names), all variables at once, and that is possibly an
even simpler approach!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 27-Aug-08                                       Time: 01:31:46
------------------------------ XFMail ------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
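
For anyone wanting to see the GLM$model approach in action, here is a small sketch on a toy data set. The data frame `dat` and its values are invented for illustration, and `family = binomial` is an assumption needed to make the fit a logistic regression. It relies on glm() keeping the model frame, which it does by default (model = TRUE).

```r
# Hypothetical data: 8 cases, two of them with a missing predictor
dat <- data.frame(
  Y  = c(0, 1, 0, 1, 1, 0, 1, 0),
  X1 = c(0.2, NA, 1.1, 0.7, -0.3, 0.9, NA, -1.2)
)
GLM <- glm(Y ~ X1, family = binomial, data = dat)

# The model frame holds only the retained cases, all variables at once,
# and keeps the row names of the original data
kept <- GLM$model
rownames(kept)   # rows 2 and 7 are absent
nrow(kept)       # number of complete cases used in the fit
```

If the original data frame carries meaningful row names (IDs, say), rownames(GLM$model) returns those directly, with no conversion to integers needed.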