Hi all, I want to mark the estimation sample, that is, flag which rows (observations) lm deletes due to missingness. For example, in the example from the lm help page (?lm), I have changed one of the values in trt to NA (missing).
# code below
# ----
# original example
> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
# set the 8th value of trt (the 18th observation of weight) to NA
> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,NA,4.32,4.69)
> group <- gl(2,10,20, labels=c("Ctl","Trt"))
> weight <- c(ctl, trt)
> lm.D9 <- lm(weight ~ group)
> summary(lm.D9)

Call:
lm(formula = weight ~ group)

Residuals:
     Min       1Q   Median       3Q      Max
-1.04556 -0.48378  0.05444  0.23622  1.39444

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.0320     0.2258  22.281 5.09e-14 ***
groupTrt     -0.3964     0.3281  -1.208    0.244
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7142 on 17 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.07907,  Adjusted R-squared:  0.0249
F-statistic:  1.46 on 1 and 17 DF,  p-value: 0.2435
# ------
# end snippet

I want to generate an indicator variable that marks the observations used in estimation: 1 for a row that was kept, 0 for a row that was deleted. In this case the indicator should have seventeen 1s, one 0, and then two 1s.

I know I can do ind = !is.na(weight) in the above example, since weight is the only variable with a missing value. But ideally I am looking for a way that works with any formula in lm and still marks the estimation sample. Is there a function or option I am missing?

The best I could come up with:

> lm.D9 <- lm(weight ~ group, model=TRUE)
> ind <- as.numeric(row.names(lm.D9$model))
# substitute nrow(<data frame used in estimation>) for length(group)
> esamp <- rep(0,length(group))
> esamp[ind] <- 1
> esamp
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1

Is this "safe" (and recommended)? I appreciate any help.

Best,
A
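For comparison, here is a minimal sketch of two variations on the row-name idea. It assumes the variables are collected in a data frame (called dat here, a name chosen only for illustration) and that the default na.action (na.omit) is in effect. The first variation matches row names as character strings rather than converting them to numbers, so it should not depend on the row names being 1, 2, 3, and so on. The second reads the dropped rows from na.action(), which returns NULL when nothing was dropped. The names fit, esamp, esamp_na and dropped are likewise illustrative only.

```r
# Same data as in the example above, collected in a data frame
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,NA,4.32,4.69)
dat <- data.frame(weight = c(ctl, trt),
                  group  = gl(2, 10, 20, labels = c("Ctl", "Trt")))

fit <- lm(weight ~ group, data = dat)

# Variation 1: a row was used if its row name appears in the model frame
esamp <- as.integer(rownames(dat) %in% rownames(model.frame(fit)))

# Variation 2: na.action() holds the dropped rows; their names are the
# original row names (names(NULL) is NULL, so this gives all 1s if no
# rows were dropped)
dropped  <- na.action(fit)
esamp_na <- as.integer(!(rownames(dat) %in% names(dropped)))

esamp
#  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
identical(esamp, esamp_na)
# [1] TRUE
```

Both variations match on row names rather than positions, which is meant to keep them valid if a subset= argument is also passed to lm; that case is not exercised in this sketch.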