On Apr 30, 2010, at 4:57 PM, Erik Iverson wrote:

> <snip>
>> I'm sure it's not a bug, but could someone point to a thread or offer some
>> gentle advice on what's happening? I think it's related to:
>> test <- data.frame(name1 = 1:5, name2 = 6:10, test = 11:15)
>> eval(expression(test[c("name1", "name2")]))
>> eval(expression(interco[c("name1", "test")]))
>
> scratch that last one, obviously a typo was causing my confusion there! The
> model.frame stuff remains a mystery to me though...

Hi Erik,

It's late on a Friday, it's grey and raining here in Minneapolis and I am short on caffeine, but, that being said, consider the following :-)

> working
  france manual famanual total working  no
1      1      1        1   107      85  22
2      1      1        0    65      44  21
3      1      0        1    66      24  42
4      1      0        0   171      17 154
5      0      1        1    87      24  63
6      0      1        0    65      22  43
7      0      0        1    85       1  84
8      0      0        0   148       6 142

> as.matrix(working[c("working", "no")])
     working  no
[1,]      85  22
[2,]      44  21
[3,]      24  42
[4,]      17 154
[5,]      24  63
[6,]      22  43
[7,]       1  84
[8,]       6 142

> with(working, as.matrix(working[c("working", "no")]))
     [,1]
[1,]   NA
[2,]   NA

For the incantations of model.frame(), the formula terms are evaluated first within the scope of the data frame indicated for the 'data' argument. Thus, in the second case, I am asking for the as.matrix(...) call to be evaluated within the scope of the 'working' data frame, where the name 'working' resolves to the column of that name rather than to the data frame itself. The result is a matrix with only two rows, one NA for each column that was asked for and not found. That differs from the number of rows in 'working', so you get the error as soon as the 'france' column is evaluated in the formula to create the model frame:

Error in model.frame.default(formula = as.matrix(working[c("working",  :
  variable lengths differ (found for 'france')

That is, 2 rows in the response matrix versus 8 rows for 'france'...

It is kind of like you are asking for:

> as.matrix(working$working[c("working", "no")])
     [,1]
[1,]   NA
[2,]   NA

Now, try this:

> with(working, matrix(c(working, no), ncol = 2))
     [,1] [,2]
[1,]   85   22
[2,]   44   21
[3,]   24   42
[4,]   17  154
[5,]   24   63
[6,]   22   43
[7,]    1   84
[8,]    6  142

and then:

> summary(glm(matrix(c(working, no), ncol = 2) ~ france + manual + famanual,
+             data = working, family = binomial))

Call:
glm(formula = matrix(c(working, no), ncol = 2) ~ france + manual +
    famanual, family = binomial, data = working)

Deviance Residuals:
       1         2         3         4         5         6         7
 0.09316  -0.14108   2.38028  -1.91838  -1.48196   1.84993  -1.61864
       8
 1.16747

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -3.6902     0.2547 -14.489  < 2e-16 ***
france        1.9474     0.2162   9.008  < 2e-16 ***
manual        2.5199     0.2168  11.625  < 2e-16 ***
famanual      0.5522     0.2017   2.738  0.00618 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 308.329  on 7  degrees of freedom
Residual deviance:  18.976  on 4  degrees of freedom
AIC: 60.162

Number of Fisher Scoring iterations: 4

Does that help to clarify?

Regards,

Marc Schwartz
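
For anyone who wants to reproduce the name clash discussed above, here is a minimal, self-contained sketch. The data are transcribed from the printed 'working' data frame. The copy named 'dat' and the use of cbind() for the two-column response are illustrative assumptions and not part of the original exchange; they are shown only as one possible way to avoid having a data frame and one of its columns share a name.

```r
# Data transcribed from the printed 'working' data frame in the message.
working <- data.frame(
  france   = c(1, 1, 1, 1, 0, 0, 0, 0),
  manual   = c(1, 1, 0, 0, 1, 1, 0, 0),
  famanual = c(1, 0, 1, 0, 1, 0, 1, 0),
  total    = c(107, 65, 66, 171, 87, 65, 85, 148),
  working  = c(85, 44, 24, 17, 24, 22, 1, 6),
  no       = c(22, 21, 42, 154, 63, 43, 84, 142)
)

# Inside with(), the name 'working' is found first as the numeric column,
# so indexing it by the names "working" and "no" yields two NAs.
inside <- with(working, as.matrix(working[c("working", "no")]))

# Hypothetical rename (an assumption for illustration): 'dat' is not a
# column name, so inside with() it resolves to the data frame itself.
dat <- working
renamed <- with(dat, as.matrix(dat[c("working", "no")]))

# The response as built in the message, using the column names directly.
fit_matrix <- glm(matrix(c(working, no), ncol = 2) ~ france + manual + famanual,
                  data = working, family = binomial)

# Assumed alternative: cbind() builds the same two-column response.
fit_cbind <- glm(cbind(working, no) ~ france + manual + famanual,
                 data = dat, family = binomial)

print(dim(inside))   # dimensions of the NA matrix
print(dim(renamed))  # dimensions once the clash is avoided
print(isTRUE(all.equal(unname(coef(fit_matrix)), unname(coef(fit_cbind)))))
```

Both fits use the same response columns and predictors, so their coefficients should agree with each other. The point of the sketch is only that referring to the columns by name, or giving the data frame a name that is not also a column, sidesteps the lookup problem.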