Dear Gabor,

I used str() to look at the two objects but missed the difference that you found. What I didn't quite understand was why one model worked but not the other when both were defined at the command prompt in the global environment.

Thanks,
 John

--------------------------------
John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On
> Behalf Of Gabor Grothendieck
> Sent: January-04-11 6:56 PM
> To: John Fox
> Cc: Sanford Weisberg; r-devel@r-project.org
> Subject: Re: [Rd] scoping/non-standard evaluation issue
>
> On Tue, Jan 4, 2011 at 4:35 PM, John Fox <j...@mcmaster.ca> wrote:
> > Dear r-devel list members,
> >
> > On a couple of occasions I've encountered the issue illustrated by the
> > following examples:
> >
> > --------- snip -----------
> >
> >> mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
> > +         Armed.Forces + Population + Year, data=longley)
> >
> >> mod.2 <- update(mod.1, . ~ . - Year + Year)
> >
> >> all.equal(mod.1, mod.2)
> > [1] TRUE
> >>
> >> f <- function(mod){
> > +     subs <- 1:10
> > +     update(mod, subset=subs)
> > +     }
> >
> >> f(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >     Population + Year, data = longley, subset = subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >   Population          Year
> >    1.164e+00    -1.911e+00
> >
> >> f(mod.2)
> > Error in eval(expr, envir, enclos) : object 'subs' not found
> >
> > --------- snip -----------
> >
> > I *almost* understand what's going -- that is, clearly mod.1 and mod.2, or
> > the formulas therein, are associated with different environments, but I
> > don't quite see why.
> >
> > Anyway, here are two "solutions" that work, but neither is in my view
> > desirable:
> >
> > --------- snip -----------
> >
> >> f1 <- function(mod){
> > +     assign(".subs", 1:10, envir=.GlobalEnv)
> > +     on.exit(remove(".subs", envir=.GlobalEnv))
> > +     update(mod, subset=.subs)
> > +     }
> >
> >> f1(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >     Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >   Population          Year
> >    1.164e+00    -1.911e+00
> >
> >> f1(mod.2)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >     Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >   Population          Year
> >    1.164e+00    -1.911e+00
> >
> >> f2 <- function(mod){
> > +     env <- new.env(parent=.GlobalEnv)
> > +     attach(NULL)
> > +     on.exit(detach())
> > +     assign(".subs", 1:10, pos=2)
> > +     update(mod, subset=.subs)
> > +     }
> >
> >> f2(mod.1)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >     Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >   Population          Year
> >    1.164e+00    -1.911e+00
> >
> >> f2(mod.2)
> >
> > Call:
> > lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
> >     Population + Year, data = longley, subset = .subs)
> >
> > Coefficients:
> >  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
> >    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
> >   Population          Year
> >    1.164e+00    -1.911e+00
> >
> > --------- snip -----------
> >
> > The problem with f1() is that it will clobber a variable named .subs in the
> > global environment; the problem with f2() is that .subs can be masked by a
> > variable in the global environment.
> >
> > Is there a better approach?
> >
>
> I think there is something wrong with R here since the formula in the
> call component of mod.1 has a "call" class whereas the corresponding
> call component of mod.2 has "formula" class:
>
> > class(mod.1$call[[2]])
> [1] "call"
> > class(mod.2$call[[2]])
> [1] "formula"
>
> If we reset call[[2]] to have "call" class then it works:
>
> > mod.2a <- mod.2
> > mod.2a$call[[2]] <- as.call(as.list(mod.2a$call[[2]]))
> > f(mod.2a)
>
> Call:
> lm(formula = Employed ~ GNP.deflator + GNP + Unemployed + Armed.Forces +
>     Population + Year, data = longley, subset = subs)
>
> Coefficients:
>  (Intercept)  GNP.deflator           GNP    Unemployed  Armed.Forces
>    3.641e+03     8.394e-03     6.909e-02    -3.971e-03    -8.595e-03
>   Population          Year
>    1.164e+00    -1.911e+00
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
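
The class difference discussed above can be inspected directly, and one possible way to avoid depending on where subs is looked up is to place its value, rather than its name, into the call before update() runs. The sketch below is an assumption-laden illustration, not something proposed in the thread: f3 is a hypothetical name, and the idea that the "formula"-class component carries its own environment (so that subs is not searched for in the function's frame) is one reading consistent with the outputs quoted above, not a confirmed diagnosis.

```r
# Models from the thread, fitted to the built-in longley data.
mod.1 <- lm(Employed ~ GNP.deflator + GNP + Unemployed +
              Armed.Forces + Population + Year, data = longley)
mod.2 <- update(mod.1, . ~ . - Year + Year)

# The diagnostic from the thread, written with $formula instead of [[2]]:
# a bare call for mod.1, a formula object for mod.2.
class(mod.1$call$formula)
class(mod.2$call$formula)

# A bare call has no environment attribute; a formula object carries one.
environment(mod.1$call$formula)
environment(mod.2$call$formula)

# f3 is a hypothetical name, not from the thread. bquote() replaces
# .(subs) with the value of subs, so the call handed to update() holds the
# index vector itself and no symbol named subs has to be found later.
f3 <- function(mod) {
  subs <- 1:10
  eval(bquote(update(mod, subset = .(subs))))
}

fit.1 <- f3(mod.1)
fit.2 <- f3(mod.2)
```

The trade-off of this sketch is that the stored call of the refitted model then contains the index values rather than the name subs, which may or may not be acceptable depending on how the call is used afterwards.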