Ben,

It works for me ...

> x = rpois(100, 5) + 1
> y = rnorm(100, x)
> d = data.frame(x,y)
> m <- lm(y~log(x),d)
> update(m,data=model.frame(m))
Call:
lm(formula = y ~ log(x), data = model.frame(m))

Coefficients:
(Intercept)       log(x)
     -4.010        5.817

You can also re-fit using the model.matrix directly. In your example, the
model matrix (together with the response from the model frame) can be
passed directly to lm.fit / lm.wfit.

~G

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.0.1

On Sat, Aug 24, 2013 at 7:40 PM, Ben Bolker <bbol...@gmail.com> wrote:

>
> Bump: just trying one more time to see if anyone had thoughts on this
> (so far it's just <crickets> ...)
>
>
> -------- Original Message --------
> Subject: model.frame(), model.matrix(), and derived predictor variables
> Date: Sat, 17 Aug 2013 12:19:58 -0400
> From: Ben Bolker <bbol...@gmail.com>
> To: r-de...@stat.math.ethz.ch <r-de...@stat.math.ethz.ch>
>
>
> Dear r-developers:
>
> I am struggling with some fundamental aspects of model.frame().
>
> Conceptually, I think of a flow from data -> model.frame() ->
> model.matrix; the data contain _input variables_, while model.matrix
> contains _predictor variables_: data have been transformed, splines and
> polynomials have been expanded into their corresponding
> multi-dimensional bases, and factors have been expanded into appropriate
> sets of dummy variables depending on their contrasts.
>
> I originally thought of model.frame() as containing input variables as
> well (but with only the variables needed by the model, and with cases
> containing NAs handled according to the relevant na.action setting), but
> that's not quite true. While factors are retained as-is, splines and
> polynomials and parameter transformations are evaluated.
> For example
>
>   d <- data.frame(x=1:10,y=1:10)
>   model.frame(y~log(x),d)
>
> produces a model frame with columns 'y', 'log(x)' (not 'y', 'x').
>
> This makes it hard (impossible?) to use the model frame to re-evaluate
> the existing formula in a model, e.g.
>
>   m <- lm(y~log(x),d)
>   update(m,data=model.frame(m))
>   ## Error in eval(expr, envir, enclos) : object 'x' not found
>
> It seems to me that this is a reasonable thing to want to do
> (i.e. use the model frame as a stored copy of the data that
> can be used for additional model operations); otherwise, I
> either need to carry along an additional copy of the data in
> a slot, or hope that the model is still living in an environment
> where it can find a copy of the original data.
>
> Does anyone have any insights into the original design choices,
> or suggestions about how they have handled this within their own
> code? Do you just add an additional data slot to the model? I've
> considered trying to write some kind of 'augmented' model frame, that
> would contain the equivalent of
> setdiff(all.vars(formula),model.frame(m)) [i.e. all input variables
> that appeared in the formula but not in the model frame ...].
>
> thanks
> Ben Bolker
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis
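
A minimal sketch of the re-fit via the model matrix mentioned in the reply, assuming simulated data like that used there. The seed and the names mm, resp, m2, and res are illustrative and do not come from the thread. It rebuilds the fit from the stored model matrix and response with lm.fit, and it also suggests why update(m, data = model.frame(m)) can succeed in one session and fail in another: the model frame has a column named "log(x)" but none named "x", so the call only works when an object x is still visible from the formula's environment.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only

# Keep x and y inside the data frame only, so no free-standing 'x' exists
d <- data.frame(x = rpois(100, 5) + 1)
d$y <- rnorm(100, d$x)

m <- lm(y ~ log(x), data = d)

# The stored model frame holds the evaluated term, not the input variable
names(model.frame(m))                     # "y" "log(x)"

# Re-fit from the stored pieces: design matrix plus response
mm   <- model.matrix(m)                   # columns "(Intercept)" and "log(x)"
resp <- model.response(model.frame(m))
m2   <- lm.fit(mm, resp)

# Same coefficients as the original fit
all.equal(coef(m), m2$coefficients)

# With no 'x' visible outside the data frame, re-evaluating the formula
# against the model frame fails; the error is captured rather than thrown
res <- tryCatch(update(m, data = model.frame(m)), error = function(e) e)
```

In the reply's session x and y were created as free-standing vectors before being copied into d, which would explain why the same update() call ran there without error.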