hi thomas, thanks for the answer. the problem with "+factor(fmid)" is not just that it produces uninteresting coefficients and eats more memory, but that it is also MUCH slower when there are (hundreds of) thousands of fixed effects.
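To make the comparison concrete, here is a minimal sketch on simulated data. It assumes the algorithm in question is the usual within (demeaning) transformation, which the thread does not spell out; the data and the helper demean() are made up for illustration.

```r
# Simulated panel: 200 firms, 5 observations each (made-up data).
set.seed(1)
n_firms <- 200
d <- data.frame(firmid = rep(seq_len(n_firms), each = 5))
d$x <- rnorm(nrow(d))
d$z <- rnorm(nrow(d))
firm_effect <- rnorm(n_firms)
d$y <- 1 + 2 * d$x - 0.5 * d$z + firm_effect[d$firmid] + rnorm(nrow(d))

# Dummy-variable approach: estimates one coefficient per firm.
fit_dummy <- lm(y ~ x + z + factor(firmid), data = d)

# Within transformation: subtract each firm's mean from y, x and z.
# demean() is a hypothetical helper, not a base R function.
demean <- function(v, g) v - ave(v, g)
d$y_w <- demean(d$y, d$firmid)
d$x_w <- demean(d$x, d$firmid)
d$z_w <- demean(d$z, d$firmid)

# No intercept: the demeaned variables have mean zero within every firm.
fit_within <- lm(y_w ~ x_w + z_w - 1, data = d)

# The slopes on x and z should agree up to numerical error.
all.equal(unname(coef(fit_dummy)[c("x", "z")]), unname(coef(fit_within)))
```

The standard errors reported for fit_within are not directly comparable: its residual degrees of freedom do not account for the firm means that were removed, so they would need a degrees-of-freedom correction.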
Does Bill Venables describe how to extend the lm() function? I googled "course notes on advanced programming Venables", but did not find them. Do you have a better link? (hopefully, this is a short explanation. I know the algorithm; I want to learn how to coax it into an lm statement.)

regards,

/iaw
----
Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)

> I would just have used lm(y~x+z+factor(firmid)). Admittedly, you get a
> whole bunch of uninteresting coefficients in the output, but it's not that
> hard to subset them out.
>
> There are two implementations of this in Bill Venables' course notes on
> advanced programming. I think they are also in 'S Programming', but I can't
> find my copy right now. These were motivated by computational problems: the
> full design matrix for the linear model was too large for memory at the time
> (last century).
>
> As a final note, I would strongly discourage
>
>   r= lm( y ~ x + z, fixed.effects=firmid )
>
> as a specification, and would argue for
>
>   r= lm( y ~ x + z, fixed.effects=~firmid )
>
> I think the ability to have some subset of the arguments in a modelling
> call silently treated as formulas was a bad decision, although it must have
> looked user-friendly at the time.
>
> -thomas
>
> Thomas Lumley                   Assoc. Professor, Biostatistics
> tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
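
For illustration only, the formula-style interface argued for above could be sketched as a small wrapper around lm(). The name lm_fe() and its internals are hypothetical (this is not the implementation in the Venables notes, and not an existing lm() argument); it assumes demeaning by group, numeric regressors, and no missing values.

```r
# lm_fe() is a hypothetical wrapper, not part of base R.
# fixed.effects must be a one-sided formula such as ~firmid.
lm_fe <- function(formula, data, fixed.effects) {
  stopifnot(inherits(fixed.effects, "formula"), length(fixed.effects) == 2L)

  # Evaluate the grouping variable(s) inside `data`, never in the caller's scope.
  g <- interaction(model.frame(fixed.effects, data, na.action = na.fail),
                   drop = TRUE)

  mf <- model.frame(formula, data, na.action = na.fail)
  y  <- model.response(mf)
  X  <- model.matrix(attr(mf, "terms"), mf)
  X  <- X[, colnames(X) != "(Intercept)", drop = FALSE]

  # Within transformation: remove group means from the response and regressors.
  y_w <- y - ave(y, g)
  X_w <- apply(X, 2, function(col) col - ave(col, g))

  dw <- data.frame(y_w = y_w, X_w)
  lm(y_w ~ . - 1, data = dw)
}

# Usage on made-up data.
set.seed(2)
d <- data.frame(firmid = rep(1:50, each = 4), x = rnorm(200), z = rnorm(200))
d$y <- 2 * d$x - 0.5 * d$z + rnorm(50)[d$firmid] + rnorm(200)
r <- lm_fe(y ~ x + z, data = d, fixed.effects = ~firmid)
coef(r)
```

Because the argument is checked to be a formula, a call such as fixed.effects=d$firmid stops with an error instead of being silently reinterpreted. The standard errors of the returned fit still need a degrees-of-freedom correction for the absorbed group means.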