Cade, Brian <cadeb <at> usgs.gov> writes:

> It has never been obvious to me that the lasso approach can handle
> interactions among predictor variables well at all.
> I'd be curious to see what others think and what you learn.
>
> Brian
>

For what it's worth, I think the lasso *does* handle interactions
reasonably (although I forget where I read that); there is a newer
"hierarchical lasso" that tries to deal with marginality concerns more
carefully.

A related question was asked on StackOverflow:

http://stackoverflow.com/questions/37910042/glmmlasso-warning-messages/37922918#37922918

My answer there (in comments) was:

  My guess is that you're going to have to build your own model
  matrix/dummy variables; I think that as.factor() in formulas is
  treated specially, so including the interaction term will probably
  just confuse it. (It would be worth trying as.factor(Novelty:ROI) - I
  doubt it'll work but if it does it would be the easiest way forward.)

>
> On Wed, Jul 13, 2016 at 2:20 PM, Walker Pedersen <wsp <at> uwm.edu> wrote:

[snip]

> >
> > An abbreviated version of my dataset is here:
> >
> > https://drive.google.com/open?id=0B_LliPDGUoZbVVFQS2VOV3hGN3c
> >

[snip snip]

> > Before glmmLasso I am running:
> >
> > KNov$Subject <- factor(KNov$Subject)
> >
> > to ensure the subject ID is not treated as a continuous variable.
> >
> > If I run:
> >
> > glm1 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI), list(Subject=~1), data = KNov,
> > lambda=10)
> > summary(glm1)
> >
> > I don't get any warning messages, but the output contains b estimates
> > only, no SE or p-values.
> >
> > If I try to include a 3-way interaction, such as:
> >
> > glm2 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Novelty):as.factor(Valence):as.factor(ROI),
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm2)
> >
> > I get the warnings:
> >
> > Warning messages:
> > 1: In split.default((1:ncol(X))[-inotpen.which], ipen) :
> > data length is not a multiple of split variable
> > 2: In lambda_vec * sqrt(block2) :
> > longer object length is not a multiple of shorter object length
> >
> > And again, I do get parameter estimates, and no SE or p-values.
> >
> > If I include my continuous variable in any interaction, such as:
> >
> > glm3 <- glmmLasso(Activity~as.factor(Novelty) + as.factor(Valence) +
> > STAIt + as.factor(ROI)
> > + as.factor(Valence):as.factor(ROI) + as.factor(Novelty):STAIt,
> > list(Subject=~1), data = Nov7T, lambda=10)
> > summary(glm3)
> >
> > I get the error message:
> >
> > Error in rep(control$index[i], length.fac) : invalid 'times' argument
> >
> > and no output.
> >
> > If anyone has any input as to (1) why I am not getting SE or p-values
> > in my outputs, (2) the meaning of the warnings I get when I include a
> > 3-way variable, and if they are something to worry about, how to fix
> > them, and (3) how to fix the error message I get when I include my
> > continuous factor in an interaction, I would be very appreciative.

[snip snip snip]
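
PS. A rough sketch of the "build your own model matrix/dummy variables"
suggestion from my StackOverflow comment, using base R's model.matrix().
It has not been tested against glmmLasso itself. The toy data are made up
(only the column names follow the post), and the "_x_" renaming and the
object names X, dat and fix_formula are purely illustrative assumptions.

```r
# Toy stand-in for the poster's data: column names follow the post,
# values are invented for illustration only.
set.seed(1)
KNov <- data.frame(
  Subject  = factor(rep(1:6, each = 8)),
  Novelty  = factor(rep(c("new", "old"), times = 24)),
  Valence  = factor(rep(rep(c("neg", "neu"), each = 2), times = 12)),
  ROI      = factor(rep(rep(c("A", "B"), each = 4), times = 6)),
  STAIt    = rep(rnorm(6, mean = 40, sd = 8), each = 8),
  Activity = rnorm(48)
)

# Expand main effects and interactions into explicit numeric columns.
# The first column is the intercept, which we drop here.
X <- model.matrix(~ Novelty * Valence * ROI + STAIt + Novelty:STAIt,
                  data = KNov)[, -1]

# Interaction columns are named like "Noveltyold:Valenceneu"; replace ":"
# so that every column name is a plain, syntactically valid variable name.
colnames(X) <- gsub(":", "_x_", colnames(X), fixed = TRUE)

# Combine the response and grouping factor with the hand-built columns.
dat <- cbind(KNov[, c("Subject", "Activity")], as.data.frame(X))

# Fixed-effects formula with one term per hand-built column.
fix_formula <- reformulate(colnames(X), response = "Activity")
print(fix_formula)

# Assumption (not run here): a formula made only of plain numeric columns
# may avoid the special handling of as.factor() terms, e.g.
# fit <- glmmLasso::glmmLasso(fix_formula, rnd = list(Subject = ~1),
#                             data = dat, lambda = 10)
```

Note that with this approach each dummy column is a separate term, so
whether the penalty should treat the columns of one factor or interaction
as a group is something you would have to decide (and check) yourself.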