On Aug 9, 2013, at 12:52 PM, kevin.shaney wrote:

> Thanks! I tried the type.multinomial="grouped" argument, but it
> didn't work for me. Maybe I did something wrong. I thought I understood why
> it didn't work: sparse.model.matrix recodes variables (as below,
> to V12 & V13), which leaves GLMNET unable to tell that they actually came
> from the same source categorical variable. Has that option ever worked for
> you in a similar situation?

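For readers following the thread, here is a minimal sketch of the recoding described in the quoted message. The data frame `d` and its values are made up for illustration. Only `sparse.model.matrix` (from the Matrix package) and the variable name V1 come from the thread.

```r
library(Matrix)  # provides sparse.model.matrix; ships with R as a recommended package

# Hypothetical toy data: one categorical predictor with levels "1", "2", "3".
d <- data.frame(V1 = factor(c("1", "2", "3", "2", "1")))

# Default treatment contrasts drop the first level, so V1 becomes two 0/1
# indicator columns named by pasting the variable name and the level.
X <- sparse.model.matrix(~ V1, data = d)

colnames(X)  # "(Intercept)" "V12" "V13"

# The link back to V1 survives only in the column names; the matrix handed
# to glmnet is just a set of numeric columns.
as.numeric(X[, "V12"])  # 1 where V1 == "2"
as.numeric(X[, "V13"])  # 1 where V1 == "3"
```

Nothing in the resulting matrix marks V12 and V13 as belonging together, which is consistent with the concern raised above.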
I wondered after posting whether the sparse.model.matrix input could be getting in the way of whatever grouping behavior might be occurring (which is carried out behind the scenes in a non-exported function). I've never attempted to use it, and only asked the question because you didn't specifically say that you had used it in the fashion described in the help page.

--
David.

>
> Thanks!
> Kevin
>
> From: David Winsemius [via R]
> [mailto:ml-node+s789695n4673463...@n4.nabble.com]
> Sent: Friday, August 09, 2013 3:14 PM
> To: Kevin Shaney
> Subject: Re: glmnet inclusion / exclusion of categorical variables
>
>
> On Aug 9, 2013, at 6:44 AM, Kevin Shaney wrote:
>
>>
>> Hello -
>>
>> I have been using GLMNET of the following form to predict multinomial
>> logistic / class dependent variables:
>>
>> mglmnet=glmnet(xxb,yb ,alpha=ty,dfmax=dfm,
>> family="multinomial",standardize=FALSE)
>>
>> I am using both continuous and categorical variables as predictors, and am
>> using sparse.model.matrix to code my x's into a matrix. This changes an
>> example categorical variable whose original name / values are {V1 = "1" or
>> "2" or "3"} into two recoded variables {V12 = "1" or "0" and V13 = "1" or
>> "0"}.
>
> You could set their penalty factors to 0 to at least observe the case where
> inclusion is performed. And setting the penalty factor for both to be small
> would allow you to "honestly" use 0 as the estimated coefficient in such
> cases where one was estimated and the other not.
>
>>
>> As I am cycling through different penalties, I would like to either have
>> both recoded variables included or both excluded, but not just one included,
>> and can't figure out how to make that work. I tried changing the
>> "type.multinomial" option, as it looks like this option should do what I
>> want, but can't get it to work (maybe the difference in recoded variable
>> names is driving this).
>
> Doesn't the 'family' argument, used to set what I think you are calling
> 'type', just refer to the y argument, rather than the predictors? You may
> want:
>
> mglmnet=glmnet(xxb,yb ,alpha=ty,dfmax=dfm, type.multinomial="grouped",
> family="multinomial",standardize=FALSE)
>
>>
>> To summarize, for categorical variables, I would like to hierarchically
>> constrain inclusion / exclusion of recoded variables in the model: either
>> all of the recoded variables from the same original categorical variable
>> are in, or all are out.
>
> I do understand that I am possibly not directly answering your question, but
> in some respect I wonder if it deserves an answer. I think it is meaningful
> if some factor levels are "penalized-out" of models.
>
> --
> David Winsemius
> Alameda, CA, USA
>

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
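A minimal sketch of the penalty-factor suggestion quoted in the thread, offered as an illustration and not as a tested answer to the original question. The simulated data, the extra predictors x1 and x2, and the helper names `pf` and `last` are assumptions. The names xxb, yb, V12 and V13 follow the thread. As far as the glmnet help page describes it, type.multinomial="grouped" groups one column's coefficients across the response classes, not the several dummy columns of one factor, so the sketch uses penalty.factor = 0 to keep both dummy columns in the model.

```r
library(Matrix)
library(glmnet)

set.seed(1)
n <- 300

# Hypothetical predictors: one 3-level factor and two continuous variables.
d <- data.frame(
  V1 = factor(sample(c("1", "2", "3"), n, replace = TRUE)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

# Hypothetical 3-class response that agrees with V1 about 70% of the time.
noise <- sample(c("1", "2", "3"), n, replace = TRUE)
yb <- factor(ifelse(runif(n) < 0.7, as.character(d$V1), noise))

# Drop the intercept column; glmnet fits its own intercept.
xxb <- sparse.model.matrix(~ V1 + x1 + x2, data = d)[, -1]

# Penalty factor 0 leaves V12 and V13 unpenalized, so both stay in the model;
# every other column keeps the default factor of 1.
pf <- ifelse(colnames(xxb) %in% c("V12", "V13"), 0, 1)

fit <- glmnet(xxb, yb, family = "multinomial",
              type.multinomial = "grouped",
              penalty.factor = pf, standardize = FALSE)

# Coefficients of the two dummy columns, per response class, at the
# smallest lambda on the fitted path.
last <- length(fit$lambda)
sapply(fit$beta, function(b) b[c("V12", "V13"), last])
```

This only covers the "both in" case. Giving both columns the same small positive penalty factor, as the quoted reply suggests, still does not tie their inclusion together; an all-in or all-out constraint across the dummy columns of one factor would need a group-lasso penalty over predictors, which this sketch does not attempt.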