On 12-03-30 12:40 PM, Clifton, Abigail J. wrote:
> Hi again!
>
> Thanks very much for the code, it appears to work! Finally, I want
> to extract the coefficients and tried coef(g1), which works.
> However, there only appear to be intercepts/coefficients for 'V22N'
> out of thousands of possibilities, which are all displayed as
> dots/NaN. Is there a way of getting more coefficients - perhaps by
> changing lambda or something like that? Is it also possible to
> print the final 'model'?

  I'm afraid I'm out of time right now -- cc'ing to r-help in case
someone else has the time and energy to help.  All I can suggest is
that you spend some time reading through all of the documentation
for the package (start with help(package="glmnet") and browse through
all the help pages, run the examples, etc.).  Unfortunately there is
no general-purpose vignette for that package ... an entire book on
the subject is available online
http://www-stat.stanford.edu/~tibs/ElemStatLearn/ , but that won't
provide quick answers ...

  Ben Bolker

> Kind regards,
>
> Abigail
>
>
> -----Original Message-----
> From: Ben Bolker <bbol...@gmail.com>
> Sender: r-help-bounces@r-project.org
> Date: Fri, 30 Mar 2012 02:58:04
> To: <r-h...@stat.math.ethz.ch>
> Subject: Re: [R] How to improve, at all, a simple GLM code
>
> Abigail Clifton <abigailclifton <at> me.com> writes:
>
>> I am wanting to find a good predictive model, yes. It's part of a
>> project so if I have time after finding the model I may want to
>> find some patterns but it's not a priority. I just want the
>> model for now (I need the coefficients above all).
>
>> It's all categorical data, I categorised any continuous data
>> before I started trying to fit the glm.
>
> That's not necessarily a good idea (categorising often loses power
> relative to fitting something like an additive model), but OK.
>
>
>> I was unsure of how to get the csv file to you, however, I have
>> uploaded it and it should be available for download from here:
>> http://www.filedropper.com/prepareddata
>
> Here's how far I got:
>
> Prepared_Data <- na.omit(read.csv("Prepared_Data.csv",
>                                   header=TRUE))
> pd <- Prepared_Data[,-3] ## data minus response variable
>
> ## how many levels per variable?
> lev <- sapply(pd,function(x) length(unique(x)))
>
> ## total parameters for n variables
> par(las=1,bty="l")
> plot(cumprod(lev),log="y")
>
> library(Matrix)
> m <- sparse.model.matrix(~.^2,data=pd) ## slower than model.matrix
> ncol(m) ## 8352 columns (!!)
>
> library(glmnet)
> g1 <- glmnet(m,Prepared_Data$C3,
>              family="binomial")
>
> This doesn't appear to work properly, yet (I get funny values),
> but it's the direction I would go ...
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
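
  On the coefficient question above, here is a rough sketch of one
way to look at coefficients at different values of lambda. It is not
taken from the thread. It uses a small simulated data set (d and y
are hypothetical stand-ins, not Prepared_Data.csv), and dropping the
intercept column from the model matrix is my own choice, not part of
the code quoted above. The dots that coef() prints are just the
sparse-matrix display of coefficients that are exactly zero; a
smaller lambda means less shrinkage, so more terms usually become
non-zero.

```r
library(Matrix)
library(glmnet)

## Simulated stand-in data (hypothetical; NOT the poster's file)
set.seed(101)
n <- 200
d <- data.frame(f1 = factor(sample(letters[1:4], n, replace = TRUE)),
                f2 = factor(sample(letters[1:3], n, replace = TRUE)),
                f3 = factor(sample(letters[1:5], n, replace = TRUE)))
y <- factor(rbinom(n, size = 1, prob = ifelse(d$f1 == "a", 0.8, 0.3)))

## Main effects plus all two-way interactions, as in the quoted code;
## drop the intercept column because glmnet fits its own intercept
m <- sparse.model.matrix(~ .^2, data = d)[, -1]

g1 <- glmnet(m, y, family = "binomial")

## coef() on the fitted path gives one column per lambda value
dim(coef(g1))

## Coefficients at a single lambda, here the smallest one on the path
b <- coef(g1, s = min(g1$lambda))

## Choose lambda by cross-validation instead of by eye
cv1 <- cv.glmnet(m, y, family = "binomial")
b_cv <- coef(cv1, s = "lambda.min")

## Keep only the non-zero coefficients, i.e. the selected terms
nz <- which(b_cv[, 1] != 0)
sel <- data.frame(term = rownames(b_cv)[nz], estimate = b_cv[nz, 1])
print(sel, row.names = FALSE)
```

  The data frame sel is about as close as this gets to printing a
"final model": the terms that survive the penalty at the
cross-validated lambda, with their estimates. Using s = "lambda.1se"
instead should give a sparser set of terms.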