<abigailclifton <at> me.com> writes:
> I am trying to fit a generalised linear model to some loan
> application and default data. The purpose of this is to eventually
> work out the probability an applicant will default.
> However, R seems to crash or die when I run "glm" on anything
> greater than a 5-way saturated model for my data.

What does "crash or die" mean? Are you getting error messages? If so,
what are they? Is the R application actually quitting?

> My first question: is the best way to fit a generalised linear model
> in R to fit the saturated model and extract the significant terms
> only, or to start at the null model and to work up to the optimum
> one?

This is more a question of statistical practice than an R question.
Opinions differ, but in general I would say that if it is
computationally feasible you should start (and maybe finish) with the
full model; see the sketch below.
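If you do decide to simplify a full model afterwards, drop1() and
step() are the usual base-R tools. A minimal sketch (the data frame
'd' and response 'y' stand in for your own data; this uses the toy
data constructed at the end of this message):

full <- glm(y ~ ., data = d, family = binomial)  # full additive model
drop1(full, test = "Chisq")  # likelihood-ratio test for dropping each term
reduced <- step(full, direction = "backward")  # backward selection by AIC
summary(reduced)

(Note that significance tests computed after stepwise selection should
be treated with caution.)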
> I am importing a csv file with 3500 rows and 27 columns
> (3500x27 matrix).
> My second question: is there any way to increase the memory
> I have so R can cope with more analysis?

help("Memory-limits")
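A data set that size is small; it is more likely the model matrix,
not the data, that is exhausting memory, since a saturated model with
many interaction terms can have thousands of columns. You can inspect
this directly (here 'd' and 'gg' are again placeholders for your data
frame and fitted model):

object.size(d)         # memory footprint of the imported data frame
dim(model.matrix(gg))  # rows x columns of the fitted model's design matrix
gc()                   # run garbage collection and report memory use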
> I can send my code if it would help to answer the question.

Please read the posting guide (link at the bottom of every R-help
posting) and follow its advice. We don't know enough about your
situation to help. You could also try reading
http://tinyurl.com/reproducible-000 ...

This works for me:

z <- matrix(rnorm(3500*27), ncol = 27)         # fake predictors, same dimensions as yours
y <- sample(0:1, replace = TRUE, size = 3500)  # fake binary response
colnames(z) <- c(letters, "A")                 # 27 column names
d <- data.frame(y = y, z)
gg <- glm(y ~ ., data = d, family = "binomial")                 # additive model
gg <- glm(y ~ a*b*c*d*e*f*g*h, data = d, family = "binomial")   # 8-way interactions

(The second fit estimates 2^8 = 256 coefficients, which is well
within reach on 3500 rows; a saturated model in all 27 variables
would not be.)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.