tmp <- data.frame(y  = rnorm(15000),
                  x1 = factor(sample(48, 15000, replace = TRUE)),
                  z1 = factor(sample(242, 15000, replace = TRUE)))
system.time( tmp.aov <- aov(y ~ x1/z1, data = tmp) )  ## exceeds memory
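The memory failure above is explained by the size of the model matrix alone. A back-of-the-envelope count (my own arithmetic, assuming the default treatment contrasts, not part of the original post):

## y ~ x1/z1 expands to intercept + x1 contrasts + x1:z1 columns:
## 1 + 47 + 48*241 = 11616 columns of doubles over 15000 rows
15000 * (1 + 47 + 48 * 241) * 8 / 2^30   ## ~1.3 GB, before lm/aov makes copies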
tmp2 <- data.frame(y  = rnorm(15000),
                   x1 = factor(sample(48, 15000, replace = TRUE)),
                   z1 = factor(sample(5, 15000, replace = TRUE)))
system.time( tmp2.aov <- aov(y ~ x1/z1, data = tmp2) )
anova(tmp2.aov)  ## about 5 seconds

Use data.frames; they make the code easier to read. Use aov() instead of lm(): it is the same arithmetic, but the unneeded columns of X are handled more gracefully.

My guess is that your data have hundreds of distinct values for z1, so excess space was allocated for the model matrix. The model is easier to understand when the levels of z1 are coded distinctly within each level of x1, but as you see it is costly in computer resources.

You can force the actual numerical values of the second term to be distinct across levels of x1 with the interaction() function. Then use the simpler model and let the linear dependencies work in your favor.

system.time( tmp.aov <- aov(y ~ x1 + interaction(x1, z1), data = tmp) )
anova(tmp.aov)  ## about 6 seconds

Rich

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
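P.S. A small-scale check (my own sketch, with made-up data and my own object names, not from the thread) that the nested formula and the interaction() formula fit the same model, since both span the same cell-mean space:

set.seed(1)
d <- data.frame(y  = rnorm(200),
                x1 = factor(sample(4, 200, replace = TRUE)),
                z1 = factor(sample(6, 200, replace = TRUE)))

fit.nest <- aov(y ~ x1/z1, data = d)                     ## nested coding
fit.int  <- aov(y ~ x1 + interaction(x1, z1), data = d)  ## interaction() coding

## identical residual sums of squares: the fits agree
all.equal(deviance(fit.nest), deviance(fit.int))  ## TRUE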