I've been examining a number of linear regression models on a large dataset, following the basic ideas presented in "Calculating all possible linear regressions" (http://www.r-bloggers.com/r-calculating-all-possible-linear-regression-models-for-a-given-set-of-predictors/). I run into a problem with ldply (from the plyr package) when a formula has no intercept. Here's a simple test to show what happens.

library(plyr)

# data and two linear model regressions
x <- 0:10
xy <- data.frame(x = x, y = 2*x + 0.2*rnorm(11))
models <- as.list(c('y ~ x', 'y ~ -1 + x'))
models <- lapply(models, function(x) (as.formula(x)) )
fits <- lapply(models, function(x) lm(x, data=xy))

# regression summaries specified individually (OK)
coef(summary(fits[[1]]))
#               Estimate Std. Error     t value     Pr(>|t|)
# (Intercept) -0.0594176 0.10507394  -0.5654837 5.855640e-01
# x            2.0163534 0.01776074 113.5286997 1.620614e-15

coef(summary(fits[[2]]))
#   Estimate Std. Error  t value     Pr(>|t|)
# x 2.007865 0.00916494 219.0811 9.652427e-20

# Coefficients as a dataframe using ldply (OK)
ldply(fits, function(x) as.data.frame(t(coef(x))))
#   (Intercept)        x
# 1  -0.0594176 2.016353
# 2          NA 2.007865

# Std Errors as a dataframe using ldply (FAIL)
# The variable name 'x' is lost for the second model, which has no intercept.
# The default variable name V1 appears in the output instead.
# The same behaviour is observed for 't value' and 'Pr(>|t|)'.
ldply(fits, function(x) as.data.frame(t(coef(summary(x))[,'Std. Error'])))
#   (Intercept)          x         V1
# 1   0.1050739 0.01776074         NA
# 2          NA         NA 0.00916494

Is this a bug or (hopefully) user error? Any ideas for a workaround?

Thanks.
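
One possible workaround, offered as a sketch rather than a confirmed diagnosis: the no-intercept fit has a one-row coefficient matrix, and indexing it with [, 'Std. Error'] appears to collapse it to a single unnamed value, so as.data.frame() falls back to the default name V1. Passing drop = FALSE to the subset keeps the result a matrix with its row names, which then become column names after t(). The helper name se_row below is hypothetical, and set.seed() is added only to make the example repeatable.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed so the example is repeatable
x <- 0:10
xy <- data.frame(x = x, y = 2 * x + 0.2 * rnorm(11))
models <- lapply(list("y ~ x", "y ~ -1 + x"), as.formula)
fits <- lapply(models, function(f) lm(f, data = xy))

# Hypothetical helper: one-row data frame of standard errors for a fit.
# drop = FALSE keeps the subset as a matrix, so the coefficient names
# survive even when the model has a single coefficient.
se_row <- function(fit) {
  as.data.frame(t(coef(summary(fit))[, "Std. Error", drop = FALSE]))
}

# Columns are now matched by coefficient name across both models.
std_errors <- ldply(fits, se_row)
std_errors
```

The same change should apply to the 't value' and 'Pr(>|t|)' columns by swapping the column name in the subset.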