In the course of some work for Revolution Analytics, I needed to modify the plm function so that it would not die halfway through fitting. With three small modifications I was able to more than halve the runtime (for my particular run) and reduce its memory usage:
1.) Replacing apply(X, 2, mean) with colMeans(X) throughout, and similarly apply(X, 2, sum) with colSums(X).

2.) In pdata.frame(), replacing

    # n <- length(Ti)
    # time <- c()
    # for (i in 1:n){
    #   time <- c(time, 1:Ti[i])
    # }

with

    time <- sequence(Ti)

3.) To uncork the particular bottleneck I was experiencing in Tapply() (the fitting would die halfway through a massive tapply()), I have modified the function to process the effects in chunks. By still using tapply() within each chunk we do not give up too much efficiency, and we gain the ability to fit much larger models.

Here is the down-and-dirty code. At the moment it is set to do everything in one go, but this is controllable via 'num_blocks' or 'block_size'. A nice way to handle this would be to expose it as a parameter that, by default, processes the entire data set at once.

    Tapply.default <- function (x, effect, func, ...) {
      na.x <- is.na(x)
      effect_unique <- unique(effect)
      n_effects <- length(effect_unique)
      uniqval <- array(dim = n_effects)
      attr(uniqval, "dimnames")[[1]] <- as.character(effect_unique)

      # change this back so that it can handle larger datasets
      block_size <- n_effects
      num_blocks <- ceiling(n_effects / block_size)

      for (i in 1:num_blocks) {
        # positions (within effect_unique) of the effects in this block
        these_ind <- ((i - 1) * block_size + 1):min(n_effects, i * block_size)
        these_effects <- effect_unique[these_ind]
        in_block <- effect %in% these_effects
        this_x <- x[in_block]
        # fix the levels to the order of these_effects so that the
        # tapply() output lines up with these_ind
        this_effect <- factor(effect[in_block],
                              levels = as.character(these_effects))
        uniqval[these_ind] <- tapply(this_x, this_effect, func, ...)
      }

      nms <- attr(uniqval, "dimnames")[[1]]
      attr(uniqval, "dimnames") <- attr(uniqval, "dim") <- NULL
      names(uniqval) <- nms
      result <- uniqval[as.character(effect)]
      result[na.x] <- NA
      result
    }

Again, Revolution Analytics is to thank for these improvements, should they make it into the package. I am happy to work with the authors to see that this is incorporated.

Thanks, as always, to Yves and everyone else volunteering their time and expertise.

Kyle
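
P.S. For anyone who wants to check that the first two replacements are drop-in, here is a minimal sketch on made-up data. The matrix X and the vector Ti below are purely illustrative (they are not taken from plm), and the check only covers the no-NA case.

```r
# Illustrative data only: X and Ti are invented for this check.
set.seed(1)
X  <- matrix(rnorm(20), nrow = 5, ncol = 4)
Ti <- c(3L, 1L, 4L)   # hypothetical number of time periods per individual

# 1.) colMeans()/colSums() should agree with the apply() versions
means_equal <- isTRUE(all.equal(apply(X, 2, mean), colMeans(X)))
sums_equal  <- isTRUE(all.equal(apply(X, 2, sum),  colSums(X)))

# 2.) sequence(Ti) should agree with the original loop that grows 'time'
time_loop <- c()
for (i in seq_along(Ti)) {
  time_loop <- c(time_loop, 1:Ti[i])
}
time_seq    <- sequence(Ti)
times_equal <- isTRUE(all.equal(time_loop, time_seq))

print(c(means = means_equal, sums = sums_equal, times = times_equal))
```

The loop version reallocates 'time' on every iteration, which is presumably where the savings in pdata.frame() come from; sequence() builds the result in one call.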