A few things that will help you, if not now then in the future:

1) Preallocate the result object.  This allows you to avoid using
cbind()/rbind(), which creates a new, larger copy of the result in each
iteration.  That will eventually bite you if you have a lot of data.
In your case you know the number of files but maybe not the number of
rows; that can be inferred in the first iteration.

2) Read only the columns you need.  This will save memory and speed up
the reading, especially for large data files.  In read.table() you can
specify 'colClasses' and set it to "NULL" for unwanted columns.  If
you know the number of columns in each file, say it is 23, then do:
colClasses <- rep("NULL", 23); colClasses[4] <- "double" (if it is
doubles you are reading).
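
If you do not know the number of columns up front, something like the
following might work.  It is only a sketch: the temporary file and its
3-by-6 contents are made up for illustration, and using count.fields()
on the first line to infer the column count is my assumption about how
your files look (plain whitespace-separated, no header).

```r
# Hypothetical demo file: 3 rows, 6 whitespace-separated numeric columns.
pathname <- tempfile(fileext=".txt");
demo <- matrix(as.double(1:18), nrow=3, ncol=6);
write.table(demo, file=pathname, row.names=FALSE, col.names=FALSE);

# Infer the number of columns from the first line instead of hard-coding it.
nbrOfColumns <- count.fields(pathname)[1];

# Skip every column except the 4th, which is read as doubles.
colClasses <- rep("NULL", nbrOfColumns);
colClasses[4] <- "double";

# Only one column survives, so it is column 1 of the returned data frame.
values <- read.table(pathname, colClasses=colClasses)[,1];
print(values);

# Clean up the demo file.
unlink(pathname);
```

Note that the skipped columns are dropped from the result, which is why
the wanted column is picked out with [,1] and not [,4].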

This is how I would do it.  It works for small and rather large data sets.

pathnames <- dir(pattern="data");
nbrOfFiles <- length(pathnames);
nbrOfColumns <- 23;  # number of columns in each file (example value from above)
colClasses <- rep("NULL", nbrOfColumns); colClasses[4] <- "double";
res <- NULL;
for (kk in seq(length=nbrOfFiles)) {
  pathname <- pathnames[kk];
  values <- read.table(pathname, colClasses=colClasses)[,1];
  if (is.null(res)) {
     # Allocate a matrix of the same data type as the data read.
     res <- matrix(values[1], nrow=length(values), ncol=nbrOfFiles);
  }
  res[,kk] <- values;
  rm(values);
}
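
To convince yourself that the preallocated loop gives the same result
as growing with cbind(), here is a small self-contained comparison.
readColumn() is a hypothetical stand-in for reading column 4 of file
number kk, and the sizes (50 "files" of 1000 rows) are arbitrary.

```r
# Hypothetical stand-in for "column 4 of file kk": a vector of doubles.
readColumn <- function(kk, nbrOfRows=1000) {
  as.double(seq_len(nbrOfRows)) + kk;
}
nbrOfFiles <- 50;

# (a) Grow with cbind(): each iteration copies everything collected so far.
grown <- NULL;
for (kk in seq_len(nbrOfFiles)) {
  grown <- cbind(grown, readColumn(kk));
}

# (b) Preallocate on the first iteration, then fill column by column.
res <- NULL;
for (kk in seq_len(nbrOfFiles)) {
  values <- readColumn(kk);
  if (is.null(res)) {
    res <- matrix(values[1], nrow=length(values), ncol=nbrOfFiles);
  }
  res[,kk] <- values;
}

# Same contents either way; only the amount of copying differs.
stopifnot(identical(dim(grown), dim(res)), all(grown == res));
```

With sizes this small you will hardly notice a difference in speed; the
copying in (a) only starts to hurt as the number of files and rows grows.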

My $.02

/Henrik


On Wed, Jul 23, 2008 at 4:24 AM, Henrique Dallazuanna <[EMAIL PROTECTED]> wrote:
> Maybe:
>
> sapply(lapply(dir(pattern="data"), read.table), '[[', 4)
>
> On Wed, Jul 23, 2008 at 5:21 AM, Daren Tan <[EMAIL PROTECTED]> wrote:
>>
>> Better approach than this brute force ?
>>
>> mm <- NULL
>> for (i in dir(pattern="data")) { m <- readTable(i); mm <- cbind(mm, m[,4]) }
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>
