on 12/10/2008 12:50 PM Chris Poliquin wrote:

> Hi,
>
> I need to read in a series of text files with a time series on each
> row. The series are of different lengths and I'd like to just use the
> first row as the length and have R ignore extra values in rows that go
> over this length.
>
> For example:
>
> 1 0 3 4 5
> 1 3 5 6 8 7 7
> 2 1 1 1 4 7 7 7
>
> So the 7s would be ignored and I would have a 5x3 matrix. I tried
> creating a series of colClasses with NULLs for the extra values by using
> max(count.fields(file)) - min(count.fields(file)) but this didn't work
> and would be too time consuming for lots of files.
>
> fill=T doesn't seem to be working either. When I use fill=T I get extra
> rows for some reason in the table. R doesn't seem to just be appending
> NAs to the end of the short rows.
>
> Any way to accomplish this?
>
> - Chris

Not sure why you had issues with 'fill = TRUE'.

Presuming that you do not know 'a priori' the resultant matrix size, you
could do something like the following.

Essentially, use read.table() to get the following initial result,
filling in the short rows and converting the 7's to NA values:

DF <- read.table("clipboard", fill = TRUE, na.strings = 7)

> DF
  V1 V2 V3 V4 V5 V6 V7 V8
1  1  0  3  4  5 NA NA NA
2  1  3  5  6  8 NA NA NA
3  2  1  1  1  4 NA NA NA

We can then use complete.cases() on the transposed data frame to get a
logical vector that is TRUE for the columns containing no NAs:

> complete.cases(t(DF))
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE

Thus, keeping only those columns:

> DF[, complete.cases(t(DF))]
  V1 V2 V3 V4 V5
1  1  0  3  4  5
2  1  3  5  6  8
3  2  1  1  1  4

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
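
A possible variation on the approach above, offered as a sketch rather than as part of the original reply. Two caveats motivate it. First, na.strings = 7 turns every 7 into NA, including any legitimate 7 within the first row's width. Second, read.table() decides the number of columns from the first five lines of input unless col.names is longer, so in a larger file a long row further down can wrap onto an extra row; that may be what produced the "extra rows" in the question. The sketch below sidesteps both by using count.fields() to find the first row's length and the widest row, and then keeping only the leading columns. The helper name read_first_row_width() and the temporary file are hypothetical, introduced only for illustration, and the file is assumed to be whitespace-delimited with no header.

```r
# Hypothetical helper (not from the original thread): read a
# whitespace-delimited file and keep only as many columns as the
# first row has. Assumes no header line.
read_first_row_width <- function(file) {
  n_fields <- count.fields(file)   # number of fields on each line
  n_keep   <- n_fields[1]          # length of the first row
  n_max    <- max(n_fields)        # widest row in the file

  # Supplying n_max column names makes read.table() allocate enough
  # columns up front, so long rows are not wrapped onto extra rows;
  # fill = TRUE pads the short rows with NA.
  DF <- read.table(file, fill = TRUE,
                   col.names = paste0("V", seq_len(n_max)))

  # Drop everything beyond the first row's length.
  as.matrix(DF[, seq_len(n_keep), drop = FALSE])
}

# Usage with the example data from the question, written to a temp file.
tmp <- tempfile()
writeLines(c("1 0 3 4 5",
             "1 3 5 6 8 7 7",
             "2 1 1 1 4 7 7 7"), tmp)
m <- read_first_row_width(tmp)
print(m)   # 3 rows by 5 columns for this input
```

For many files, the same function could be applied with lapply() over a vector of file names. Rows shorter than the first row would come back padded with NA, which may or may not be what is wanted.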