Prof Brian Ripley wrote:
> On Fri, 21 Oct 2005, Emmanuel Paradis wrote:
>
>> Prof Brian Ripley wrote:
>>
>>> On Thu, 20 Oct 2005 [EMAIL PROTECTED] wrote:
>>>
>>>> Full_Name: Emmanuel Paradis
>>>> Version: 2.1.1
>>>> OS: Linux
>>>> Submission from: (NULL) (193.49.41.105)
>>>>
>>>>
>>>> read.fwf(..., header = TRUE) does not work properly since:
>>>>
>>>> 1/ the original header is printed on the console and not in FILE;
>>>> 2/ the different 'parts' of the header should be separated with tabs
>>>>    to work with the call to read.table.
>>>>
>>>> Here is a suggested fix for src/library/utils/R/read.fwf.R:
>>>>
>>>> 38c38,40
>>>> < cat(FILE, headerline, "\n")
>>>> ---
>>>>
>>>>> headerline <- unlist(strsplit(headerline, " {1,}"))
>>>>> headerline <- paste(headerline, collapse = "\t")
>>>>> cat(file = FILE, headerline, "\n")
>>>
>>>
>>>
>>> Thanks, but I don't think that is right.  It assumes the header line
>>> is space-delimited (or at least that spaces get converted to tabs).
>>> We have not specified the format of the header line, and it cannot
>>> usefully be fixed format.  So I think we need to specify it is
>>> delimited by 'sep'
>>> (not tab).
>>
>>
>> I see, but suppose we read selectively some columns in a file, eg with
>> widths=c(1, -4, 2), how can we know how many variables have been
>> skipped and then select the appropriate names in the header line?
>
>
> You do not: as the help file says
>
>      Negative-width fields are used to indicate columns to be skipped,
>      eg '-5' to skip 5 columns.  These fields are not seen by
>      'read.table' and so should not be included in a 'col.names' or
>      'colClasses' argument.

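To make the quoted help text concrete, here is a minimal sketch of the widths=c(1, -4, 2) case. The file contents and the column names "a" and "b" are made up for illustration; only two fields reach read.table, so only two names are supplied.

```r
# Illustrative data (not from the thread): each record is 7 characters wide.
tmp <- tempfile()
writeLines(c("1abcd23", "4efgh56"), tmp)

# widths = c(1, -4, 2): read 1 character, skip 4, read 2.
# The skipped field is never seen by read.table, so col.names has 2 entries.
x <- read.fwf(tmp, widths = c(1, -4, 2), col.names = c("a", "b"))
unlink(tmp)

print(x)
```
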
OK, but it is strange to me to not have all variables named in a header
line.

>> Here is another proposed fix, but this assumes the header line is in
>> fixed-width format (as specified by 'widths'):
>
>
> What happens if there are multi-line records?  Your `fix' crashes.

It crashes anyway because it should be [!drop] and not [drop] ;)

>> 38c38,41
>> < cat(FILE, headerline, "\n")
>> ---
>>
>>> head.last <- cumsum(widths)
>>> head.first <- head.last - widths + 1
>>> headerline <- substring(headerline, head.first, head.last)[drop]
>>> cat(file = FILE, headerline, "\n", sep = sep)
>>
>>
>> ?read.fwf says clearly that sep is used internally.
>
>
> Not so: please check the current version.

Here is what I have in R 2.2.0:

     sep: character; the separator used internally; should be a
          character that does not occur in the file.

So, should the fix be simply:

38c38
< cat(FILE, headerline, "\n")
---
> cat(file = FILE, headerline, "\n")

?

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
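
For reference, a minimal sketch of why the one-line change matters. It assumes FILE holds a temporary file name, as a stand-in for what read.fwf uses internally, and the header text is made up. Passed positionally, FILE is just another object for cat() to print; only the named 'file' argument redirects the output.

```r
FILE <- tempfile("Rfwf.")    # stand-in for read.fwf's temporary file (assumption)
headerline <- "id\tname"     # made-up header line

# Positional argument: the file name and the header both go to the console
# (captured here), and nothing is written to FILE.
console <- capture.output(cat(FILE, headerline, "\n"))
wrote_positional <- file.exists(FILE)

# Named argument: the header is written to FILE instead.
cat(file = FILE, headerline, "\n")
written <- readLines(FILE)
unlink(FILE)
```

With the default sep = " ", cat() puts a space between the header and the newline, so the written line carries a trailing space.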