Dear ExpeRts,

I am trying to read tab-delimited data produced by somewhat brain-dead software that seems to think it's a good idea to have an extra tab character after the last column - except for the header line. As explained in the help page, read.delim now assumes that the first column contains the row.names (which is not even wrong), but as a result all col.names get shifted by one column. Example:

    infile <- 'sample\tx1\n1\tA\t\n2\tB\t\n3\tA\t'
    read.delim(textConnection(infile))

      sample x1
    1      A NA
    2      B NA
    3      A NA

So I set row.names to NULL because the man page said "Using ‘row.names = NULL’ forces row numbering.". Now the row.names really are numbered automatically but I get a "bonus column":

    read.delim(textConnection(infile), row.names=NULL)

      row.names sample x1
    1         1      A NA
    2         2      B NA
    3         3      A NA

Hm - not what I want. I am also a bit puzzled why the extra column is introduced instead of just using the first col.name.

At the moment I deal with it by fixing the col.names and dumping the extra column:

    dat <- read.delim(textConnection(infile), row.names=NULL)
    colnames(dat) <- colnames(dat)[-1]
    dat <- dat[-ncol(dat)]
    dat

      sample x1
    1      1  A
    2      2  B
    3      3  A

I worked my way through ?read.delim but could not find an option to deal with these (flawed) files directly. As the opposite situation (i.e. more col.names than data) can be fixed with fill=TRUE, I was hoping something like fill.header=TRUE or fill='header' might exist.

Did I just not find it or does it not exist? And if it doesn't - does anyone else think it would be a nice item for the wishlist?

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
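
For comparison with the rename-and-drop workaround in the post, here is a minimal sketch of another way to handle such files: take the column names from the header line alone, strip one trailing tab from each data line, and parse the result with header=FALSE. The helper name read_trailing_tab is hypothetical (it is not part of base R), and the sketch assumes an R version in which read.table accepts the text argument and that every data line carries exactly one surplus tab.

```r
infile <- 'sample\tx1\n1\tA\t\n2\tB\t\n3\tA\t'

# Hypothetical helper, not a base R function: reads tab-delimited input whose
# data lines end in one extra tab that the header line does not have.
# 'file' may be a file name or an open connection, as accepted by readLines().
read_trailing_tab <- function(file) {
  lines <- readLines(file)
  # Column names come from the header line only.
  header <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]
  # Remove a single trailing tab from each data line, if present.
  body <- sub("\t$", "", lines[-1])
  # Parse the cleaned data lines and attach the header names.
  read.delim(text = body, header = FALSE, col.names = header)
}

con <- textConnection(infile)
dat <- read_trailing_tab(con)
close(con)
dat
```

Because only one trailing tab is removed per line, a genuinely empty last field (two tabs at the end of a line) should still come through as a missing value rather than being lost.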