Hi Roark,

In my experience, this error is caused either by a problem reading the header or by a problem with the "sep" parameter in read.table. Try something like

read.table(..., sep = "\t")

(this is for tab-delimited files).
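One way to check whether the header or the separator is at fault is to count the fields on each line before reading. The sketch below is only an illustration: it works on an in-memory copy of the sample file from the quoted message rather than the FTP file, and the names sample_lines, padded_names and the "extra" columns are made up for the example. As far as I recall from ?read.table, the number of columns is taken from the first five lines of input, so a long line among them can trigger "more columns than column names". For a tab-delimited file, swap sep = " " for sep = "\t".

```r
# Hypothetical in-memory copy of the sample file from the quoted message.
sample_lines <- c(
  "a b c d",
  "1 2 3 x",
  "4 5 6",
  "7 8 9 x",
  "1 2 3 x c c",
  "4 5 6 x",
  "7 8 9 x"
)

# Count the fields on every line with an explicit separator.
con <- textConnection(sample_lines)
n_fields <- count.fields(con, sep = " ")
close(con)
print(n_fields)

# Read the header names separately, then pad them so that the widest
# line still has a name for every field ("extra1", ... are made-up names).
header_names <- scan(text = sample_lines[1], what = "", sep = " ", quiet = TRUE)
n_extra <- max(n_fields) - length(header_names)
padded_names <- c(header_names, paste0("extra", seq_len(n_extra)))

# Skip the header line, fill short rows, and keep only the named columns.
dat <- read.table(text = sample_lines, sep = " ", header = FALSE, skip = 1,
                  col.names = padded_names, fill = TRUE,
                  stringsAsFactors = FALSE)
dat <- dat[, header_names]
print(dat)
```

In this sample the fifth line reports six fields against four header names, which is where the mismatch comes from. Padding col.names is just one possible workaround; the scan() call in the quoted message is another.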
Others might give more ideas.

Cheers,
Tal

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Fri, Jan 28, 2011 at 6:23 AM, H Roark <hrbuil...@hotmail.com> wrote:

> I need to import a large number of simple, space-delimited text files with
> a few columns of data each. The one quirk is that some rows are missing data
> and some contain junk text at the end of each line. A typical file might
> look like:
>
> a b c d
> 1 2 3 x
> 4 5 6
> 7 8 9 x
> 1 2 3 x c c
> 4 5 6 x
> 7 8 9 x
>
> I'm trying to avoid having to pre-process the text files, as they all sit
> on an ftp site that I don't manage. My initial approach was just to read
> the files using a read.table() statement with the arguments flush and fill
> set to TRUE. For example, to import the above text file I tried:
>
> read.table(file="ftp://ftp.example.dta", header=T, row.names=NULL, fill=T,
> flush=T)
>
> However, R throws the error "more columns than column names" and won't
> import the file.
>
> Interestingly, if I move the extra text "c c" from line 5 to line 6 in the
> data file, read.table() reads the file just fine, and ignores the "c c".
> So, my first question is, why does simply moving these data down a row
> solve this problem?
>
> Next, I decided to try reading the file with the scan() function and it
> worked perfectly:
>
> data.frame(scan(file="ftp://ftp.example.dta", what=list(a=0, b=0, c=0,
> d=""), sep=" ", skip=1, flush=T, fill=T))
>
> I'm new to R, but as I understand it read.table() is based on the scan()
> function. This makes me wonder if there is an additional argument I can add
> to read.table() to make it import the file successfully, as scan() was able
> to do.
> Any help in this regard would be very much appreciated. I'd also
> really like to hear folks' perspectives on the merits of scan() versus
> read.table() (e.g. when is scan() the best option?).
>
> Cheers
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.