Re: [R] read.table() versus scan()

Tal Galili Fri, 28 Jan 2011 00:24:26 -0800

Hi Roark,

In my experience, this error comes from a problem reading the header line,
or from the "sep" argument to read.table().
Try something like
read.table(..., sep = "\t")  (this is for tab-delimited files)
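
If it helps to check which of the two it is, count.fields() reports how many
fields R sees on each line. As far as I know, read.table() works out the
number of columns from the first five lines of input, so a single long line
near the top can be enough to trigger "more columns than column names".
Below is a rough sketch rather than a definitive fix. It writes your sample
to a temporary file, and the names tmp, n_max, raw and dat are made up for
illustration. I am assuming your real files behave like the sample.

```r
# Recreate the sample file locally (a stand-in for the file on the ftp site).
sample_lines <- c("a b c d",
                  "1 2 3 x",
                  "4 5 6",
                  "7 8 9 x",
                  "1 2 3 x c c",
                  "4 5 6 x",
                  "7 8 9 x")
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Number of whitespace-separated fields on each line, header included.
n_fields <- count.fields(tmp, sep = "")
n_max <- max(n_fields)

# Skip the header and supply enough placeholder column names for the
# widest line, so the column count no longer depends on the first lines.
raw <- read.table(tmp, header = FALSE, skip = 1, fill = TRUE,
                  col.names = paste0("V", seq_len(n_max)),
                  stringsAsFactors = FALSE)

# Keep the four real columns and take their names from the header line.
dat <- raw[, 1:4]
names(dat) <- scan(tmp, what = "", nlines = 1, quiet = TRUE)
```

Printing n_fields should show which line is wider than the header, and dat
should then hold only the four named columns.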


Others might give more ideas.
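
On the scan() side of your question: my understanding is that your call
works because the "what" list fixes the number and types of fields up front,
so nothing has to be guessed from the first few lines. Here is a small
sketch of the same call on a local copy of your sample. The temporary file
and the names tmp, cols and dat are placeholders of mine, not something from
your setup.

```r
# Local copy of the sample data (a stand-in for the file on the ftp site).
sample_lines <- c("a b c d",
                  "1 2 3 x",
                  "4 5 6",
                  "7 8 9 x",
                  "1 2 3 x c c",
                  "4 5 6 x",
                  "7 8 9 x")
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# 'what' declares four fields per line: three numeric, one character.
# skip = 1 drops the header line.
# flush = TRUE discards anything after the fourth field on a line.
# fill = TRUE pads lines that have fewer than four fields.
cols <- scan(tmp, what = list(a = 0, b = 0, c = 0, d = ""),
             sep = " ", skip = 1, flush = TRUE, fill = TRUE, quiet = TRUE)

# scan() returns a named list of equal-length vectors; turn it into a table.
dat <- data.frame(cols, stringsAsFactors = FALSE)
```

The price is that you have to know the column layout in advance, whereas
read.table() tries to work it out for you.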

Cheers,
Tal







On Fri, Jan 28, 2011 at 6:23 AM, H Roark <hrbuil...@hotmail.com> wrote:

>
> I need to import a large number of simple, space-delimited text files with
> a few columns of data each. The one quirk is that some rows are missing data
> and some contain junk text at the end of each line. A typical file might
> look like:
>
> a b c d
> 1 2 3 x
> 4 5 6
> 7 8 9 x
> 1 2 3 x c c
> 4 5 6 x
> 7 8 9 x
>
> I'm trying to avoid having to pre-process the text files, as they all sit
> on an ftp site that I don't manage.  My initial approach was just to read
> the files using a read.table() statement with the arguments flush and fill
> set to TRUE. For example, to import the above text file I tried:
>
> read.table(file="ftp://ftp.example.dta", header=T, row.names=NULL, fill=T,
> flush=T)
>
> However, R throws the error "more columns than column names" and won't
> import the file.
>
> Interestingly, if I move the extra text "c c" from line 5 to line 6 in the
> data file, read.table() reads the file just fine, and ignores the "c c".
>  So, my first question is, why does simply moving these data down a row
> solve this problem?
>
> Next, I decided to try reading the file with the scan() function and it
> worked perfectly:
>
> data.frame(scan(file="ftp://ftp.example.dta", what=list(a=0, b=0, c=0,
> d=""), sep=" ", skip=1, flush=T, fill=T))
>
> I'm new to R, but as I understand it read.table() is based on the scan()
> function. This makes me wonder if there is an additional argument I can add
> to read.table() to make it import the file successfully, as scan() was able
> to do.  Any help in this regard would be very much appreciated.  I'd also
> really like to hear folks' perspectives on the merits of scan() versus
> read.table() (e.g. when is scan() the best option?).
>
> Cheers
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
