On Jun 26, 2014, at 12:59 AM, Ron Crump wrote:
> Hi Carol,
>> It might be a primitive question, but I have a text file with no separator
>> between the characters on each line, and the strings on each line all have
>> the same length. The format is like the following:
>>
>> absfjdslf
>> jfdldskjff
>> jfsldfjslk
>>
>> When I read the file with read.table("myfile",colClasses = "character"),
>> instead of putting the strings in a table of number of rows x length of
>> string, read.table saves the file in a table of number of rows x 1 and each
>> element seems to be a factor. Why does read.table not account for
>> colClasses = "character"?
> read.table relies on a separator to differentiate between columns, so it is
> not appropriate for your file, read.fwf would do the job.
>
> Setting colClasses (in my understanding) tells read.table how to treat input
> as it comes in - so it disables some testing of data types and makes reading
> quicker, it does not disable the setting of character data to be factors,
> which is the default. You need to use the stringsAsFactors=FALSE option for
> that.
>
> So, for your example (and I have added a letter to the first row to make it
> the same length as the others):
>
> cf <- "absfjdslfx
> jfdldskjff
> jfsldfjslk"
>
> cdf <-
> read.fwf(textConnection(cf),widths=rep(1,10),colClasses="character",stringsAsFactors=FALSE)
You are wrong about colClasses not pre-empting stringsAsFactors, and it's easy
enough to prove (or you could have taken the time to read the help page of
read.table, which the help page of read.fwf refers you to):
sapply(read.fwf(textConnection(cf), widths=rep(1,10), colClasses="character"), class)
         V1          V2          V3          V4          V5          V6
"character" "character" "character" "character" "character" "character"
         V7          V8          V9         V10
"character" "character" "character" "character"
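As a further check, here is a minimal sketch that asks for factor conversion explicitly and still supplies colClasses. It reuses Ron's cf string from above; the names classes and dims are only illustrative. If colClasses does take precedence, as the read.table help page describes, every column should come back as character even with stringsAsFactors=TRUE:

```r
# Ron's example data, padded so every row has 10 characters
cf <- "absfjdslfx
jfdldskjff
jfsldfjslk"

# Request factor conversion explicitly, but also set colClasses.
# read.fwf passes both arguments through to read.table.
cdf <- read.fwf(textConnection(cf), widths = rep(1, 10),
                colClasses = "character", stringsAsFactors = TRUE)

classes <- sapply(cdf, class)  # class of each of the 10 columns
dims <- dim(cdf)               # rows x columns

print(classes)
print(dims)
```

So stringsAsFactors only matters for columns whose class is left for read.table to work out.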
> See ?read.fwf for more information. A width is required for each column (in
> this case 1 repeated 10 times).
>
> Hope this helps.
>
> Ron.
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA