Re: [R] Reading in a transcript-like file

David Winsemius Wed, 30 Jun 2010 06:55:22 -0700


On Jun 30, 2010, at 12:21 AM, ARRRRRR wrote:

http://r.789695.n4.nabble.com/file/n2272669/FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
I have a lot of experience with Stata, but I'm new to R. I'm trying to read the attached file into R on my Mac. My goal is to have it as a list, with each element a string - from there I can parse out the data I need and add it
as an observation in a dataframe.

I've tried scan, readlines, etc. but I'm stumped.  I've been adding
encoding="UTF-16", but that doesn't seem to help much.
The closest I've come is:
test<-scan(file="FT20100626 $20 + $2 Sit & Go - (169112900) -Summary.txt",
what=list(""), flush=FALSE, skip=0, encoding="UTF-16", quote="\n")

which gives me a list wherein each element is the first letter of a row.
test
[[1]]
[1] "\xff\xfeF" "T"         "P"         "T"         "S"         "$"
"+"         "$"         "S"

I believe you are being bitten by an encoding issue and that it is referred to by this section of the help page from ?connections:

"The encoding "UCS-2LE" is treated specially, as it is the appropriate value for Windows ‘Unicode’ text files. If the first two bytes are the Byte Order Mark 0xFFFE then these are removed as most implementations of iconv do not accept BOMs. Note that some implementations will handle BOMs using encoding "UCS-2" but many will not."

Notice that your first entry begins with \xff\xfe, which I believe is a representation of 0xFFFE. When you look at that page with Firefox and request encoding information you are given UTF-16. I am not sufficiently educated on encoding issues even though we share platforms. I tried a few different encoding specifications including "UTF-16", "UCS-2" and "UCS-2LE" with scan and readLines but failed to work through to the solution. Another possibility might be to subscribe to the R SIG-Mac mailing list and post the question there.
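
One approach that might sidestep the connection-level encoding handling is to read the file as raw bytes, drop the BOM by hand, and convert with iconv. What follows is only a sketch under assumptions: it assumes the file really is UTF-16LE with a leading FF FE byte order mark (which the "\xff\xfeF" in the output suggests), and it writes a small made-up stand-in file so it can run on its own. The names path, sample_text, payload and result, and the sample lines themselves, are hypothetical and are not taken from the attached file.

```r
# Hypothetical stand-in for the real file: a few made-up lines written as
# UTF-16LE with a leading byte order mark (bytes FF FE).
path <- tempfile(fileext = ".txt")
sample_text <- "Tournament Summary\nBuy-In: $20 + $2\n1: PlayerA\n"
payload <- iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), payload), path)

# Read the whole file as raw bytes so that no text decoding happens yet.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)

# Drop the two-byte BOM if it is present.
if (length(bytes) >= 2 && identical(bytes[1:2], as.raw(c(0xff, 0xfe)))) {
  bytes <- bytes[-(1:2)]
}

# Convert the remaining bytes to UTF-8, then split into one string per line.
txt <- iconv(list(bytes), from = "UTF-16LE", to = "UTF-8")
lines <- strsplit(txt, "\r?\n")[[1]]

# A list with one string per element, as asked for in the original post.
result <- as.list(lines)
```

For the real file, path would be the actual file name and the first block that writes the stand-in file would be dropped. The help text quoted above suggests that readLines(file(path, encoding = "UCS-2LE")) could also work where the platform's iconv accepts that name, but that depends on the iconv implementation.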


--
David.

[10] "&"         "G"         "("         "H"         "N"         "L"
"B"         "u"         "$"
[19] "+"         "$"         "B"         "u"         "C"         "1"
"6"         "E"         "T"
[28] "o"         "P"         "P"         "$"         "T"         "o"
"s"         "2"         "0"
[37] "E"         "T"         "o"         "f"         "2"         "1"
"E"         "\n"        "1"
[46] "B"         "$"         "2"         ":"         "J"         "$"
"3"         ":"         "b"
[55] "4"         ":"         "s"         "c"         "2"         "5"
":"         "R"         "6"
[64] ":"         "S"         "B"         "o"         "f"         "i"
"1"         "p"

Any help would be greatly appreciated.

--
View this message in context: 
http://r.789695.n4.nabble.com/Reading-in-a-transcript-like-file-tp2272669p2272669.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

