On Jun 30, 2010, at 12:21 AM, ARRRRRR wrote:
http://r.789695.n4.nabble.com/file/n2272669/FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
I have a lot of experience with Stata, but I'm new to R. I'm trying to read
the attached file into R on my Mac. My goal is to have it as a list, with
each element a string; from there I can parse out the data I need and add it
as an observation in a data frame.
I've tried scan, readLines, etc., but I'm stumped. I've been adding
encoding="UTF-16", but that doesn't seem to help much.
The closest I've come is:
test <- scan(file="FT20100626 $20 + $2 Sit & Go - (169112900) - Summary.txt",
             what=list(""), flush=FALSE, skip=0, encoding="UTF-16", quote="\n")
which gives me a list wherein each element is the first letter of the row.
test
[[1]]
 [1] "\xff\xfeF" "T" "P" "T" "S" "$" "+" "$" "S"
I believe you are being bitten by an encoding issue and that it is
referred to by this section of the help page from ?connections:
"The encoding "UCS-2LE" is treated specially, as it is the appropriate
value for Windows ‘Unicode’ text files. If the first two bytes are the
Byte Order Mark 0xFFFE then these are removed as most implementations
of iconv do not accept BOMs. Note that some implementations will
handle BOMs using encoding "UCS-2" but many will not."
Notice that the first two bytes of your first entry are \xff\xfe, which I
believe is a representation of 0xFFFE. When you look at that page with
Firefox and request encoding information, you are given UTF-16. I am not
sufficiently educated on encoding issues even though we share
platforms. I tried a few different encoding specifications, including
"UTF-16", "UCS-2" and "UCS-2LE", with scan and readLines but failed to
work through to the solution. Another possibility might be to subscribe
to the R SIG-Mac mailing list and post the question there.
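
For what it is worth, here is a minimal sketch of one approach that may be worth trying. It is not a verified fix for the attached file. It assumes the file is little-endian UTF-16 with a byte order mark, as the leading \xff\xfe suggests. The encoding argument of scan and readLines only marks strings and does not re-encode the input, whereas an encoding given to file() (or the fileEncoding argument of scan) does. In UTF-16LE every ASCII character is followed by a zero byte, which would be consistent with each string being cut off after its first character. Since the attachment is not reproduced here, the sketch first writes a small stand-in file; sample_lines and tmp are made-up names and content.

```r
# Stand-in for the attachment: two made-up lines written as UTF-16LE
# with a leading byte order mark (bytes ff fe).
sample_lines <- c("Tournament Summary", "Buy-In: $20 + $2")
tmp <- tempfile(fileext = ".txt")
body <- iconv(paste0(paste(sample_lines, collapse = "\n"), "\n"),
              from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), body), tmp)

# Put the encoding on the connection so the input is re-encoded on read.
# "UCS-2LE" is the value ?connections describes as treated specially
# (the BOM is removed before conversion).
con <- file(tmp, open = "r", encoding = "UCS-2LE")
lines <- readLines(con)
close(con)

# One string per line of the file; as.list() gives the list form
# described in the original question.
line_list <- as.list(lines)
```

For the real file, tmp would be replaced by the path to the summary file. Whether "UCS-2LE", "UTF-16LE" or "UTF-16" works may depend on the iconv implementation on the platform.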
--
David.
[10] "&" "G" "(" "H" "N" "L" "B" "u" "$"
[19] "+" "$" "B" "u" "C" "1" "6" "E" "T"
[28] "o" "P" "P" "$" "T" "o" "s" "2" "0"
[37] "E" "T" "o" "f" "2" "1" "E" "\n" "1"
[46] "B" "$" "2" ":" "J" "$" "3" ":" "b"
[55] "4" ":" "s" "c" "2" "5" ":" "R" "6"
[64] ":" "S" "B" "o" "f" "i" "1" "p"
Any help would be greatly appreciated.
--
View this message in context:
http://r.789695.n4.nabble.com/Reading-in-a-transcript-like-file-tp2272669p2272669.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT