Hello all,

# I am trying to read the text in this URL:
u <- "http://google.com/complete/search?output=toolbar&q=%d7%a9%d7%9c%d7%95%d7%9d"
# By using this command:
readLines(u)
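
For reference, a minimal sketch of what the percent-encoded query in that URL stands for. It uses only base R (`URLdecode()` from utils, `Encoding<-`, `utf8ToInt()`); the variable names `q_encoded`, `q` and `code_points` are introduced here for illustration. It inspects code points instead of printing the string, so the result should not depend on the console locale.

```r
# The query value from the URL above, still percent-encoded.
q_encoded <- "%d7%a9%d7%9c%d7%95%d7%9d"

# URLdecode() turns each %XX escape back into a single byte.
q <- utils::URLdecode(q_encoded)

# Declare that those bytes are UTF-8 (an assumption about the query).
Encoding(q) <- "UTF-8"

# Look at the Unicode code points rather than the printed glyphs.
code_points <- utf8ToInt(q)
sprintf("U+%04X", code_points)
```

If the bytes are indeed UTF-8, these are the four Hebrew letters of the search term.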

No matter what variation I try, I keep getting this output:
[1] "<?xml version=\"1.0\"?><toplevel><CompleteSuggestion><suggestion
data=\"&#x5E9;&#x5DC;&#x5D5;&#x5DD;\"/><   (etc...)
Instead of this output:
<?xml version="1.0"?><toplevel><CompleteSuggestion><suggestion
data="שלום"/><num_queries
int="16800000"/></CompleteSuggestion><CompleteSuggestion><suggestion 
data="שלום חנוך"/><num_queries
int="232000"/></CompleteSuggestion><CompleteSuggestion><suggestion
data="שלום עליכם"/
(etc....)
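
The difference between the two outputs appears to be that the first one carries the Hebrew letters as XML numeric character references (`&#x5E9;` and so on) rather than as encoded bytes, which would explain why the `encoding` argument has no visible effect. A minimal sketch of decoding such references in base R follows. The helper `decode_hex_refs()` is hypothetical (it is not part of R or any package), it only handles the hexadecimal `&#x...;` form, and a proper XML parser would normally resolve these references for you.

```r
# Hypothetical helper: replace hexadecimal XML character references
# (&#xHHHH;) in a single string with the characters they denote.
decode_hex_refs <- function(x) {
  # Find every reference of the form &#x...; in the string.
  refs <- regmatches(x, gregexpr("&#x[0-9A-Fa-f]+;", x))[[1]]
  for (ref in unique(refs)) {
    # Strip the leading "&#x" and trailing ";" to get the hex digits.
    hex <- substr(ref, 4L, nchar(ref) - 1L)
    # Convert the code point to a UTF-8 character.
    ch <- intToUtf8(strtoi(hex, base = 16L))
    # Literal (non-regex) replacement of the reference by the character.
    x <- gsub(ref, ch, x, fixed = TRUE)
  }
  x
}

# A fragment of the output shown above.
escaped <- "<suggestion data=\"&#x5E9;&#x5DC;&#x5D5;&#x5DD;\"/>"
decoded <- decode_hex_refs(escaped)

# Code points of the decoded attribute value only.
value <- sub("\"/>$", "", sub("^<suggestion data=\"", "", decoded))
utf8ToInt(value)
```

Whether the decoded string then displays as Hebrew still depends on the console and locale; checking the code points with `utf8ToInt()` sidesteps that.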

I tried:
  readLines(u, encoding= "latin1")
  readLines(u, encoding= "UTF-8")
I also tried changing the locale with Sys.setlocale():
  Sys.setlocale("LC_ALL", "Hebrew") # must be done for Hebrew to work.
  Sys.setlocale("LC_ALL", "English") # also tried the English locale.

Are there any more options I could try to get this text properly encoded?

Thanks!
Tal



----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
