On Wed, Nov 25, 2009 at 12:19 AM, cls59 <ch...@sharpsteen.net> wrote:
>
>
> Peng Yu wrote:
>>
>> I'm interested in parsing an HTML page. I should use the XML package,
>> right? Could somebody show me some example code? Is there a tutorial
>> for this package?
>>
>
> Did you try looking through the help pages for the XML package or browsing
> the Omegahat website?
>
> Look at:
>
>  library(XML)
>  ?htmlTreeParse
>
> And the relevant web page for documentation and examples is:
>
>  http://www.omegahat.org/RSXML/


http://www.omegahat.org/RSXML/shortIntro.html

I'm trying the example from the web page above, but I get the following
error and I'm not sure why. Could you help me take a look?


$ Rscript main.R
> library(XML)
>
> download.file('http://www.omegahat.org/RSXML/index.html','index.html')
trying URL 'http://www.omegahat.org/RSXML/index.html'
Content type 'text/html; charset=ISO-8859-1' length 3021 bytes
opened URL
==================================================
downloaded 3021 bytes

>
> doc = xmlInternalTreeParse("index.html")
Opening and ending tag mismatch: dd line 68 and dl
Opening and ending tag mismatch: li line 67 and body
Opening and ending tag mismatch: dt line 66 and html
Premature end of data in tag dd line 64
Premature end of data in tag li line 63
Premature end of data in tag dt line 62
Premature end of data in tag dl line 61
Premature end of data in tag body line 5
Premature end of data in tag html line 1
Error: 1: Opening and ending tag mismatch: dd line 68 and dl
2: Opening and ending tag mismatch: li line 67 and body
3: Opening and ending tag mismatch: dt line 66 and html
4: Premature end of data in tag dd line 64
5: Premature end of data in tag li line 63
6: Premature end of data in tag dt line 62
7: Premature end of data in tag dl line 61
8: Premature end of data in tag body line 5
9: Premature end of data in tag html line 1
Execution halted
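
A possible reading of the messages above, offered as a sketch rather than a
confirmed diagnosis: index.html is HTML, not well-formed XML. Tags such as
<dt>, <dd> and <li> may legally be left unclosed in HTML, but
xmlInternalTreeParse() applies strict XML rules and stops at the first
mismatch. The htmlTreeParse() function mentioned earlier in the thread uses a
more forgiving HTML parser. The snippet below uses a small made-up HTML string
(the `html` variable and its contents are assumptions, not the real page) to
show the difference.

```r
library(XML)

# Hypothetical stand-in for the downloaded page: <dt> and <dd> are never
# closed, which is acceptable HTML but not well-formed XML.
html <- "<html><body><dl>
<dt>Docs<dd><a href='shortIntro.html'>Short intro</a>
<dt>Other<dd><a href='Tour.pdf'>Tour</a>
</dl></body></html>"

# The strict XML parser is expected to reject this input, so wrap it in try().
strict <- try(xmlInternalTreeParse(html, asText = TRUE), silent = TRUE)
strict_failed <- inherits(strict, "try-error")

# The HTML parser tolerates the unclosed tags. useInternalNodes = TRUE returns
# an internal document that can be queried with XPath.
doc <- htmlTreeParse(html, asText = TRUE, useInternalNodes = TRUE)

# Collect the href attribute of every <a> element.
links <- xpathSApply(doc, "//a", xmlGetAttr, "href")
print(links)
```

For the file in the transcript, the analogous call would be
htmlTreeParse("index.html", useInternalNodes = TRUE) in place of
xmlInternalTreeParse("index.html").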

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
