[R] htmlParse (from XML library) working sporadically in the same code

Andre Zege Wed, 20 Mar 2013 10:08:41 -0700

I am using htmlParse from the XML library on a particular website. Sometimes the 
code fails and sometimes it works; most of the time it doesn't, and I cannot see 
why. The page I am trying to parse is


http://www.londonstockexchange.com/exchange/prices-and-markets/international-markets/indices/home/sp-500.html?page=0


Sometimes the following code works (url holds the address above):
n <- readHTMLTable(htmlParse(url))


But most of the time it returns the following error from htmlParse:

Error: failed to load HTTP resource
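
Because the failure is intermittent, one stopgap is to repeat the call a few times before giving up. The sketch below is only an illustration: retry is a hypothetical helper, not part of the XML package, and the number of attempts and the pause between them are arbitrary assumptions. It does not explain why the load fails.

```r
# Hypothetical helper (not part of the XML package): call f() until it
# succeeds or the attempts run out, pausing between failed attempts.
retry <- function(f, tries = 5, wait = 1) {
  last_error <- NULL
  for (i in seq_len(tries)) {
    # Turn an R error into a value so the loop can inspect it.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    last_error <- result
    if (i < tries) Sys.sleep(wait)
  }
  stop("all ", tries, " attempts failed; last error: ",
       conditionMessage(last_error))
}

# Usage with the call from this post (needs the XML package and network):
# n <- retry(function() readHTMLTable(htmlParse(url)))
```

If every attempt fails, the last error message is reported, so the original "failed to load HTTP resource" text is not lost.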


The error comes from the following line in the htmlParse code:
 
  ans <- .Call("RS_XML_ParseTree", as.character(file), handlers, 
as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), 
as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), 
as.logical(addAttributeNamespaces), as.logical(useInternalNodes), 
as.logical(isHTML), as.logical(isSchema), as.logical(fullNamespaceInfo), 
as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, 
as.integer(options), PACKAGE = "XML")
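
Since the call above passes isURL to the C routine, the download appears to happen inside the parser itself. One way to tell a download problem from a parsing problem is to fetch the page text separately and hand it to htmlParse as a string. This is a sketch under assumptions: it relies on the RCurl package being installed, and the function names fetch_tables and parse_tables are made up for illustration.

```r
# Parse HTML that is already in memory and extract its tables.
# asText = TRUE tells htmlParse the argument is content, not a file or URL.
parse_tables <- function(html) {
  doc <- XML::htmlParse(html, asText = TRUE)
  XML::readHTMLTable(doc)
}

# Download the page with RCurl (an assumption; any HTTP client would do),
# then parse the downloaded text. A failure here is a download failure.
fetch_tables <- function(url) {
  html <- RCurl::getURL(url, followlocation = TRUE)
  parse_tables(html)
}

# Usage (needs the RCurl and XML packages and network access):
# n <- fetch_tables(url)
```

If getURL fails in the same sporadic way, the problem lies with the connection or the server rather than with the parsing step.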



By the way, readHTMLTable(htmlParse(url)) works fine on other pages, so the 
problem is somehow related to this page. 

I am using 64-bit R 2.15.3 on a Windows machine.

Thanks much
Andre

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
