I am scraping data from a web page using the XML package (excellent package
BTW - that's scraping data the easy way!).
So far, I've got the code:
library(XML)
tables <- readHTMLTable(theurl)
rhf <- tables$tabResHistFull
div1 <- rhf[which(rhf$V1=="Div ps"),]
div1
which is giving me the result:
       V1 V2    V3    V4    V5    V6    V7          V8    V9   V10   V11
15 Div ps  p 32.31 35.64 40.17 42.55 45.13 46.36+17.22 51.11 55.72 70.78
     V12   V13   V14  V15
15 71.72 76.99 82.20 <NA>
I don't know a priori how many columns are in the table.
I want to be able to extract the numbers to the best of my ability, and check
to see if they form a monotonic sequence.
How do I do this?
If I type:
div2 <- div1[is.numeric(div1)]
div2
I get
data frame with 0 columns and 1 rows
OTOH, if I type
div2 <- as.numeric(div1)
div2
I get
[1] 9 4 15 16 18 20 17 21 21 18 24 24 8 8 NA
Huh??
What do I need to do? The data in V8 in this particular instance is
anomalous. I don't much care how it gets treated, although ideally I'd like
to get the number 46.36 out of it. Any sane solution will do, though.
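
For what it's worth, the integers returned by as.numeric(div1) look like factor level codes rather than the cell values, which would happen if the columns came back as factors. That is an assumption, not something verified against the actual page. Under that assumption, one possible approach is sketched below: convert each cell to character first, keep only the leading number of each cell, drop whatever fails to parse, and then test the differences. The data frame here is a hand-built stand-in for div1 using the row shown above, and the names row_values, cells, num_txt, vals and is_monotonic are made up for illustration.

```r
# Stand-in for the one-row data frame from the post (assumed to hold factors).
row_values <- c("Div ps", "p", "32.31", "35.64", "40.17", "42.55", "45.13",
                "46.36+17.22", "51.11", "55.72", "70.78", "71.72", "76.99",
                "82.20", NA)
div1 <- as.data.frame(
  setNames(as.list(row_values), paste0("V", seq_along(row_values))),
  stringsAsFactors = TRUE
)

# Go through character so that factor codes are never used as values.
cells <- vapply(div1, function(x) as.character(x)[1], character(1))

# Keep the leading number of each cell, e.g. "46.36" from "46.36+17.22".
# Cells with no leading number are left unchanged and become NA below.
num_txt <- sub("^[[:space:]]*([-+]?[0-9]*\\.?[0-9]+).*$", "\\1", cells)
vals <- suppressWarnings(as.numeric(num_txt))
vals <- vals[!is.na(vals)]

# Monotonic here means non-decreasing or non-increasing throughout.
d <- diff(vals)
is_monotonic <- all(d >= 0) || all(d <= 0)

print(vals)
print(is_monotonic)
```

Because the code loops over whatever columns are present, the number of columns does not need to be known in advance. For a strictly monotonic test, the comparisons could use > and < instead of >= and <=.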