I am not sure whether the approaches below are any better.

sub(".*>(.*)<.*","\\1",thePrices)
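
Note that the sub() call above returns character strings that still include the "$". If numbers are wanted, as in the original question, one possible variation is to capture only the digits and the decimal point and then convert with as.numeric(). This is a minimal sketch on a three-element subset of thePrices; the name prices is only for illustration.

```r
# Three elements copied from the thePrices vector in the original post
thePrices <- c("id=\"p0\">$69.95</div>", "id=\"p1\">$44.95</div>",
               "id=\"p26\">$61.95</div>\"")

# "\\$" matches a literal dollar sign; the group captures digits and "."
prices <- as.numeric(sub(".*\\$([0-9.]+)<.*", "\\1", thePrices))
print(prices)
```

The pattern assumes each string holds exactly one price of the form $NN.NN followed by "<"; a string that does not match is returned unchanged by sub() and would become NA (with a warning) in as.numeric().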

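sapply(thePrices, function(x) {
  s <- gregexpr(pattern = '\\$', x)[[1]][1]  # position of the first "$"
  e <- gregexpr(pattern = '<', x)[[1]][1]    # position of the first "<"
  substr(x, s, e - 1)                        # text from "$" up to just before "<"
})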
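
On the question about the \"'s: they are probably not the problem. \" is only how a double quote is written inside an R string literal, and the string itself holds a single " character. A more likely cause of trouble is the "$", which is a regex metacharacter (end of string) and has to be escaped or matched with fixed = TRUE, which is why the pattern above uses '\\$'. A small check, assuming x is the first element of thePrices (the names pos_escaped and pos_fixed are made up for this sketch):

```r
# First element of thePrices from the original post
x <- "id=\"p0\">$69.95</div>"

# The escaped quote is one character in the string, not two
n_quote <- nchar("\"")

# Two ways to find the literal dollar sign
pos_escaped <- regexpr("\\$", x)[1]             # escape the metacharacter
pos_fixed   <- regexpr("$", x, fixed = TRUE)[1] # or turn regex matching off

# An unescaped "$" only matches the (empty) end of the string,
# so this substitution leaves x unchanged
unchanged <- identical(sub("$", "", x), x)

print(c(n_quote, pos_escaped, pos_fixed))
print(unchanged)
```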


On Wed, Jan 29, 2014 at 9:29 AM, Keith S Weintraub <kw1...@gmail.com> wrote:

> Folks,
>
> I got the following prices by scraping a web page just for my own
> edification:
>
> thePrices<-
> c("id=\"p0\">$69.95</div>", "id=\"p1\">$44.95</div>",
> "id=\"p2\">$69.95</div>",
> "id=\"p3\">$59.95</div>", "id=\"p4\">$69.95</div>",
> "id=\"p5\">$79.95</div>",
> "id=\"p6\">$89.95</div>", "id=\"p7\">$59.95</div>",
> "id=\"p8\">$59.95</div>",
> "id=\"p9\">$79.95</div>", "id=\"p10\">$79.95</div>",
> "id=\"p11\">$89.95</div>",
> "id=\"p12\">$89.95</div>", "id=\"p13\">$79.95</div>",
> "id=\"p14\">$89.95</div>",
> "id=\"p15\">$79.95</div>", "id=\"p16\">$39.95</div>",
> "id=\"p17\">$59.95</div>",
> "id=\"p18\">$69.95</div>", "id=\"p19\">$83.95</div>",
> "id=\"p20\">$73.95</div>",
> "id=\"p21\">$83.95</div>", "id=\"p22\">$93.95</div>",
> "id=\"p23\">$87.95</div>",
> "id=\"p24\">$91.95</div>", "id=\"p25\">$99.95</div>",
> "id=\"p26\">$61.95</div>\""
> )
>
> Using lapply, strsplit (twice), unlist, etc., I was able to get the result
> I wanted (the prices as numbers, e.g. 59.95), but I am sure that there is a
> much better way that someone might be able to point out for me.
>
> Note that I tried various regexes which didn't work.
>
> Is part of the difficulty that the strings in thePrices have multiple \"'s
> in them?
>
> Thanks for your time,
> Best,
> KW
>
> --
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


