Hi,
In addition, you could also do:
gsub(".*www\\.([[:alnum:]]+\\.[[:alnum:]]+).*","\\1",url)
#[1] "mdd.com" "mdd.com" "mdd.com" "genius.com" "google.com"
gsub(".*www\\.([[:alnum:]]+\\.[[:alnum:]]+).*","\\1",url2)
#[1] "mdd.com" "mdd.com" "mdd.edu" "genius.gov" "google.com"
gs
Try:
gsub(".*\\.com","",url)
#[1] "/food/pizza/index.html" "/build-your-own/index.html"
#[3] "/special-deals.html" "/find-a-location.html"
#[5] "/hello.html"
gsub(".*www\\.([[:alpha:]]+\\.com).*","\\1",url)
#[1] "mdd.com" "mdd.com" "mdd.com" "genius.com" "go
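Both patterns above assume a "www." prefix and (in the second case) a ".com" domain. A minimal base-R sketch that drops those assumptions is below. The vector name `urls` is hypothetical, and only the three URLs that appear in full in this thread are used.

```r
# Hypothetical input: the three URLs that appear in full in this thread.
urls <- c("http://www.mdd.com/food/pizza/index.html",
          "http://www.mdd.com/build-your-own/index.html",
          "http://www.mdd.com/special-deals.html")

# Host: skip the scheme and an optional "www.", then keep everything
# up to the first "/" (second capture group).
host <- sub("^[[:alpha:]]+://(www\\.)?([^/]+).*$", "\\2", urls)

# Path: strip the scheme and host, keep whatever follows.
path <- sub("^[[:alpha:]]+://[^/]+", "", urls)

data.frame(host = host, path = path, stringsAsFactors = FALSE)
```

This only handles plain scheme://host/path strings; anything with ports, user info, or query strings is probably better left to a real URL parser.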
Hi,
The XML package has a handy function, parseURI(), that neatly slices and dices
the URL.
library(XML)
parseURI('http://www.mdd.com/food/pizza/index.html')
Might that help?
Cheers,
Ben
On Mar 6, 2014, at 12:23 PM, Abraham Mathew wrote:
> Let's say that I have the following character vecto
Oh, that's perfect. I can just use one of the apply functions to run that
on each url and then extract the pieces that I need.
Thanks!
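For the archive, a rough sketch of what that could look like, assuming the XML package is installed. The names `urls`, `parts`, `servers` and `paths` are made up for illustration, and only the URLs shown in full in this thread are used.

```r
library(XML)

# Hypothetical input: URLs that appear in full in this thread.
urls <- c("http://www.mdd.com/food/pizza/index.html",
          "http://www.mdd.com/build-your-own/index.html",
          "http://www.mdd.com/special-deals.html")

# parseURI() handles one URL at a time, so loop over the vector.
parts <- lapply(urls, parseURI)

# Pull the server and path components out of each parsed URI.
servers <- vapply(parts, function(p) p$server, character(1))
paths   <- vapply(parts, function(p) p$path, character(1))

data.frame(server = servers, path = paths, stringsAsFactors = FALSE)
```

vapply() is used instead of sapply() so that a malformed result raises an error rather than silently changing the return type.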
On Thu, Mar 6, 2014 at 11:52 AM, Ben Tupper wrote:
> Hi,
>
> The XML package has a handy function, parseURI(), that neatly slices and
> dices the URL.
>
> libr
See the parse_url function in the httr package. It does all this and more.
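A minimal sketch, assuming httr is installed; the variable name `parsed` is just for illustration:

```r
library(httr)

# Parse a single URL into its components.
parsed <- parse_url("http://www.mdd.com/food/pizza/index.html")

parsed$scheme    # protocol part of the URL
parsed$hostname  # host part of the URL
parsed$path      # path part of the URL
```

parse_url() takes one URL at a time, so for a whole vector it would need to be wrapped in lapply() or similar.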
On Mar 6, 2014 2:45 PM, "Sarah Goslee" wrote:
> There are many ways to do this. Here's a simple version and a slightly
> fancier version:
>
>
> url = c("http://www.mdd.com/food/pizza/index.html",
> "http://www.mdd.com/bui
There are many ways to do this. Here's a simple version and a slightly
fancier version:
url = c("http://www.mdd.com/food/pizza/index.html",
"http://www.mdd.com/build-your-own/index.html",
"http://www.mdd.com/special-deals.html",
"http://www.genius.com/find-a-location.html",
"http://www.google
Let's say that I have the following character vector with a series of url
strings. I'm interested in extracting some information from each string.
url = c("http://www.mdd.com/food/pizza/index.html",
"http://www.mdd.com/build-your-own/index.html",
"http://www.mdd.com/special-deals.html"