On Dec 13, 2007 9:03 PM, Farrel Buchinsky <[EMAIL PROTECTED]> wrote:
> I would like to track in which journals articles about a particular disease
> are being published. Creating a pubmed search is trivial. The search
> provides data but obviously not as an R dataframe. I can get the search to
> export the data as an xml feed and the xml package seems to be able to read
> it.
>
> xmlTreeParse("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=0_JYbpsax0ZAAPnOd7nFAX-29fXDpTk5t8M4hx9ytT-", isURL=TRUE)
>
> But getting from there to a dataframe in which one column would be the name
> of the journal and another column would be the year (to keep things simple)
> seems to be beyond my capabilities.
>
> Has anyone ever done this and could you share your script? Are there any
> published examples where the end result is a dataframe.
>
> I guess what I am looking for is an easy and simple way to parse the feed
> and extract the data. Alternatively how does one turn an RSS feed into a CSV
> file?

Try this:

library(XML)
doc <- xmlTreeParse("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=0_JYbpsax0ZAAPnOd7nFAX-29fXDpTk5t8M4hx9ytT-",
    isURL = TRUE, useInternalNodes = TRUE)
sapply(c("//author", "//category"), xpathApply, doc = doc, fun = xmlValue)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
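
To get from the parsed feed to the journal/year data frame and CSV file asked about above, something along these lines might work. It is only a sketch: the inline feed is made-up sample data so the code runs without network access, and it assumes each <item> carries the journal name in <category> and a date in <pubDate>, which may not match what the PubMed feed actually returns. The names field, articles and articles.csv are illustrative, not part of the XML package.

```r
library(XML)

# Hypothetical two-item feed standing in for the PubMed RSS output.
# The <category>/<pubDate> layout is an assumption about the real feed.
rss <- '<rss version="2.0"><channel>
  <item>
    <title>First article</title>
    <category>Journal A</category>
    <pubDate>Mon, 10 Dec 2007 00:00:00 EST</pubDate>
  </item>
  <item>
    <title>Second article</title>
    <category>Journal B</category>
    <pubDate>Sat, 04 Nov 2006 00:00:00 EST</pubDate>
  </item>
</channel></rss>'

doc <- xmlParse(rss, asText = TRUE)

# Walk the items one by one so journal and year stay aligned
# even if an item lacks one of the two elements.
items <- getNodeSet(doc, "//item")
field <- function(node, name) {
  value <- xpathSApply(node, paste0("./", name), xmlValue)
  if (length(value) == 0) NA_character_ else value[[1]]
}
journal  <- vapply(items, field, character(1), name = "category")
pub_date <- vapply(items, field, character(1), name = "pubDate")

# Take the first four-digit run in each date string as the year.
year <- as.integer(sub(".*?([0-9]{4}).*", "\\1", pub_date, perl = TRUE))

articles <- data.frame(journal = journal, year = year,
                       stringsAsFactors = FALSE)

# Write the result out as CSV (to a temporary directory here).
write.csv(articles, file.path(tempdir(), "articles.csv"), row.names = FALSE)
```

For the live feed, the xmlParse() call would be replaced by the xmlTreeParse(..., useInternalNodes = TRUE) call from the reply, and table(articles$journal) would then give the number of articles per journal.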