Re: [R] Analyzing Publications from Pubmed via XML

Rajarshi Guha Thu, 13 Dec 2007 18:13:36 -0800

On Dec 13, 2007, at 9:03 PM, Farrel Buchinsky wrote:

> I would like to track in which journals articles about a particular disease
> are being published. Creating a pubmed search is trivial. The search
> provides data but obviously not as an R dataframe. I can get the search to
> export the data as an xml feed and the xml package seems to be able to read
> it.
>
> xmlTreeParse("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=0_JYbpsax0ZAAPnOd7nFAX-29fXDpTk5t8M4hx9ytT-",
>              isURL=TRUE)
>
> But getting from there to a dataframe in which one column would be the name
> of the journal and another column would be the year (to keep things simple)
> seems to be beyond my capabilities.


If you're comfortable with Python (or Perl, Ruby, etc.), it'd be easier
to extract the required fields from the raw feed directly. Using
ElementTree in Python makes this a trivial task.

Once you have the extracted data, you can read it into R.
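
As a rough sketch of the ElementTree approach: the snippet below pulls a
journal name and a year out of each feed item and writes them as CSV. I have
not checked it against the actual PubMed feed, so the element names are
assumptions: it supposes each <item> carries the journal in <category> and
the date in an RFC 822 style <pubDate>. SAMPLE_FEED, extract_rows and
rows_to_csv are made-up names for illustration; adjust the tags to whatever
the real feed contains.

```python
import csv
import io
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical stand-in for the downloaded feed. The real PubMed feed may
# use different element names; <category> and <pubDate> are assumptions.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example search</title>
    <item>
      <title>First article</title>
      <category>Journal A</category>
      <pubDate>Thu, 13 Dec 2007 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second article</title>
      <category>Journal B</category>
      <pubDate>Fri, 01 Dec 2006 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def extract_rows(feed_xml):
    """Return a list of (journal, year) tuples, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.iter("item"):
        journal = item.findtext("category", default="").strip()
        pub_date = item.findtext("pubDate", default="").strip()
        # Take the year from the RFC 822 date; leave it empty if missing.
        year = parsedate_to_datetime(pub_date).year if pub_date else None
        rows.append((journal, year))
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["journal", "year"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_FEED)), end="")
```

If the output is saved to a file, read.csv() in R should load it as a data
frame with one column for the journal and one for the year.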

-------------------------------------------------------------------
Rajarshi Guha  <[EMAIL PROTECTED]>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
A committee is a group that keeps the minutes and loses hours.
        -- Milton Berle

