I'm not sure it is possible to parse an XML file in R directly. Well, I guess it's *possible*, but may not be the best way to do it. ElementTree in Python is an easy-to-use parser that you might use to first parse your XML file (or others hierarchically structured data), organize it anyway you want, and then bring those data into R for subsequent analysis.
In fact, I have recently done just this. I have another statistical program that outputs data as an XML file. So, I wrote a python program that parses that XML file, pulls out the data of interest into a text file, and then I bring those data into R for analysis. > -----Original Message----- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Keith Alan > Chamberlain > Sent: Thursday, April 10, 2008 4:14 PM > To: r-help@r-project.org > Subject: [R] Relational Databases or XML? > > Dear R-Help, > > I am working on a paper in an R course for large file support > in R using scan(), relational databases, and XML. I have > never used SQL or heirarchical document formats such as XML > (except where it occurs without user interaction), and > knowledge in RDBs and XML is lacking in my program. I have > tried finding a working example for the novices-novice on the > topic, read many postings, the r-data I/O manual several > times, and descriptions of packages RODBC, DBI, XML, among > others. I understand that RDBs are (assumed at least) used > widely among the R community. I have not been able to put all > of the pieces together, but assuming that RDB use is actually > quite widespread, it should be quite easy to fill me in > and/or correct my understanding where necessary. > > For a cross-platform solution (PC/OSX at least, or in part) > my questions/problems are about what preliminary steps are > needed to get an SQL or XML query "to work" in R to begin > with, what the appropriate data-file formats are, and how to > convert to them if starting out with data in, say, a > delimited ASCII text file. Very basic examples should > suffice, say, a table with 20 random observations, a grouping > variable with 2 levels, and a factor with 2 levels. > > ## untested code > set.seed(1024) > write.table("junk.txt", > data.frame(Subj=c(rep(1,10),rep(2,10)),block=rep(c(rep(-1,5),r > ep(1,5)),2), obs=rnorm(20,0,1))) > > Specifically, > > 1- what are the minimum required non R components that are > needed to support SQL or XML functionality, which may or may > not need to be installed? > > 2- what R packages need to be installed, at a minimum (also > as a cross-PC/Mac solution if possible or at least as much as > possible) > > 3- I keep seeing reference to connections of a given name "if > previously setup". What kind of setup is needed outside of R, if any? > > 4- what steps are needed in R to then connect to a file and > import a subset based on a query? > > 5- Do I then use standard R routines (e.g. write()) to export > as a DB, or an RDB/XML specific function? > > Sincerely, > KeithC. [U.S] > > 1/k^c > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.