Hi Ben,

Thanks for the info. That is definitely a viable solution for the example I provided. More commonly, though, I have larger files with more edits to make.
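To make the workflow concrete, here is a rough sketch of what I have in mind, not a definitive implementation. It uses a three-row sample in the same form as my example, builds the data frame with the sapply/xmlAttrs approach Duncan suggested, and swaps in a replacement BRAND column. The `new_df` data frame is a hypothetical stand-in for what I would read from the .xlsx file (reading the spreadsheet itself is not shown), and the names `xml_txt`, `xml_df` and `new_df` are made up for the example.

```r
library(XML)

# Small sample in the same form as the real file (header omitted).
xml_txt <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
</data>'

doc <- xmlParse(xml_txt, asText = TRUE)

# One matrix row per <row> node, one column per attribute.
m <- t(sapply(xmlRoot(doc)["row"], xmlAttrs))
rownames(m) <- NULL                      # drop the repeated "row" labels
xml_df <- as.data.frame(m, stringsAsFactors = FALSE)

# ASSUMPTION: hypothetical stand-in for the column read from the .xlsx file.
new_df <- data.frame(BRAND = c("DODGE", "TOYOTA", "DODGE"),
                     stringsAsFactors = FALSE)

# Replace the "column" only if the two sources line up row for row.
stopifnot(nrow(new_df) == nrow(xml_df))
xml_df$BRAND <- new_df$BRAND

print(xml_df)
```

All attribute values come back as character, so any numeric conversion (for example on YEAR) would be a separate step with regular R commands.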

The main reason I went with a data frame approach is because of the file types I have. Essentially, I have an XML file with 50+ "rows" by 10 "columns" of data in a form similar to the provided example. Separately, I have an .xlsx file that contains 3 to 6 columns of data that must replace a particular "column" of the data in the XML file. So my approach was to have one XML data frame and one .xlsx data frame, combine them as necessary, and output a final XML file with the updated data. Apparently, it wasn't quite as trivial a problem as I was hoping.

On Wed, Jan 23, 2013 at 8:09 PM, Ben Tupper <btup...@bigelow.org> wrote:

> Hi Adam,
>
> On Jan 23, 2013, at 11:36 AM, Adam Gabbert wrote:
>
> Hello Gentlemen,
>
> I mistakenly sent the message twice, because the first time I didn't
> receive a notification message so I was unsure if it went through properly.
>
> Your solutions worked great. Thank you! I felt like I was fairly close
> but just couldn't quite get the final step.
>
> Now, I'm trying to reverse the process and account for my header.
>
> In other words I have my data frame in R:
>
> BRAND   NUM  YEAR  VALUE
> GMC     1    1999  10000
> FORD    2    2000  12000
> GMC     1    2001  12500
> etc........
>
> and I make some edits.
>
> BRAND   NUM  YEAR  VALUE
> DODGE   3    1999  10000
> TOYOTA  4    2000  12000
> DODGE   3    2001  12500
>
>
> You needn't transform to a data frame if all you need to do is tweak the
> values of some of the attributes. You can always set the attributes of
> each row node directly.
>
> library(XML)
>
> s <- c(" <data>",
>        " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"1999\" VALUE=\"10000\" />",
>        " <row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2000\" VALUE=\"12000\" />",
>        " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2001\" VALUE=\"12500\" />",
>        " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2008\" VALUE=\"22000\" />",
>        " </data>")
>
> # parse the text into internal nodes, which can be modified in place
> x <- xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE))
>
> node <- x["row"][[1]]   # the first <row> node
> node
> xmlAttrs(node) <- c(BRAND = "BUICK", NUM = "3", YEAR = "2000", VALUE = "0")
> node                    # the attributes of the node are now updated
> x
>
>
> So now I would need to output an XML file in the same format accounting
> for my header (essentially, add "z:" in front of row).
>
>
> I think that what you're describing is a namespace identifier. Check the
> XML package help for ?xmlNamespace. In particular, check this example on the
> help page.
>
> node <- xmlNode("arg", xmlNode("name", "foo"), namespace="R")
> xmlNamespace(node)
>
>
> Cheers,
> Ben
>
> (What I want to output)
>
> <data>
> <z:row BRAND="DODGE" NUM="3" YEAR="1999" VALUE="10000" />
> <z:row BRAND="TOYOTA" NUM="4" YEAR="2000" VALUE="12000" />
> <z:row BRAND="DODGE" NUM="3" YEAR="2001" VALUE="12500" />
> <z:row BRAND="TOYOTA" NUM="4" YEAR="2002" VALUE="13000" />
> <z:row BRAND="DODGE" NUM="3" YEAR="2003" VALUE="14000" />
> <z:row BRAND="TOYOTA" NUM="4" YEAR="2004" VALUE="17000" />
> <z:row BRAND="DODGE" NUM="3" YEAR="2005" VALUE="15000" />
> <z:row BRAND="DODGE" NUM="3" YEAR="1967" VALUE="PRICELESS" />
> <z:row BRAND="TOYOTA" NUM="4" YEAR="2007" VALUE="17500" />
> <z:row BRAND="DODGE" NUM="3" YEAR="2008" VALUE="22000" />
> </data>
>
> Thus far, from the help I've found online, I was trying to set up an xmlTree
>
> xml <- xmlTree()
>
> and use xml$addTag to create nodes and put in the data from my data
> frame. I feel like I'm not really even close to a solution, so I'm starting
> to believe that this might not be the best path to go down.
>
> Once again, any help is much appreciated.
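
For reference, a minimal sketch of this reverse step (data frame to `z:row` nodes) using newXMLNode() and saveXML() from the XML package rather than xmlTree(). That is a swapped-in approach, not something confirmed anywhere in this thread. The namespace URI `urn:example:rowset` is a placeholder I made up, since the real one is defined in the file header that isn't shown here; `df`, `z_uri` and `root` are likewise illustrative names.

```r
library(XML)

# Edited data frame; values kept as character so they are written as typed.
df <- data.frame(
  BRAND = c("DODGE", "TOYOTA", "DODGE"),
  NUM   = c("3", "4", "3"),
  YEAR  = c("1999", "2000", "2001"),
  VALUE = c("10000", "12000", "12500"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: placeholder URI; the real one comes from the file's header.
z_uri <- "urn:example:rowset"

# Declare the "z" prefix on the root so child nodes can refer to it.
root <- newXMLNode("data", namespaceDefinitions = c(z = z_uri))

for (i in seq_len(nrow(df))) {
  # Named character vector: one attribute per data frame column.
  attrs <- vapply(df[i, ], as.character, character(1))
  # namespace = "z" is resolved against the definition on the parent.
  newXMLNode("row", attrs = attrs, namespace = "z", parent = root)
}

xml_text <- saveXML(root)   # serialise the tree to a string
cat(xml_text, "\n")
```

Unlike the target output, the root element here carries an xmlns:z declaration; in the real file that declaration would presumably live in the header instead.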
>
> AG
>
>
> On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang <dtemplel...@ucdavis.edu> wrote:
>
>>
>> Hi Adam
>>
>> [You seem to have sent the same message twice to the mailing list.]
>>
>> There are various strategies/approaches to creating the data frame
>> from the XML.
>>
>> Perhaps the approach that most closely follows your approach is
>>
>>   xmlRoot(doc)[ "row" ]
>>
>> which returns a list of XML nodes whose node name is "row" that are
>> children of the root node <data>.
>>
>> So
>>
>>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>>
>> yields a matrix with as many columns as there are <row> nodes
>> and with 4 rows - one for each of the BRAND, NUM, YEAR and VALUE attributes.
>>
>> So
>>
>>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
>>
>> gives you a matrix with the correct rows and column orientation
>> and now you can turn that into a data frame, converting the
>> columns into numbers, etc. as you want with regular R commands
>> (i.e. independently of the XML).
>>
>>
>> D.
>>
>> On 1/22/13 1:43 PM, Adam Gabbert wrote:
>> > Hello,
>> >
>> > I'm attempting to read information from an XML into a data frame in R using
>> > the "XML" package. I am unable to get the data into a data frame as I would
>> > like. I have some sample code below.
>> >
>> > *XML Code:*
>> >
>> > Header...
>> >
>> > Data I want in a data frame:
>> >
>> > <data>
>> > <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000" />
>> > <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
>> > <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
>> > <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
>> > <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
>> > <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
>> > <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
>> > <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
>> > <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
>> > <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
>> > </data>
>> >
>> > *R Code:*
>> >
>> > library(XML)
>> > doc <- xmlInternalTreeParse("Sample2.xml")
>> > top <- xmlRoot(doc)
>> > xmlName(top)
>> > names(top)
>> > art <- top[["row"]]
>> > art
>> >
>> > *Output:*
>> >
>> > > art
>> > <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000"/>
>> >
>> >
>> > This is where I am having difficulties. I am unable to "access" additional
>> > rows (i.e. <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />)
>> > and I am unable to access the individual entries to actually create the
>> > data frame. The data frame I would like is as follows:
>> >
>> > BRAND   NUM  YEAR  VALUE
>> > GMC     1    1999  10000
>> > FORD    2    2000  12000
>> > GMC     1    2001  12500
>> > etc........
>> >
>> > Any help or suggestions would be appreciated. Conversely, my eventual goal
>> > would be to take a data frame and write it into an XML in the previously
>> > shown format.
>> >
>> > Thank you
>> >
>> > AG
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>
> Ben Tupper
> Bigelow Laboratory for Ocean Sciences
> 180 McKown Point Rd. P.O. Box 475
> West Boothbay Harbor, Maine 04575-0475
> http://www.bigelow.org