Hi Ben,

Thanks for the info. That is definitely a viable solution for the example I
provided.  More commonly, though, I have larger files with more edits to
make.

The main reason I went with a data frame approach is the file types I
have.  Essentially, I have an XML file with 50+ "rows" by 10 "columns" of
data in a form similar to the provided example.  Separately, I have an
.xlsx file that contains 3 to 6 columns of data that must replace a
particular "column" of the data in the XML file.  So my approach was
to have one XML data frame and one .xlsx data frame, combine them as
necessary, and output a final XML file with the updated data. Apparently, it
wasn't quite as trivial a problem as I was hoping.
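
For what it's worth, the "combine them as necessary" step can be sketched in base R once both sources are data frames. This is only an illustration: xml_df and xlsx_df are made-up stand-ins (the spreadsheet would have to be read with a separate package, which is not shown), and matching rows on YEAR is an assumption about how the two tables line up.

```r
# Stand-in for the data frame built from the XML <row> attributes.
xml_df <- data.frame(
  BRAND = c("GMC", "FORD", "GMC"),
  NUM   = c(1L, 1L, 1L),
  YEAR  = c(1999L, 2000L, 2001L),
  VALUE = c(10000, 12000, 12500),
  stringsAsFactors = FALSE
)

# Stand-in for the data frame read from the .xlsx file (hypothetical values).
xlsx_df <- data.frame(
  YEAR  = c(2001L, 1999L),
  VALUE = c(13000, 11000)
)

# For each XML row, find the spreadsheet row with the same YEAR (NA if none).
idx <- match(xml_df$YEAR, xlsx_df$YEAR)
has_new <- !is.na(idx)

# Overwrite VALUE only where the spreadsheet supplies a replacement.
xml_df$VALUE[has_new] <- xlsx_df$VALUE[idx[has_new]]
```

Rows without a match in the spreadsheet keep their original VALUE, which seems the safer default when only some rows are being edited.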


On Wed, Jan 23, 2013 at 8:09 PM, Ben Tupper <btup...@bigelow.org> wrote:

> Hi Adam,
>
>  On Jan 23, 2013, at 11:36 AM, Adam Gabbert wrote:
>
>  Hello Gentlemen,
>
> I mistakenly sent the message twice, because the first time I didn't
> receive a notification message so I was unsure if it went through properly.
>
> Your solutions worked great. Thank you!  I felt like I was fairly close
> but just couldn't quite get the final step.
>
> Now, I'm trying to reverse the process and account for my header.
>
> In other words I have my data frame in R:
>
> BRAND    NUM    YEAR    VALUE
> GMC        1          1999      10000
> FORD       2          2000      12000
> GMC        1          2001       12500
>      etc........
> and I make some edits.
> BRAND    NUM    YEAR    VALUE
> DODGE       3          1999      10000
> TOYOTA       4         2000      12000
> DODGE        3          2001       12500
>
>
> You needn't transform to a data frame if all you need to do is tweak the
> values of some of the attributes.  You can always set the attributes of
> each row node directly.
>
>  library(XML)
>
>  s <- c("  <data>",
> " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"1999\" VALUE=\"10000\" />",
> " <row BRAND=\"FORD\" NUM=\"1\" YEAR=\"2000\" VALUE=\"12000\" />",
> " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2001\" VALUE=\"12500\" />",
> " <row BRAND=\"GMC\" NUM=\"1\" YEAR=\"2008\" VALUE=\"22000\" />",
> " </data>")
>
> # useInternalNodes = TRUE gives nodes that are modified in place
> x <- xmlRoot(xmlTreeParse(s, asText = TRUE, useInternalNodes = TRUE))
>
> node <- x["row"][[1]]   # first <row> child of <data>
> node                    # before the edit
> xmlAttrs(node) <- c(BRAND = "BUICK", NUM = "3", YEAR = "2000", VALUE = "0")
> node                    # after the edit
> x                       # the whole tree reflects the change
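
The same idea should extend to every row. A minimal sketch, assuming the edits sit in a data frame (the made-up edits below) with one row per <row> node, in document order, and that only BRAND and NUM need to change:

```r
library(XML)

xml_text <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
</data>'
root <- xmlRoot(xmlParse(xml_text, asText = TRUE))

# Hypothetical edits, one row per <row> node.
edits <- data.frame(
  BRAND = c("DODGE", "TOYOTA"),
  NUM   = c("3", "4"),
  stringsAsFactors = FALSE
)

nodes <- root["row"]
stopifnot(length(nodes) == nrow(edits))

for (i in seq_along(nodes)) {
  node <- nodes[[i]]
  # Internal nodes are references, so this updates the tree itself;
  # attributes not named here (YEAR, VALUE) are left alone.
  xmlAttrs(node) <- c(BRAND = edits$BRAND[i], NUM = edits$NUM[i])
}

root
```

This relies on the nodes being internal (C-level) nodes, as in the example above; with plain R-level nodes the loop would only change copies.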
>
>
>
>  So now I would need to output an XML file in the same format, accounting
> for my header (essentially, add "z:" in front of row).
>
>
>
> I think that what you're describing is a namespace prefix.  See
> ?xmlNamespace in the XML package help; in particular, check this example
> on the help page.
>
>   node <- xmlNode("arg", xmlNode("name", "foo"), namespace="R")
>   xmlNamespace(node)
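
As a small variation on that help-page example, here is the tag and prefix from this thread. It is a sketch only: xmlNode() builds R-level nodes, and no namespace URI is declared here, so it just shows where the prefix is stored.

```r
library(XML)

# An R-level <row> node carrying the "z" prefix and two attributes.
node <- xmlNode("row", attrs = c(BRAND = "DODGE", NUM = "3"), namespace = "z")

xmlNamespace(node)   # the prefix stored on the node
xmlAttrs(node)       # the named character vector of attributes
print(node)          # the printed tag should carry the z: prefix
```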
>
>
>
> Cheers,
> Ben
>
>  (What I want to output)
> >   <data>
> >   <z:row BRAND="DODGE" NUM="3" YEAR="1999" VALUE="10000" />
> >   <z:row BRAND="TOYOTA" NUM="4" YEAR="2000" VALUE="12000" />
> >   <z:row BRAND="DODGE" NUM="3" YEAR="2001" VALUE="12500" />
> >   <z:row BRAND="TOYOTA" NUM="4" YEAR="2002" VALUE="13000" />
> >   <z:row BRAND="DODGE" NUM="3" YEAR="2003" VALUE="14000" />
> >   <z:row BRAND="TOYOTA" NUM="4" YEAR="2004" VALUE="17000" />
> >   <z:row BRAND="DODGE" NUM="3" YEAR="2005" VALUE="15000" />
> >   <z:row BRAND="DODGE" NUM="3" YEAR="1967" VALUE="PRICELESS" />
> >   <z:row BRAND="TOYOTA" NUM="4" YEAR="2007" VALUE="17500" />
> >   <z:row BRAND="DODGE" NUM="3" YEAR="2008" VALUE="22000" />
> >   </data>
> Thus far, based on the help I've found online, I have been trying to set
> up an xmlTree:
>
>   xml <- xmlTree()
>
> and use xml$addTag to create nodes and put in the data from my data
> frame.  I feel like I'm not really even close to a solution so I'm starting
> to believe that this might not be the best path to go down.
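
One possible alternative to xml$addTag, sketched under assumptions: newXMLNode() from the same package can build the rows directly from the data frame. The namespace URI below is a placeholder made up for the example; the real one would have to come from the file's header, which isn't shown in this thread.

```r
library(XML)

# Edited data, kept as character so the attributes are written verbatim.
d <- data.frame(
  BRAND = c("DODGE", "TOYOTA"),
  NUM   = c("3", "4"),
  YEAR  = c("1999", "2000"),
  VALUE = c("10000", "12000"),
  stringsAsFactors = FALSE
)

# Root node; the "z" prefix has to be declared at or above the rows.
# "urn:example:rowset" is a placeholder URI, not the real one.
root <- newXMLNode("data", namespaceDefinitions = c(z = "urn:example:rowset"))

for (i in seq_len(nrow(d))) {
  # unlist() turns one data frame row into a named character vector,
  # which becomes the attributes of the new <z:row> node.
  newXMLNode("row", attrs = unlist(d[i, ]), namespace = "z", parent = root)
}

out <- saveXML(root)
cat(out, "\n")
```

In this sketch the xmlns:z declaration ends up on <data> itself. If the real header already declares the prefix on an enclosing node, the rows would presumably need to be attached under that node instead.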
>
> Once again, any help is much appreciated.
>
> AG
>
>
> On Tue, Jan 22, 2013 at 6:04 PM, Duncan Temple Lang <
> dtemplel...@ucdavis.edu> wrote:
>
>>
>> Hi Adam
>>
>>  [You seem to have sent the same message twice to the mailing list.]
>>
>> There are various strategies/approaches to creating the data frame
>> from the XML.
>>
>> Perhaps the approach that most closely follows your approach is
>>
>>   xmlRoot(doc)[ "row" ]
>>
>> which  returns a list of XML nodes whose node name is "row" that are
>> children of the root node <data>.
>>
>> So
>>   sapply(xmlRoot(doc) [ "row" ], xmlAttrs)
>>
>> yields a matrix with as many columns as there are <row> nodes
>> and with 4 rows - one for each of the BRAND, NUM, YEAR and VALUE attributes.
>>
>> So
>>
>>   d = t( sapply(xmlRoot(doc) [ "row" ], xmlAttrs) )
>>
>> gives you a matrix with the correct rows and column orientation
>> and now you can turn that into a data frame, converting the
>> columns into numbers, etc. as you want with regular R commands
>> (i.e. independently of the XML).
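
Put together, those steps might look like the following sketch. The XML is inlined as text so the example is self-contained (the thread reads Sample2.xml instead), and the choice of which columns to convert is an assumption.

```r
library(XML)

xml_text <- '<data>
  <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000" />
  <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
</data>'
doc <- xmlParse(xml_text, asText = TRUE)

# sapply() gives one column per <row> node and one row per attribute;
# t() flips that into the usual one-row-per-record layout.
m <- t(sapply(xmlRoot(doc)["row"], xmlAttrs))
rownames(m) <- NULL   # every row would otherwise be named "row"

d <- as.data.frame(m, stringsAsFactors = FALSE)

# Attributes arrive as character strings; convert the clearly numeric ones.
d$NUM  <- as.integer(d$NUM)
d$YEAR <- as.integer(d$YEAR)
# VALUE is left as character because it contains a non-numeric entry.

d
```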
>>
>>
>>  D.
>>
>> On 1/22/13 1:43 PM, Adam Gabbert wrote:
>> >  Hello,
>> >
>> > I'm attempting to read information from an XML into a data frame in R
>> > using the "XML" package. I am unable to get the data into a data frame
>> > as I would like.  I have some sample code below.
>> >
>> > *XML Code:*
>> >
>> > Header...
>> >
>> > Data I want in a data frame:
>> >
>> >    <data>
>> >   <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000" />
>> >   <row BRAND="FORD" NUM="1" YEAR="2000" VALUE="12000" />
>> >   <row BRAND="GMC" NUM="1" YEAR="2001" VALUE="12500" />
>> >   <row BRAND="FORD" NUM="1" YEAR="2002" VALUE="13000" />
>> >   <row BRAND="GMC" NUM="1" YEAR="2003" VALUE="14000" />
>> >   <row BRAND="FORD" NUM="1" YEAR="2004" VALUE="17000" />
>> >   <row BRAND="GMC" NUM="1" YEAR="2005" VALUE="15000" />
>> >   <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />
>> >   <row BRAND="FORD" NUM="1" YEAR="2007" VALUE="17500" />
>> >   <row BRAND="GMC" NUM="1" YEAR="2008" VALUE="22000" />
>> >   </data>
>> >
>> > *R Code:*
>> >
>> > library(XML)
>> > doc <- xmlInternalTreeParse("Sample2.xml")
>> > top <- xmlRoot(doc)
>> > xmlName(top)
>> > names(top)
>> > art <- top[["row"]]   # [[ returns only the first matching child
>> > art
>> >
>> > *Output:*
>> >
>> > > art
>> > <row BRAND="GMC" NUM="1" YEAR="1999" VALUE="10000"/>
>> >
>> >
>> > This is where I am having difficulties.  I am unable to "access"
>> > additional rows (i.e.  <row BRAND="GMC" NUM="1" YEAR="1967" VALUE="PRICLESS" />)
>> >
>> > and I am unable to access the individual entries to actually create the
>> > data frame.  The data frame I would like is as follows:
>> >
>> > BRAND    NUM    YEAR    VALUE
>> > GMC        1          1999      10000
>> > FORD       2          2000      12000
>> > GMC        1          2001       12500
>> >     etc........
>> >
>> > Any help or suggestions would be appreciated.  Conversely, my eventual
>> > goal would be to take a data frame and write it into an XML file in the
>> > previously shown format.
>> >
>> > Thank you
>> >
>> > AG
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>>
>>
>
>
>    Ben Tupper
>  Bigelow Laboratory for Ocean Sciences
> 180 McKown Point Rd. P.O. Box 475
> West Boothbay Harbor, Maine   04575-0475
> http://www.bigelow.org
>