Hi Stavros
xmlToDataFrame() is very generic and so doesn't know anything
about the particulars of the XML it is processing. If you know
something about the structure of the XML, you should be able to leverage that
for performance.
xmlToDataFrame() is also not optimized, as it is just a convenience
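
To illustrate what leveraging the structure could look like, here is a minimal sketch, not the poster's actual code. It assumes a hypothetical flat layout in which every <record> node has exactly one each of <id>, <name> and <value>; those element names and the sample data are invented for the example. When the field names are known in advance, one XPath query per column can stand in for the generic node-by-node discovery that xmlToDataFrame() performs.

```r
library(XML)

# Hypothetical sample document; the real file's element names will differ.
xml_text <- '<records>
  <record><id>1</id><name>alpha</name><value>3.5</value></record>
  <record><id>2</id><name>beta</name><value>4.25</value></record>
</records>'

doc <- xmlParse(xml_text, asText = TRUE)

# Known field names (an assumption of this sketch).
fields <- c("id", "name", "value")

# One XPath query per column: extract the text of that field for all records.
# This is only safe if every record contains every field exactly once;
# otherwise the columns end up with different lengths or misaligned rows.
cols <- lapply(fields, function(f) {
  xpathSApply(doc, paste0("/records/record/", f), xmlValue)
})
names(cols) <- fields

df <- as.data.frame(cols, stringsAsFactors = FALSE)

# Convert columns whose types are known ahead of time.
df$id <- as.integer(df$id)
df$value <- as.numeric(df$value)

# Release the C-level document once the values have been copied out.
free(doc)
```

For a file on disk, xmlParse() would take the file path instead of asText = TRUE. Whether this is faster on a given 52 MB file would need to be measured.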
I have a modest-size XML file (52 MB) in a format suited to xmlToDataFrame()
(package XML).
I have successfully read it into R by splitting the file 10 ways, then
running xmlToDataFrame() on each part, then rbind.fill() (package plyr) on the
results. This takes about 530 s total, and results in a data.fram