Duncan Temple Lang wrote:
>
> Wacek Kusnierczyk wrote:
>> Don MacQueen wrote:
>>> I have an XML file that has within it the coordinates of some polygons
>>> that I would like to extract and use in R. The polygons are nested
>>> rather deeply. For example, I found by trial and error that I can
>>> extract the coordinates of one of them using functions from the XML
>>> package:
>>>
>>>   doc <- xmlInternalTreeParse('doc.kml')
>>>   docroot <- xmlRoot(doc)
>>>   pgon <-
>>
>> try
>>
>>     lapply(
>>        xpathSApply(doc, '//Polygon',
>>           xpathSApply, '//coordinates', function(node)
>>              strsplit(xmlValue(node), split=',|\\s+')),
>>        as.numeric)
>
> Just for the record, the xpath expression in the
> second xpathSApply would need to be
>    ".//coordinates"
> to start searching from the previously matched Polygon node.
> Otherwise, the search starts from the top of the document again.
>

not really: the xpath pattern '//coordinates' does say 'find all
coordinates nodes searching from the root', but the root here is not the
original root of the whole document; it is each polygon node in turn.

try:

    root = xmlInternalTreeParse('
    <root>
       <foo>
          <bar>1</bar>
       </foo>
       <foo>
          <bar>2</bar>
       </foo>
    </root>')

    xpathApply(root, '//foo', xpathSApply, '//bar', xmlValue)
    # equals list("1", "2"), not list(c("1", "2"), c("1", "2"))

this is not equivalent to

    xpathApply(root, '//foo', function(foo) xpathSApply(root, '//bar', xmlValue))

but to

    xpathApply(root, '//foo', function(foo) xpathSApply(foo, '//bar', xmlValue))

as the author of the XML package, you should know ;)

> However, it would seem that
>
>   xpathSApply(doc, "//Polygon//coordinates",
>               function(node) strsplit(.....))
>
> would be more direct, i.e. fetch the coordinates nodes in single
> XPath expression.

yes, in this case it would; i was not sure about the concrete schema. i
copied the code from my solution to some other problem, where a polygon
would have multiple coordinates nodes which would have to be merged in
some way for each polygon separately -- your solution would return the
content of each coordinates node separately, irrespective of whether it
is unique within the polygon (which it might well be in this particular
case, and thus your solution is undeniably more elegant).

vQ
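
for comparison, here is a minimal sketch of the two approaches discussed above: one result per polygon (merging all coordinates nodes beneath it) versus one result per coordinates node. the toy document and the helper parse_coords are made up for illustration; a real kml file usually declares a namespace, which this sketch does not handle. the inner search uses the explicitly relative './/coordinates', so the result should not depend on how '//' is resolved when applied to a node.

```r
library(XML)

# Hypothetical toy document; not the poster's actual file.
txt <- '<root>
  <Polygon><coordinates>1,2 3,4</coordinates></Polygon>
  <Polygon><coordinates>5,6</coordinates><coordinates>7,8</coordinates></Polygon>
</root>'
doc <- xmlParse(txt, asText = TRUE)

# Hypothetical helper: split one coordinates string on commas or
# whitespace and convert the pieces to numeric.
parse_coords <- function(s) {
  parts <- strsplit(trimws(s), ",|\\s+")[[1]]
  as.numeric(parts[nzchar(parts)])
}

# One numeric vector per Polygon, merging every coordinates node below it.
per_polygon <- xpathApply(doc, "//Polygon", function(pg) {
  vals <- xpathSApply(pg, ".//coordinates", xmlValue)
  unlist(lapply(vals, parse_coords))
})

# One numeric vector per coordinates node, regardless of its Polygon.
per_node <- xpathApply(doc, "//Polygon//coordinates",
                       function(node) parse_coords(xmlValue(node)))
```

with this toy input, per_polygon has one element per polygon and per_node has one element per coordinates node, so the two differ exactly when a polygon holds more than one coordinates node.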