Hi All,

A while ago I lodged ANY23-134 [0] with the intention of extending the Any23 paradigm to document formats other than subsets of XML. For example, I would like to read in PDF documents such as this one [1] or this one [2].

The idea would be to use Any23 (within a pipeline) to extract the specification data as triples. I could then build a triple-based representation of each document and use it for highly domain-specific inference.

Is ANY23-134 the correct way to go about this? Or should I be looking at some other existing tool we have within Any23? XPath immediately springs to mind, but I am not sure, and I would really appreciate a comment or two from anyone out there!
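To make the target output concrete, here is a rough sketch of the last step only: turning "label: value" lines (as they might come out of a PDF text extraction stage) into N-Triples. Everything in it is an assumption for illustration: the `SpecTriples` class is hypothetical and not part of Any23, the `http://example.org/` namespaces are placeholders, and the sample lines are made up rather than taken from the datasheets. It uses only the JDK and does not touch PDF parsing at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT part of Any23: maps "label: value" lines to N-Triples.
final class SpecTriples {
    // Placeholder vocabulary namespace, invented for this sketch.
    static final String VOCAB = "http://example.org/robot-spec#";

    // Escape backslashes and double quotes so the value is a valid N-Triples literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Derive a predicate IRI from a free-text label, e.g. "Max payload" -> ...#max_payload
    static String predicate(String label) {
        String local = label.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "_")   // collapse anything non-alphanumeric
                .replaceAll("^_+|_+$", "");      // trim leading/trailing underscores
        return VOCAB + local;
    }

    // One triple per line that looks like "label: value"; other lines are skipped.
    static List<String> toNTriples(String subjectIri, List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            if (idx <= 0) {
                continue; // no separator, or an empty label
            }
            String label = line.substring(0, idx);
            String value = line.substring(idx + 1).trim();
            if (value.isEmpty()) {
                continue; // label without a value
            }
            out.add("<" + subjectIri + "> <" + predicate(label) + "> \""
                    + escape(value) + "\" .");
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for extracted datasheet text.
        List<String> lines = List.of("Max payload: 10 kg", "Axes: 6");
        for (String triple : toNTriples("http://example.org/robot/sample", lines)) {
            System.out.println(triple);
        }
    }
}
```

In a real pipeline the interesting work would sit before this step (getting reliable label/value pairs out of the PDF layout) and after it (mapping labels onto an agreed vocabulary instead of generated predicate names).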
Thank you very much.

Lewis

[0] https://issues.apache.org/jira/browse/ANY23-134
[1] http://www.fanucrobotics.com/cmsmedia/datasheets/ARC%20Mate%20100iC%20Series_7.pdf
[2] http://www.fanucrobotics.com/cmsmedia/datasheets/ARC%20Mate%200iA_170.pdf

-- 
*Lewis*
