Here's my code. As I said, it is throwing an exception at "new
DomXmpParser()" and I have no idea why:

protected JSONObject getPdfMetadata(byte[] buffer)
    throws IOException, XmpParsingException, JSONException {
  ByteArrayInputStream bais = new ByteArrayInputStream(buffer);
  JSONObject json = new JSONObject();
  PDDocument document = null;
  try {
    document = PDDocument.load(bais);
    PDDocumentCatalog catalog = document.getDocumentCatalog();
    PDMetadata meta = catalog.getMetadata();
    if (meta != null) {
      DomXmpParser xmpParser = new DomXmpParser(); // throws exception
      XMPMetadata metadata = xmpParser.parse(meta.createInputStream());
      DublinCoreSchema dc = metadata.getDublinCoreSchema();
      if (dc != null) {
        JSONObject dcj = new JSONObject();
        dcj.put("Title", dc.getTitle());
        dcj.put("Description", dc.getDescription());
        ...
        json.put("Dublin", dcj);
      }
      ...

My goal is to return a JSON-formatted string to a browser and display the
formatted metadata to the user. So for now I'm working around the
DomXmpParser exception by converting the metadata XML to JSON with
JSON-java (https://github.com/stleary/JSON-java) and untangling the
namespaces, etc. on the browser side:

PDMetadata meta = catalog.getMetadata();
if (meta != null) {
  InputStream is = meta.exportXMPMetadata();
  ByteArrayOutputStream baos = new ByteArrayOutputStream();
  int read = 0;
  byte[] bytes = new byte[8 * 1024];
  while ((read = is.read(bytes)) != -1) {
    baos.write(bytes, 0, read);
  }
  // Decode explicitly instead of relying on the platform default charset;
  // assumes the XMP packet is UTF-8 (java.nio.charset.StandardCharsets)
  String string = new String(baos.toByteArray(), StandardCharsets.UTF_8);
  json = XML.toJSONObject(string);
  ...

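One way to narrow down the AbstractMethodError in the trace quoted below
is to check which DocumentBuilderFactory implementation the JUnit run in
Eclipse actually loads. The sketch below is only a diagnostic. It rests on
the assumption (not confirmed by anything in this thread) that an older
JAXP/Xerces jar on the test classpath supplies a factory that predates
setFeature(String, boolean). The class name JaxpDiagnostics and its helper
methods are made up for illustration; it uses only JDK calls.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper, not part of PDFBox or xmpbox.
class JaxpDiagnostics {

    // Returns the jar or directory a class was loaded from. Classes loaded
    // by the bootstrap class loader usually report no code source at all.
    static String describeOrigin(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "no code source (typically the JDK itself)";
        }
        return src.getLocation().toString();
    }

    // Names the concrete factory that JAXP's lookup selects on the
    // current classpath, together with where it was loaded from.
    static String describeFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        return impl.getName() + " from " + describeOrigin(impl);
    }

    public static void main(String[] args) {
        System.out.println("API class:      "
                + describeOrigin(DocumentBuilderFactory.class));
        System.out.println("Implementation: " + describeFactory());
    }
}
```

Running main() once from the command line and once with the failing
test's classpath, then comparing the output, should show whether the two
environments pick different factories.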
On Wed, Mar 8, 2017 at 10:11 PM, Thad Humphries <[email protected]>
wrote:
> When I run the org.apache.pdfbox.examples.pdmodel.ExtractMetadata
> example, it works. However when I put the same code into my class, it
> throws an exception when I call "DomXmpParser xmpParser = new
> DomXmpParser();" The trace is:
>
> java.lang.AbstractMethodError: javax.xml.parsers.DocumentBuilderFactory.
> setFeature(Ljava/lang/String;Z)V
> at org.apache.xmpbox.xml.DomXmpParser.<init>(DomXmpParser.java:81)
> at com.jthad.util.image.MetadataExtractor.getPdfMetadata(
> MetadataExtractor.java:170)
> at com.jthad.util.image.TestMetadataExtractor.testPdf0(
> TestMetadataExtractor.java:41)
> ...
>
> Line 81 in DomXmpParser.java is
>
> dbFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl",
> true);
>
> I am at a loss to understand how "new DomXmpParser()" works from the
> command line but fails when called by a JUnit test in Eclipse.
>
> --
> "Hell hath no limits, nor is circumscrib'd In one self-place; but where we
> are is hell, And where hell is, there must we ever be" --Christopher
> Marlowe, *Doctor Faustus* (v. 121-24)
>