Re: ExtractMetadata error

Thad Humphries Thu, 09 Mar 2017 10:33:24 -0800

Yes, I can take a stab at that in a few days, after the crunch of my
current project abates. I'll let you know when it's on GitHub. Thanks.


On Thu, Mar 9, 2017 at 12:43 PM, Tilman Hausherr <[email protected]>
wrote:

> Can you create a minimal but fully working project with maven? I.e. we'd
> need code with main, and a pom. I mention this because an additional lib is
> needed, unless I misunderstood.
>
> Tilman
>
>
> Am 09.03.2017 um 16:51 schrieb Thad Humphries:
>
>> Here's my code. As I said, it is throwing an exception at "new
>> DomXmpParser()" and I have no idea why:
>>
>>    protected JSONObject getPdfMetadata(byte [] buffer)
>>        throws IOException, XmpParsingException, JSONException {
>>      ByteArrayInputStream bais = new ByteArrayInputStream(buffer);
>>
>>      JSONObject json = new JSONObject();
>>      PDDocument document = null;
>>      try {
>>        document = PDDocument.load(bais);
>>        PDDocumentCatalog catalog = document.getDocumentCatalog();
>>        PDMetadata meta = catalog.getMetadata();
>>
>>        if (meta != null) {
>>          DomXmpParser xmpParser = new DomXmpParser();  // throws exception
>>          XMPMetadata metadata = xmpParser.parse(meta.createInputStream());
>>
>>          DublinCoreSchema dc = metadata.getDublinCoreSchema();
>>          if (dc != null) {
>>            JSONObject dcj = new JSONObject();
>>            dcj.put("Title", dc.getTitle());
>>            dcj.put("Description", dc.getDescription());
>>            ...
>>            json.put("Dublin", dcj);
>>          }
>>    ...
>>
>> My goal is to return a JSON-formatted string to a browser and display the
>> formatted metadata to the user. So for now I'm working around the
>> DomXmpParser exception by simply converting the metadata to JSON with
>> JSON-java (https://github.com/stleary/JSON-java) and untangling the
>> namespaces, etc. on the browser side:
>>
>>        PDMetadata meta = catalog.getMetadata();
>>
>>        if (meta != null) {
>>          InputStream is = meta.exportXMPMetadata();
>>          ByteArrayOutputStream baos = new ByteArrayOutputStream();
>>          int read = 0;
>>          byte [] bytes = new byte[8*1024];
>>          while ((read = is.read(bytes)) != -1) {
>>            baos.write(bytes, 0, read);
>>          }
>>          String string = new String(baos.toByteArray());
>>          json = XML.toJSONObject(string);
>>      ...
>>
>>
>> On Wed, Mar 8, 2017 at 10:11 PM, Thad Humphries <[email protected]>
>> wrote:
>>
>>> When I run the org.apache.pdfbox.examples.pdmodel.ExtractMetadata
>>> example, it works. However when I put the same code into my class, it
>>> throws an exception when I call "DomXmpParser xmpParser = new
>>> DomXmpParser();"  The trace is:
>>>
>>> java.lang.AbstractMethodError: javax.xml.parsers.DocumentBuilderFactory.setFeature(Ljava/lang/String;Z)V
>>> at org.apache.xmpbox.xml.DomXmpParser.<init>(DomXmpParser.java:81)
>>> at com.jthad.util.image.MetadataExtractor.getPdfMetadata(MetadataExtractor.java:170)
>>> at com.jthad.util.image.TestMetadataExtractor.testPdf0(TestMetadataExtractor.java:41)
>>> ...
>>>
>>> Line 81 in DomXmpParser.java is
>>>
>>> dbFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
>>>
>>> I am at a loss to understand how "new DomXmpParser()" works from the
>>> command line but fails when called by a JUnit test in Eclipse.
>>> ...
>>>

-- 
"Hell hath no limits, nor is circumscrib'd In one self-place; but where we
are is hell, And where hell is, there must we ever be" --Christopher
Marlowe, *Doctor Faustus* (v. 121-24)
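
A note on the AbstractMethodError quoted above: one possible explanation,
which is an assumption and not something confirmed in this thread, is that the
JUnit run in Eclipse picks up a different (older) DocumentBuilderFactory
implementation from the test classpath than the command-line run does. The
sketch below uses only standard JAXP calls to print which implementation is
actually loaded and whether it implements setFeature. The class name
JaxpDiagnostics and its two methods are hypothetical, made up for this
illustration.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper class; not part of PDFBox or xmpbox.
public class JaxpDiagnostics {

    /** Returns the concrete factory class name and where it was loaded from. */
    public static String describeFactory() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        // Classes from the JDK itself usually have no code source.
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "JDK runtime (no code source)"
                : source.getLocation().toString();
        return impl.getName() + " loaded from " + location;
    }

    /** Returns false only if setFeature is missing from the implementation. */
    public static boolean supportsSetFeature() {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            // Same feature string as the call quoted from DomXmpParser.java:81.
            factory.setFeature(
                    "http://apache.org/xml/features/disallow-doctype-decl", true);
            return true;
        } catch (ParserConfigurationException e) {
            // The method exists; this implementation just rejects the feature.
            return true;
        } catch (AbstractMethodError e) {
            // The implementation predates setFeature: the error from the trace.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFactory());
        System.out.println("setFeature implemented: " + supportsSetFeature());
    }
}
```

Running this once from the command line and once from the failing JUnit test
should show whether the two environments resolve the same factory. If they
differ, the jar named in the "loaded from" part is the place to look.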
