I am converting a byte[] to a PDDocument and ran into a surprising problem:
some field values (not the fields themselves) were missing. I compared
PDFBox 2.0.23 with iText.

PDDocument document = PDDocument.load(source, password);
PDAcroForm acroform = document.getDocumentCatalog().getAcroForm();

// Fields from iText (PdfReader takes the password as a byte[])
AcroFields itextFields = new PdfReader(source, password.getBytes()).getAcroFields();

// Fields from PDFBox
for (PDField field : acroform.getFields())
{
    String name = field.getFullyQualifiedName();
    System.out.println("Field " + name
            + " IText [" + itextFields.getField(name) + "]"
            + " PDFBox [" + field.getValueAsString() + "]");
}

The result was occasionally similar to

                Field KEY IText [Value] PDFBox []

I expected it to be

                Field KEY IText [Value] PDFBox [Value]

It might be that this particular PDF has fields with the same key,
because I did not experience the problem with other PDFs.
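
To check the duplicate-key suspicion, the fully qualified names printed in the loop above could be collected into a list and scanned for repeats. The following is only a minimal sketch under that assumption: `DuplicateFieldNames` and `findDuplicates` are hypothetical names of my own, not part of PDFBox or iText, and the helper works on plain strings only.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of PDFBox or iText.
class DuplicateFieldNames {

    // Returns every name that occurs more than once in the list,
    // in the order in which the first repeat of each name is seen.
    static Set<String> findDuplicates(List<String> names) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String name : names) {
            // Set.add returns false when the name was already present
            if (!seen.add(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

If the returned set contains KEY for the affected PDF, that would support the idea that the empty value is related to fields sharing a name.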

May I ask whether there is a known bug in PDFBox 2.0.23 that allows
such behaviour? Why are PDFs created in C++ no longer readable in
PDFBox? How can I fix this? I do not wish to use iText to solve it.


