I am converting a byte[] to a PDDocument and ran into a surprising problem:
some field values (not the fields themselves) were missing. I compared
PDFBox 2.0.23 with iText.
// The AcroForm is reached through the document catalog
PDAcroForm acroform = PDDocument.load(source, password).getDocumentCatalog().getAcroForm();
// Fields from iText
AcroFields itextFields = (new PdfReader(source, password)).getAcroFields();
for (PDField field : acroform.getFields()) // Fields from PDFBox
{
    String name = field.getFullyQualifiedName();
    // Print the value each library reports for the same field name
    System.out.println("Field " + name + " IText [" + itextFields.getField(name)
        + "] PDFBox [" + field.getValueAsString() + "]");
}
The result occasionally looked like
Field KEY IText [Value] PDFBox []
I expected it to be
Field KEY IText [Value] PDFBox [Value]
That particular PDF may contain several fields with the same key,
because I have not seen this problem with other PDFs.
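To check the duplicate-key suspicion, a minimal sketch along these lines could count how often each fully qualified field name occurs. It is plain Java and does not call PDFBox or iText: the class and method names (DuplicateFieldNames, findDuplicates) are made up for illustration, the list of names is assumed to have been collected beforehand (for example via field.getFullyQualifiedName() for every field), and the sample names in main are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of PDFBox or iText.
class DuplicateFieldNames {

    // Returns every name that occurs more than once, mapped to its number of occurrences.
    static Map<String, Integer> findDuplicates(List<String> names) {
        // Count occurrences, keeping first-seen order
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : names) {
            counts.merge(name, 1, Integer::sum);
        }
        // Keep only the names seen at least twice
        Map<String, Integer> duplicates = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.put(entry.getKey(), entry.getValue());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Made-up sample names; real ones would come from the PDF's form fields.
        List<String> names = Arrays.asList("form.name", "form.date", "form.name");
        System.out.println(findDuplicates(names));
    }
}
```

An empty result would suggest that duplicate keys are not the cause and that the missing values come from somewhere else.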
Is there a known bug in PDFBox 2.0.23 that causes this behaviour? Why
are PDFs created in C++ no longer readable in PDFBox? How can I fix
this? I do not want to use iText to solve it.