On 20.04.2017 at 15:01, Gabriel Pessoa wrote:
Hello.
Recently at our company we started to worry about how much memory was
being used during our PDF signing process. We are using version 1.8.13
for now, mostly because the loading time in 2.0.x got longer (I actually
asked about it some six months ago and Tilman explained the reason why).
I think this question on StackOverflow cleared up some doubts I had
about how PDFBox works:
http://stackoverflow.com/questions/22340674/performance-itext-vs-pdfbox
The main point being: PDFBox parses the PDF and keeps ALL of its objects
loaded. So, complex objects will use a lot of memory. Am I correct?
Yes
If that is the case, I understand that it is necessary for PDF
manipulation, but is it necessary for PDF signing? Looking at the
structure of a signed PDF, it looks like only the Root entry (to update
the AcroForm entry) and the signed page entry (to update the Annots
entry) are really needed for signing.
And the AcroForm field tree.
So would I be too wrong in suggesting a new load method that would be
used only for signing, one that would load only those necessary entries
and would not load things like images, fonts, tables, etc.?
If not that, could something akin to "lazy loading" be done? The PDF
objects would only be parsed and loaded when they are actually accessed.
In that case, the load would only map all the references.
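
To make the "lazy loading" idea concrete, here is a minimal sketch in plain Java. It is not PDFBox code: LazyObjectTable, its methods, and the String stand-in for a parsed object are all hypothetical names chosen for illustration. It assumes that loading only reads the map from object number to byte offset (what the cross-reference table provides), and that the actual parsing is a callback that runs the first time an object is requested.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical sketch of on-demand object loading; not part of PDFBox. */
final class LazyObjectTable {
    // Object number -> byte offset in the file, known after reading the xref.
    private final Map<Integer, Long> offsets;
    // Assumed parser callback: parses the object found at the given offset.
    private final LongFunction<String> parser;
    // Objects that have already been parsed, kept so each is parsed only once.
    private final Map<Integer, String> cache = new HashMap<>();

    LazyObjectTable(Map<Integer, Long> offsets, LongFunction<String> parser) {
        this.offsets = new HashMap<>(offsets);
        this.parser = parser;
    }

    /** Returns the object, parsing it on first access only. */
    String get(int objectNumber) {
        Long offset = offsets.get(objectNumber);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown object " + objectNumber);
        }
        String parsed = cache.get(objectNumber);
        if (parsed == null) {
            parsed = parser.apply(offset);
            cache.put(objectNumber, parsed);
        }
        return parsed;
    }

    /** Number of objects parsed so far; stays 0 until something is accessed. */
    int parsedCount() {
        return cache.size();
    }
}
```

With a table like this, signing would only call get(...) for the Root entry and the page being signed, so objects such as images and fonts would stay unparsed unless something else reaches them.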
If either of those two options is possible but you don't have anyone
currently available to work on it, I could try to develop that
solution. I would only need to know whether it would be better to use
the 2.0.6 branch or the 3.0.0 trunk.
Thank you very much for your time.
Andreas wrote that he's working on an on-demand parser.
Tilman