Hi all,

I have an HTTP endpoint returning unencrypted PDFs, with no master or user
password set. My task is to perform two simple customizations:

1) set pre-defined access permissions

2) encrypt with pre-defined master and user passwords

and then deliver the file to a variety of destinations: SMTP, FTP, or
HTTP.

I want to minimize JVM heap size and am considering the tempFile option for
the scratch file.

But on second thought, I feel that it should be possible to do this
"in-flight". From my reading of the PDF spec, the inputs to the encryption
algorithm seem to be the passwords (known in advance) and the file ID (I'm
OK with discarding whatever comes in the trailer and pre-producing one
from known business identifiers).
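
For illustration, a pre-produced file ID might be derived along these
lines. This is only a sketch: the class `FileIdSketch`, its methods, and
the sample identifiers are hypothetical, and MD5 is used here merely as a
convenient way to get 16 stable bytes, not because anything requires it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileIdSketch {

    // Hypothetical helper: hashes business identifiers into a 16-byte value
    // that can be decided before any byte of the PDF has been read.
    static byte[] fileIdFrom(String... businessIdentifiers) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String id : businessIdentifiers) {
                md5.update(id.getBytes(StandardCharsets.UTF_8));
                md5.update((byte) 0); // separator so ("ab","c") differs from ("a","bc")
            }
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Formats the bytes as a PDF hex string such as <0aff>, the form used in /ID.
    static String toPdfHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder("<");
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        // Sample identifiers are made up for the demonstration.
        byte[] id = fileIdFrom("customer-42", "invoice-2024-0001");
        String hex = toPdfHexString(id);
        System.out.println("/ID [" + hex + " " + hex + "]");
    }
}
```

The same identifiers always yield the same ID, so the value is available
before the first byte of the response arrives.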

It should be possible to wrap the HTTP response body InputStream with a
custom CipherInputStream that begins encrypting the document as soon as
bytes start arriving in a buffer. In addition, the CipherInputStream would
need to detect and perform two things towards the end of the response
stream: replace the file ID bytes with my own, and replace the access
permission integer with my desired value.
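
The desired permission integer can likewise be computed up front. A
minimal sketch, assuming the standard security handler at revision 3 or
later and the bit layout given in the PDF 1.7 spec's user access
permissions table; the class and constant names are hypothetical.

```java
final class PermissionSketch {

    // Bit positions are 1-based in the spec; bit n has the value 1 << (n - 1).
    static final int PRINT = 1 << 2;                  // bit 3
    static final int MODIFY = 1 << 3;                 // bit 4
    static final int COPY = 1 << 4;                   // bit 5
    static final int ANNOTATE = 1 << 5;               // bit 6
    static final int FILL_FORMS = 1 << 8;             // bit 9
    static final int EXTRACT_ACCESSIBILITY = 1 << 9;  // bit 10
    static final int ASSEMBLE = 1 << 10;              // bit 11
    static final int PRINT_HIGH_QUALITY = 1 << 11;    // bit 12

    // Reserved bits 7-8 and 13-32 are set to 1; bits 1-2 stay 0.
    static final int RESERVED_ONES = 0xFFFFF0C0;

    // Returns the signed 32-bit value for the permissions that are allowed.
    static int permissionValue(int... allowed) {
        int p = RESERVED_ONES;
        for (int flag : allowed) {
            p |= flag;
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(permissionValue());                          // nothing allowed
        System.out.println(permissionValue(PRINT, PRINT_HIGH_QUALITY)); // printing only
    }
}
```

With nothing allowed this gives -3904, and with every flag set it gives
-4, so the value is a fixed constant once the policy is chosen.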

This CipherInputStream can then be provided as input to a JavaMail, FTP,
or HttpClient API and, voilà, I've performed the customization without
ever loading an entire PDF into memory, apart from a small buffer. (I
imagine the buffer length is constrained by the block size of the
encryption.)
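
To make the wrapping idea concrete, here is a rough sketch using the
stock javax.crypto.CipherInputStream. It treats the body as an opaque
byte stream, so it shows only the memory profile of the approach and does
not attempt to produce spec-conformant PDF encryption. The class
`StreamingCipherSketch`, the AES/CBC choice, and the all-zero key and IV
are assumptions made purely for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class StreamingCipherSketch {

    // Wraps any InputStream so that bytes are transformed as they are read.
    static InputStream wrap(InputStream source, int mode, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return new CipherInputStream(source, cipher);
    }

    // Copies with a fixed 4 KiB buffer; stands in for the mail/FTP/HTTP consumer.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Convenience for trying the sketch on in-memory data.
    static byte[] transform(byte[] data, int mode, byte[] key, byte[] iv) {
        try (InputStream in = wrap(new ByteArrayInputStream(data), mode, key, iv)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            copy(in, out);
            return out.toByteArray();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];        // placeholder key, demonstration only
        byte[] iv = new byte[16];         // placeholder IV, demonstration only
        byte[] body = new byte[100_000];  // stands in for the HTTP response body
        byte[] encrypted = transform(body, Cipher.ENCRYPT_MODE, key, iv);
        byte[] decrypted = transform(encrypted, Cipher.DECRYPT_MODE, key, iv);
        System.out.println(Arrays.equals(body, decrypted));
    }
}
```

In real use the ByteArrayInputStream would be the HTTP response stream,
and the wrapped stream would be handed directly to the delivery API, so
only the copy buffer and the cipher's internal block are held in memory.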

My question is -- does this sound feasible? Or is there some non-linearity
in the PDF structure that will force me to load the whole file despite the
modest and well-defined updates I need to make?

Would appreciate any suggestions or advice,

Best,

Ankit
