utils/pdfunite.cc opens each of its input files with

    PDFDoc *doc = new PDFDoc(gfileName, NULL, NULL, NULL);

poppler/PDFDoc.h also provides a constructor that takes a stream instead
of a file name:

    PDFDoc(BaseStream *strA, GooString *ownerPassword = NULL,
           GooString *userPassword = NULL, void *guiDataA = NULL)

poppler/Stream.h provides

    MemStream(char *bufA, Goffset startA, Goffset lengthA, Object &&dictA)

which you could probably use like this (with s being a GooString that
holds the PDF bytes):

    MemStream *mStream = new MemStream(s->getCString(), 0,
                                       s->getLength(), Object(objNull));

So if you are lucky, you can make a MemStream for each in-memory PDF,
then make a PDFDoc for each MemStream, and then cut-and-paste the code in
pdfunite.cc that combines the PDFDoc objects.

Running

    pdfunite <(cat a.pdf) <(cat b.pdf) ab.pdf

from bash fails with

    Syntax Error: Document stream is empty
    Syntax Error: Could not merge damaged documents ('/dev/fd/63')

so PDFDoc might require input that is seekable. If you are using
std::istream and the underlying data comes from a stringstream, it might
work as is; if it comes from an fstream, you might have to read it all
into a buffer first.
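
Reading a stream fully into a buffer needs only the standard library. The
following is a minimal sketch, not poppler code: slurp() and looksLikePdf()
are hypothetical helper names chosen for illustration. It drains any
std::istream, seekable or not, into one contiguous std::string and checks
for the "%PDF-" header.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not part of poppler): read every remaining byte of
// an arbitrary std::istream into one contiguous buffer. istreambuf_iterator
// copies raw characters, so whitespace and embedded NUL bytes are preserved.
std::string slurp(std::istream &in)
{
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: cheap sanity check that the buffer begins with the
// "%PDF-" header. It does not validate the rest of the document.
bool looksLikePdf(const std::string &buf)
{
    return buf.compare(0, 5, "%PDF-") == 0;
}
```

The buffer's data() and size() are the kind of pointer and length that the
MemStream constructor quoted above expects. Presumably the buffer has to
stay alive for as long as the stream built on it is in use, so it is safest
to keep the std::string next to the PDFDoc that reads from it.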
William

________________________________
From: poppler <[email protected]> on behalf of Pierre 
Couderc <[email protected]>
Sent: Saturday, July 17, 2021 5:33 PM
To: [email protected] <[email protected]>
Subject: Re: [poppler] How to "pdfunite" in memory...?

On 7/17/21 8:43 PM, Oliver Sander wrote:
>> I do not understand well your question. But I know that a pdf
>> document contains pages.
>>
>> I have pdf documents in memory (read from a database) and I need to
>> merge these documents in memory to write them back in a database...
>
> You need to give a few more details about what you mean by "I have pdf
> documents in memory".
> Does that mean that you simply copied the file content to some
> allocated memory?  Or have
> you opened these pdf files using poppler (using code like in
> poppler/qt5/demos)?
>
> You need to do the latter to solve your problem.  Open the files using
> poppler,
> and then copy code that unites them from pdfunite.cc (licences
> permitting).
>
> Best,
> Oliver
>
Sorry for not being clear: I upload PDF documents (with a C++ cppcms
server) and get them as std::istream objects. I need to manipulate pages
of these documents, create new documents from these pages, store these
documents in a PostgreSQL bytea column, extract text from them, and
retrieve them when a user asks to download them.

poppler can do the job, but I do not need files for any of that and would
like to avoid using them.

So my question: what is the best strategy?

I have no license problem; everything is open source.

Thank you

PX.


_______________________________________________
poppler mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/poppler
