Hi,

I am not familiar with the PDF specification, but the last comment
before line 755 in pdf.py makes it quite clear what is happening
here.

pyPdf, following the PDF spec, expects a line with a fixed length of
20 bytes at that position. It can additionally tolerate malformed
files where the line is one byte too short. In Olivier's file,
however, the line to be read is 2 bytes too short. This is quite
trivial to fix. The question is whether the upstream developer would
like pyPdf to handle malformed PDF files like this.
Another problem with Olivier's file is that the second field is only 4
bytes long instead of 5. As a result, line[:16] ends with a space and
split(" ") returns a third, empty field, hence the ValueError.
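
As a quick illustration, here is a small sketch of what the slice and
split produce. It assumes Python 3; the malformed line is copied from
the debug output quoted below, and the well-formed line is a made-up
counterpart for comparison:

```python
# 20 bytes as read from the malformed file (taken from the quoted debug output).
malformed = "0000000015 0000 n\n00"

# Assumed well-formed counterpart: 10-digit offset, 5-digit generation,
# the keyword, then a two-byte end of line (space + newline).
well_formed = "0000000015 00000 n \n"

malformed_fields = malformed[:16].split(" ")
well_formed_fields = well_formed[:16].split(" ")

print(malformed_fields)    # ['0000000015', '0000', '']
print(well_formed_fields)  # ['0000000015', '00000']
```

The third, empty field is what makes the two-name unpacking fail with
"too many values to unpack".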

@Olivier,
try replacing:

>>if line[-1] in "0123456789t":
>>    stream.seek(-1, 1)
>>offset, generation = line[:16].split(" ")

with:

>>for c in line[-2:]:
>>    if c in "0123456789t":
>>        stream.seek(-1, 1)
>>offset, generation = line[:16].split(" ")[:2]
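
To try the replacement outside of pyPdf, here is a minimal sketch on an
in-memory stream. It assumes Python 3 and bytes (pyPdf itself works on
Python 2 strings here); read_xref_entry and the sample data are made up
for this test and are not part of pyPdf:

```python
import io


def read_xref_entry(stream):
    """Hypothetical helper wrapping the suggested replacement."""
    line = stream.read(20)
    # Every trailing byte that is a digit (start of the next entry) or a
    # "t" (start of "trailer") is handed back to the stream.
    for c in line[-2:]:
        if c in b"0123456789t":
            stream.seek(-1, 1)
    # Keep only the first two fields, so a trailing empty field is ignored.
    offset, generation = line[:16].split(b" ")[:2]
    return int(offset), int(generation)


# Three entries of 18 bytes each, i.e. 2 bytes too short (made-up sample).
data = (b"0000000015 0000 n\n"
        b"0000000120 0000 n\n"
        b"0000000300 0000 n\n")
stream = io.BytesIO(data)

first = read_xref_entry(stream)
second = read_xref_entry(stream)
position = stream.tell()

print(first, second, position)
```

After the two reads the stream sits at the start of the third entry.
One case this does not cover: if a line that is 2 bytes too short is
directly followed by the trailer keyword, the last two bytes are "tr",
only the "t" matches, and the stream is rewound by one byte instead of
two.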

Best regards

Kostas

On Thu, 2011-06-09 at 19:49 +0200, Serafeim Zanikolas wrote:
> Thanks Olivier.
> 
> Konstantinos, would you please check out the report for #628891 (latest
> message shown below)? Is this a known issue with pyPdf? I'm sorry but I don't
> have time to dig into this further.
> 
> On Sun, Jun 05, 2011 at 10:52:28PM +0200, Olivier Berger wrote:
> > Le jeudi 02 juin 2011 à 12:58 +0200, Serafeim Zanikolas a écrit :
> > 
> > > - reproduce the problem
> > > - and report back the printed output
> > 
> > Here's the output, then :
> > $ pdfshuffler 
> > .0000000015 0000 n
> > 00.
> > Traceback (most recent call last):
> >   File "/usr/bin/pdfshuffler", line 417, in choose_export_pdf_name
> >     self.export_to_file(file_out)
> >   File "/usr/bin/pdfshuffler", line 438, in export_to_file
> >     pdfdoc_inp = PdfFileReader(file(pdfdoc.copyname, 'rb'))
> >   File "/usr/lib/pymodules/python2.6/pyPdf/pdf.py", line 374, in __init__
> >     self.read(stream)
> >   File "/usr/lib/pymodules/python2.6/pyPdf/pdf.py", line 755, in read
> >     offset, generation = line[:16].split(" ")
> > ValueError: too many values to unpack
> > 
> > Hope this helps.
> > 
> > Best regards,
> > -- 
> > Olivier BERGER <olivier.ber...@it-sudparis.eu>
> > http://www-public.it-sudparis.eu/~berger_o/ - OpenPGP-Id: 2048R/5819D7E8
> > Ingénieur Recherche - Dept INF
> > Institut TELECOM, SudParis (http://www.it-sudparis.eu/), Evry (France)
> > 
> > 
> 