Hi Peter,

I've checked all critical locations in org.apache.pdfbox.filter.FlateFilter and provided a patch.
Thank you for your help.

BR
Andreas

[email protected] wrote:
> I forgot to add the number of bytes available in the variable mayRead to
> the while statement, in the earlier message. Version 2 is below.
>
> int mayRead=compressedData.available(); // pjl
> while ((mayRead > 0 &&
>     (amountRead = decompressor.read(buffer, 0,
>         Math.min(mayRead,BUFFER_SIZE))) != -1))
>
> -----Original Message-----
> From: Lenahan, Peter
> Sent: Friday, January 16, 2009 10:26 AM
> To: [email protected]
> Subject: RE: java.io.EOFException: Unexpected end of ZLIB input stream
> error message on UNIX box
>
> I did a Google search on your issue. There are a couple of solutions.
> InflaterInputStream read Unexpected end of ZLIB
> It came up with: Results 1 - 10 of about 854
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4040920
>
> Work Around
> The workaround is to never attempt to read more bytes than the entry
> contains. Call ZipEntry.getSize() to get the actual size of the entry,
> then use this value to keep track of the number of bytes remaining in
> the entry while reading from it. To take the previous example:
>
> This code change may solve the issue for PDFBox.
>
> at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
> Add the Math.min() to reduce the number of bytes you are trying to read.
>
> int mayRead=compressedData.available();
> while ((amountRead = decompressor.read(buffer, 0,
>     Math.min(mayRead,BUFFER_SIZE))) != -1)
>
> I found another potential issue like this with a solution on the Sun
> site. It was described using Windows, but the same could happen on UNIX.
> It suggests that the issue could happen if you are running several
> processes against the same directory. Please look this over to see if
> this is the problem. Are you running multiple processes to accomplish
> the job faster?
>
> http://forums.sun.com/thread.jspa?threadID=5316308
>
> paul.miner
> Posts:2,639
> Registered: 10/8/07
> Re: Unexpected end of ZLIB input stream error while compiling
> Jul 22, 2008 6:54 AM (reply 1 of 2) (In reply to original post )
>
> koko191 wrote:
> Main batch :
> start /B %SWIFT_LOCAL_HOME%\scripts\rmicAll.bat
> start /B %SWIFT_LOCAL_HOME%\scripts\create_jar.bat
>
> The "start" command does not wait for the command to finish, so both
> those batch files would be running in parallel. If they both work on the
> same jar, this could be a problem.
>
> If you want to run the batch files in sequence, use "call".
>
> -----Original Message-----
> From: Balasubramaniam, Balaji
> [mailto:[email protected]]
> Sent: Tuesday, January 13, 2009 7:05 PM
> To: [email protected]
> Subject: java.io.EOFException: Unexpected end of ZLIB input stream error
> message on UNIX box
>
> Hello,
>
> I'm trying to use PdfBox to identify whether a PDF file is corrupted or
> not. We are trying to automate a process that loops through a given
> folder and counts how many of the PDF files are corrupted. This program
> works fine in a Windows XP environment (OS Version: x86 Windows XP 5.1,
> Java version: Java HotSpot(tm) Client VM 1.5.0-15-b04). When we ran this
> application on a UNIX box (OS Version: PA_RISC2.0 HP-UX B.11.23, Java
> Version: Java HotSpot(tm) Client VM 1.5.0.11 jinteg:11.07.07-09:52
> PA2.0(aCC_AP)) it throws the following error.
>
> NOTE: This error does not happen all the time. It is thrown only for
> some of the PDF files. Those PDF files are not corrupted; I can open
> them manually and they open fine.
>
> java.io.EOFException: Unexpected end of ZLIB input stream
>     at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:216)
>     at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:134)
>     at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
>     at org.pdfbox.cos.COSStream.doDecode(COSStream.java:290)
>     at org.pdfbox.cos.COSStream.doDecode(COSStream.java:235)
>     at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:170)
>     at org.pdfbox.pdmodel.common.COSStreamArray.getUnfilteredStream(COSStreamArray.java:200)
>     at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:101)
>     at ProcessDefinitions.RunAuditProcess.RunAuditProcessGenerateAuditLogMessage.invoke(RunAuditProcessGenerateAuditLogMessage.java:212)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:585)
>     at com.tibco.plugin.java.JavaActivity.eval(JavaActivity.java:383)
>     at com.tibco.pe.plugin.Activity.eval(Activity.java:209)
>     at com.tibco.pe.core.TaskImpl.eval(TaskImpl.java:540)
>     at com.tibco.pe.core.Job.a(Job.java:712)
>     at com.tibco.pe.core.Job.k(Job.java:501)
>     at com.tibco.pe.core.JobDispatcher$JobCourier.a(JobDispatcher.java:249)
>     at com.tibco.pe.core.JobDispatcher$JobCourier.run(JobDispatcher.java:200)
>
> Sample code snippet I use to do the task.
>
> PDDocument document = PDDocument.load(<input stream>);
> List pages = document.getDocumentCatalog().getAllPages();
> for (int i = 0; pages != null && i < pages.size(); i++) {
>     PDPage page = (PDPage)pages.get(i);
>     PDStream contents = page.getContents();
>     PDFStreamParser parser = null;
>     try {
>         parser = new PDFStreamParser(contents.getStream());
>     } catch(Exception e) {
>         System.err.println("This PDF cannot be read. Most possibly it could be corrupted. " + pdfFileName);
>     }
> }
>
> Could somebody shed some light on this one?
>
> Thank you.

-- 
The packaging said "requires Windows 9x/2000/XP or BETTER", so I installed Linux.
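
For anyone who wants to reproduce the exception outside PDFBox, here is a minimal standalone sketch. It is not the FlateFilter patch mentioned at the top of this message, and it is not the Math.min() workaround quoted from Peter. It shows a different, swapped-in approach: catch the EOFException and keep whatever was decoded before the stream ended. The names FlateReadSketch, inflateTolerant and deflate are hypothetical, only java.util.zip is used, and the BUFFER_SIZE value of 2048 is an assumption (the thread names the constant but not its value).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class for illustration; not part of PDFBox.
final class FlateReadSketch {

    // Assumed value; the thread only mentions the constant's name.
    private static final int BUFFER_SIZE = 2048;

    // Inflates a zlib stream. If the input ends early, the bytes decoded
    // so far are returned instead of propagating the EOFException.
    // Closing the InflaterInputStream also closes compressedData.
    static byte[] inflateTolerant(InputStream compressedData) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] buffer = new byte[BUFFER_SIZE];
        try (InflaterInputStream decompressor = new InflaterInputStream(compressedData)) {
            try {
                int amountRead;
                while ((amountRead = decompressor.read(buffer, 0, BUFFER_SIZE)) != -1) {
                    result.write(buffer, 0, amountRead);
                }
            } catch (EOFException e) {
                // "Unexpected end of ZLIB input stream": keep the partial output.
            }
        }
        return result.toByteArray();
    }

    // Compresses data into a zlib stream, only needed to build test input.
    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(data);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            text.append("line ").append(i).append('\n');
        }
        byte[] original = text.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = deflate(original);

        // Simulate a stream that ends too early by dropping its second half.
        byte[] truncated = Arrays.copyOf(compressed, compressed.length / 2);

        byte[] full = inflateTolerant(new ByteArrayInputStream(compressed));
        byte[] partial = inflateTolerant(new ByteArrayInputStream(truncated));
        System.out.println("original bytes: " + original.length);
        System.out.println("decoded from complete stream: " + full.length);
        System.out.println("decoded from truncated stream: " + partial.length);
    }
}
```

With a truncated input, inflateTolerant returns a prefix of the original data rather than throwing. Whether a partial result is acceptable depends on the caller; a corruption check like the one Balaji describes may prefer to let the exception through.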
