We are converting PDF files into images. We do this by splitting a single PDF file into several PDDocument instances, one per page, and converting them in parallel.
I found that pages with more objects take much longer to process (see the logs below; times are in seconds). I cannot share the test file for now; I will need to ask for permission first. Is there a way to make it faster?

I also see the following log messages for the pages that take longer to process:

Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: Pattern surface is too large, will be clipped
Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: width: 4405.8223, height: -4405.8223
Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: XStep: 1707.63, YStep: 1707.63
Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: bbox: [-54.8253,-217.611,1652.8,1490.02]
Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: pattern matrix: [2.58008,0.0,0.0,-2.58008,0.0,540.0]
Feb 04, 2021 5:39:20 PM org.apache.pdfbox.rendering.TilingPaint getAnchorRect
INFO: concatenated matrix: [2.58008,0.0,0.0,-2.58008,0.0,540.0]

Logs showing the object count and processing duration per page for the file with PDFBox:

[main] INFO doc.DocumentProcessorUtils - page 0 has 20 objs.
[main] INFO doc.DocumentProcessorUtils - page 1 has 24 objs.
[main] INFO doc.DocumentProcessorUtils - page 2 has 176 objs.
[main] INFO doc.DocumentProcessorUtils - page 3 has 21 objs.
[main] INFO doc.DocumentProcessorUtils - page 4 has 26 objs.
[main] INFO doc.DocumentProcessorUtils - page 5 has 21 objs.
[main] INFO doc.DocumentProcessorUtils - page 6 has 138 objs.
[main] INFO doc.DocumentProcessorUtils - page 7 has 33 objs.
[main] INFO doc.DocumentProcessorUtils - page 8 has 22 objs.
[main] INFO doc.DocumentProcessorUtils - page 9 has 26 objs.
[main] INFO doc.DocumentProcessorUtils - page 10 has 52 objs.
[ForkJoinPool.commonPool-worker-10] INFO doc.Pdf2Image - Page 3 takes 0.803.
[ForkJoinPool.commonPool-worker-13] INFO doc.Pdf2Image - Page 8 takes 0.805.
[ForkJoinPool.commonPool-worker-8] INFO doc.Pdf2Image - Page 4 takes 0.822.
[ForkJoinPool.commonPool-worker-15] INFO doc.Pdf2Image - Page 0 takes 0.852.
[ForkJoinPool.commonPool-worker-11] INFO doc.Pdf2Image - Page 5 takes 0.892.
[ForkJoinPool.commonPool-worker-4] INFO doc.Pdf2Image - Page 1 takes 0.901.
[ForkJoinPool.commonPool-worker-6] INFO doc.Pdf2Image - Page 7 takes 0.962.
[ForkJoinPool.commonPool-worker-2] INFO doc.Pdf2Image - Page 9 takes 1.075.
[ForkJoinPool.commonPool-worker-1] INFO doc.Pdf2Image - Page 10 takes 73.145.
[ForkJoinPool.commonPool-worker-9] INFO doc.Pdf2Image - Page 2 takes 201.11.
[main] INFO doc.Pdf2Image - Page 6 takes 202.048.

I also tried ImageMagick for the same conversion at the same DPI. It is much faster for the pages with more objects, although a bit slower than PDFBox for the other pages:

[main] INFO doc.DocumentProcessorUtils - page 0 has 20 objs.
[main] INFO doc.DocumentProcessorUtils - page 1 has 24 objs.
[main] INFO doc.DocumentProcessorUtils - page 2 has 176 objs.
[main] INFO doc.DocumentProcessorUtils - page 3 has 21 objs.
[main] INFO doc.DocumentProcessorUtils - page 4 has 26 objs.
[main] INFO doc.DocumentProcessorUtils - page 5 has 21 objs.
[main] INFO doc.DocumentProcessorUtils - page 6 has 138 objs.
[main] INFO doc.DocumentProcessorUtils - page 7 has 33 objs.
[main] INFO doc.DocumentProcessorUtils - page 8 has 22 objs.
[main] INFO doc.DocumentProcessorUtils - page 9 has 26 objs.
[main] INFO doc.DocumentProcessorUtils - page 10 has 52 objs.
[ForkJoinPool.commonPool-worker-2] INFO doc.ProcessDoc - Page 9 takes 1.684.
[ForkJoinPool.commonPool-worker-11] INFO doc.ProcessDoc - Page 1 takes 2.081.
[ForkJoinPool.commonPool-worker-8] INFO doc.ProcessDoc - Page 5 takes 2.095.
[ForkJoinPool.commonPool-worker-4] INFO doc.ProcessDoc - Page 8 takes 2.208.
[ForkJoinPool.commonPool-worker-15] INFO doc.ProcessDoc - Page 7 takes 2.336.
[ForkJoinPool.commonPool-worker-10] INFO doc.ProcessDoc - Page 3 takes 2.443.
[ForkJoinPool.commonPool-worker-13] INFO doc.ProcessDoc - Page 4 takes 2.485.
[ForkJoinPool.commonPool-worker-6] INFO doc.ProcessDoc - Page 0 takes 3.722.
[ForkJoinPool.commonPool-worker-1] INFO doc.ProcessDoc - Page 10 takes 3.765.
[main] INFO doc.ProcessDoc - Page 6 takes 4.479.
[ForkJoinPool.commonPool-worker-9] INFO doc.ProcessDoc - Page 2 takes 4.51.
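
For context, the per-page timings above come from running one task per page in parallel and measuring each task separately. The following is only a minimal sketch of that measurement pattern, not the actual project code: the class name PageTimer, the method timePages, and the renderPage callback are hypothetical, and the use of a parallel stream is an assumption (it is consistent with the ForkJoinPool.commonPool and main thread names in the logs, but the real code may differ).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;

/** Hypothetical helper (not part of PDFBox): times a per-page task run in parallel. */
final class PageTimer {
    private PageTimer() {}

    /**
     * Runs renderPage once for each page index in [0, pageCount) using a
     * parallel stream, and returns the elapsed wall-clock seconds per page.
     */
    static Map<Integer, Double> timePages(int pageCount, IntConsumer renderPage) {
        // Thread-safe map, because several pages finish concurrently.
        Map<Integer, Double> secondsByPage = new ConcurrentHashMap<>();
        IntStream.range(0, pageCount).parallel().forEach(page -> {
            long start = System.nanoTime();
            renderPage.accept(page); // placeholder for the real rendering call
            double seconds = (System.nanoTime() - start) / 1e9;
            secondsByPage.put(page, seconds);
            System.out.printf("Page %d takes %.3f.%n", page, seconds);
        });
        return secondsByPage;
    }
}
```

In the real job the callback would wrap the rendering of a single page (for PDFBox, for example, a call such as PDFRenderer.renderImageWithDPI on that page's document). Measuring each page independently like this is what makes the outliers visible: the total wall-clock time of the parallel run is bounded below by the slowest page, here pages 2 and 6.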

