Package: gscan2pdf
Version: 1.0.0-1
Severity: important
Tags: patch

Exporting OCRopus OCR text to a PDF is very slow, produces a very large document, and consumes a very large amount of memory during and after the export.
That is because the fonts are added to the document multiple times, once for every text box. :-(

Add each font only once, at the beginning of the export, as in the attached patch. For a 50-page document this makes the export about 30 times faster and the output more than 10 times smaller. There is no longer a noticeable memory problem, though I suspect one still exists.

--- /usr/share/perl5/Gscan2pdf.pm	2011-08-27 07:00:41.000000000 +0200
+++ /usr/share/perl5/Gscan2pdf.pm	2011-10-22 23:56:43.420286711 +0200
@@ -434,11 +434,15 @@
     my ( $self, $path, $list_of_pages, $metadata, $options, $pidfile ) = @_;
 
     my $page = 0;
+    my %fonthash = ();
 
     # Create PDF with PDF::API2
     $self->{message} = $d->get('Setting up PDF');
     my $pdf = PDF::API2->new( -file => $path );
     $pdf->info($metadata) if defined($metadata);
+
+    $fonthash{ $options->{font} } = $pdf->ttfont( $options->{font}, -unicodemap => 1 );
+    $fonthash{ 'Times-Roman' } = $pdf->corefont('Times-Roman');
 
     foreach my $pagedata ( @{$list_of_pages} ) {
         ++$page;
@@ -578,10 +582,10 @@
             for my $box ( $pagedata->boxes ) {
                 my ( $x1, $y1, $x2, $y2, $txt ) = @$box;
                 if ( $txt =~ /[[:^ascii:]]/ and defined( $options->{font} ) ) {
-                    $font = $pdf->ttfont( $options->{font}, -unicodemap => 1 );
+                    $font = $fonthash{ $options->{font} };
                 }
                 else {
-                    $font = $pdf->corefont('Times-Roman');
+                    $font = $fonthash{'Times-Roman'};
                 }
                 ( $x2, $y2 ) = ( $w * $resolution, $h * $resolution )
                   if ( $x1 == 0 and $y1 == 0 and not defined($x2) );
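
For illustration only, the caching idea behind the patch can be sketched in isolation. This is not the code in Gscan2pdf.pm: `load_font` is a hypothetical stand-in for an expensive call such as `$pdf->ttfont(...)` or `$pdf->corefont(...)`, and `get_font`, `%font_cache` and the font name 'user-ttf' are made-up names. The sketch loads each font lazily on first use instead of up front, which is a variation on the patch rather than what the patch does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive font load such as
# $pdf->ttfont(...) or $pdf->corefont(...); it only counts calls.
my $loads = 0;

sub load_font {
    my ($name) = @_;
    ++$loads;
    return { name => $name };
}

# Cache keyed by font name: each font is loaded on first use only.
my %font_cache;

sub get_font {
    my ($name) = @_;
    $font_cache{$name} //= load_font($name);
    return $font_cache{$name};
}

# Simulate 1000 text boxes alternating between two fonts.
for my $i ( 1 .. 1000 ) {
    my $name = $i % 2 ? 'Times-Roman' : 'user-ttf';
    my $font = get_font($name);
}

print "font loads: $loads\n";    # prints "font loads: 2"
```

With a lazy cache like this, a font that is never requested is never loaded, so an unset font option would not trigger a load at all. Whether that matters for gscan2pdf depends on how `$options->{font}` is populated, which this report does not show.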