Package: mhonarc
Version: 2.6.10-1
Severity: normal

On my system (512 MB RAM and 1300 MB swap) mhonarc fails
with an "out of memory" error. The mailbox being processed is
approximately 30 MB.

The "s{}{}" operator at /usr/share/mhonarc/MHonArc/Char.pm:86 needs a huge
amount of memory.
(See also http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=332627)

A small example demonstrates it.
test1.pl, with the original code:
-----------------------------------------------------------------------
$iso1=require "/usr/share/mhonarc/MHonArc/CharEnt/ISO8859_1.pm";
$alt=require "/usr/share/mhonarc/MHonArc/CharEnt/CP866.pm";
my @maps=();
push @maps,$iso1;
push @maps,$alt;

$data="A BCD12\x80\x90\xA0\xB0\xC0\xD0\xE0\xF0\n" x 300000;
print "Letter size: ",int(length($data)/1024)," Kb\n";
$data_r=\$data;

{
# /usr/share/mhonarc/MHonArc/Char.pm:86
    # Single byte charset
    my($map, $char);
    $$data_r =~ s{
        ([\x00-\xFF])
    }{
        foreach $map (@maps) {
            $char = $map->{$1};
            last  if defined($char);
        }
        unless (defined($char)) {
            $char = (ord($1) <= 0x7F) ? $1 : '?';
        }
        $char;
    }gxe;
}

print "Result size: ",int(length($data)/1024)," Kb\n";
print "Memory usage: ",`grep ^VmSize /proc/$$/status`;
-----------------------------------------------------------------------

test2.pl, the workaround code, which gives the same processing result:
-----------------------------------------------------------------------
$iso1=require "/usr/share/mhonarc/MHonArc/CharEnt/ISO8859_1.pm";
$alt=require "/usr/share/mhonarc/MHonArc/CharEnt/CP866.pm";
my @maps=();
push @maps,$iso1;
push @maps,$alt;

$data="A BCD12\x80\x90\xA0\xB0\xC0\xD0\xE0\xF0\n" x 300000;
print "Letter size: ",int(length($data)/1024)," Kb\n";
$data_r=\$data;

{
    my($map,$char,$code,%summap,$summap);
    for($code=0x00; $code<=0xFF; $code++) {
        foreach $map (@maps) {
            $char = $map->{chr($code)};
            last  if defined($char);
        }
        unless(defined($char)) {
            next if($code <= 0x7F);
            $char = '?';
        }
        $summap{chr($code)} = $char;
        $summap .= chr($code);
    }
    $$data_r =~ s/([$summap])/$summap{$1}/g;
}

print "Result size: ",int(length($data)/1024)," Kb\n";
print "Memory usage: ",`grep ^VmSize /proc/$$/status`;
-----------------------------------------------------------------------
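
The workaround interpolates the raw bytes in $summap into a character
class. If one of the maps ever had an entry for "]", "\", "^" or "-", the
class could be misparsed. Whether the real CharEnt maps contain such keys
was not checked here; the sketch below uses a made-up %table and a
hypothetical convert() helper only to show how quotemeta() keeps such
bytes literal.

```perl
use strict;
use warnings;

# Hypothetical table; these keys are chosen only because they are
# special inside a bracketed character class.
my %table = (
    'a'  => 'A',
    ']'  => '&#93;',
    '-'  => '&#45;',
    '^'  => '&#94;',
    '\\' => '&#92;',
);

# Escape every key before joining the keys into the class body.
my $class = join '', map { quotemeta($_) } sort keys %table;

# Replace every byte that has a table entry; other bytes pass through.
sub convert {
    my ($data) = @_;
    $data =~ s/([$class])/$table{$1}/g;
    return $data;
}

print convert('a]-^\\z'), "\n";
```

In test2.pl the same effect would come from appending
quotemeta(chr($code)) to $summap instead of chr($code).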

Compare time and memory usage:
$ time perl test1.pl
Letter size: 4687 Kb
Result size: 16992 Kb
Memory usage: VmSize:     336932 kB
real    0m44.463s user    0m42.034s sys     0m2.404s

$ time perl test2.pl
Letter size: 4687 Kb
Result size: 16992 Kb
Memory usage: VmSize:      29552 kB
real    0m5.759s  user    0m5.232s  sys     0m0.523s

That is about 11 times less memory and about 8 times less CPU time.
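
For reference, the ratios can be recomputed from the figures printed
above. This is only arithmetic on the numbers already shown; the hash
and sub names are made up for the sketch.

```perl
use strict;
use warnings;

# Figures copied from the two runs above.
my %test1 = (vmsize_kb => 336932, real_s => 44.463, user_s => 42.034);
my %test2 = (vmsize_kb => 29552,  real_s => 5.759,  user_s => 5.232);

# How many times larger the test1.pl figure is than the test2.pl figure.
sub ratio {
    my ($key) = @_;
    return $test1{$key} / $test2{$key};
}

printf "%-10s %.1fx\n", $_, ratio($_) for qw(vmsize_kb real_s user_s);
```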

Patch for ONE of the many places where the s{}{} operator is used:
--- Char.pm.orig        2004-05-17 21:03:42.000000000 -0400
+++ Char.pm     2005-10-07 10:50:38.789442720 -0400
@@ -84,19 +84,20 @@
     }

     # Single byte charset
-    my($map, $char);
-    $$data_r =~ s{
-       ([\x00-\xFF])
-    }{
+    my($map,$char,$code,%summap,$summap);
+    for($code=0x00; $code<=0xFF; $code++) {
        foreach $map (@maps) {
-           $char = $map->{$1};
+           $char = $map->{chr($code)};
            last  if defined($char);
        }
-       unless (defined($char)) {
-           $char = (ord($1) <= 0x7F) ? $1 : '?';
+       unless(defined($char)) {
+           next if($code <= 0x7F);
+           $char = '?';
        }
-       $char;
-    }gxe;
+       $summap{chr($code)} = $char;
+       $summap .= chr($code);
+    }
+    $$data_r =~ s/([$summap])/$summap{$1}/gxe;
     $$data_r;
 }
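
A quick way to check that the patched code gives the same result as the
original is to run both over every possible byte value and compare. The
sketch below does that with two small made-up maps in place of the real
MHonArc/CharEnt tables; convert_old() and convert_new() are hypothetical
wrappers around the code before and after the patch.

```perl
use strict;
use warnings;

# Hypothetical maps; the real ones come from MHonArc/CharEnt/*.pm.
my @maps = (
    { "\xA0" => '&nbsp;' },
    { "\xC0" => '&#1040;', '<' => '&lt;' },
);

# Per-character callback, as in the unpatched code.
sub convert_old {
    my ($data) = @_;
    my ($map, $char);
    $data =~ s{
        ([\x00-\xFF])
    }{
        foreach $map (@maps) {
            $char = $map->{$1};
            last  if defined($char);
        }
        unless (defined($char)) {
            $char = (ord($1) <= 0x7F) ? $1 : '?';
        }
        $char;
    }gxe;
    return $data;
}

# Table built once per call, as in the patched code.
sub convert_new {
    my ($data) = @_;
    my ($map, $char, $code, %summap, $summap);
    $summap = '';
    for ($code = 0x00; $code <= 0xFF; $code++) {
        foreach $map (@maps) {
            $char = $map->{chr($code)};
            last  if defined($char);
        }
        unless (defined($char)) {
            next if ($code <= 0x7F);   # unmapped ASCII stays as it is
            $char = '?';               # unmapped high byte
        }
        $summap{chr($code)} = $char;
        $summap .= chr($code);
    }
    $data =~ s/([$summap])/$summap{$1}/g;
    return $data;
}

# One string containing every byte value exactly once.
my $input = join '', map { chr($_) } 0x00 .. 0xFF;
print convert_old($input) eq convert_new($input) ? "identical\n" : "DIFFERENT\n";
```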

On a real mailbox:
With the patch, user time is 1m36; without it, 2m23.
The first run below is without the patch, the second with it.

This is MHonArc v2.6.10, Perl 5.008004 linux
Converting messages to /var/www/mail/support/200410
Reading mbox.arc.200410 ..............................................
Writing mail .........................................................
Writing database ...
1765 new messages
1765 total messages

real    5m17.047s         user    2m23.640s        sys     0m6.090s
31222 guest      14   0 91620  89M  1632 R    41.6 17.7   2:29 perl


This is MHonArc v2.6.10, Perl 5.008004 linux
Converting messages to /var/www/mail/support/200410
Reading mbox.arc.200410 ...............................................
Writing mail ..........................................................
Writing database ...
1765 new messages
1765 total messages

real    4m59.295s          user    1m36.150s       sys     0m5.940s
32535 guest      15   0 89568  87M  1628 R    34.5 17.3   1:41 perl




-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.8-2-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages mhonarc depends on:
ii  perl                          5.8.4-8    Larry Wall's Practical Extraction 

-- no debconf information

