Package: libmail-mboxparser-perl
Version: 0.55-1
Severity: normal

Dear libmail-mboxparser-perl maintainer,

Mail::MboxParser::Mail->header returns incorrect results for
headers where the colon is followed by something other than a
space. Consider the following mailbox:

------------------------------------------------------------------------
From [EMAIL PROTECTED] Sat Dec 22 12:03:25 CET 2007
From: [EMAIL PROTECTED]
Date: Sat, 22 Dec 2007 14:41:15 +0100
Header-Name:header-body-that-does-not-begin-with-a-space
Header-Name:	header-body-that-begins-with-a-tab

Body
------------------------------------------------------------------------

Running this code on it:

------------------------------------------------------------------------
#!/usr/bin/perl -w
use strict;
use Mail::MboxParser;

sub quote ($)
{
  my $result = $_[0];
  $result =~ s{(.)}{ord $1 < 0x20 || ord $1 > 0x7e || $1 eq '%'
    ? sprintf '%%%02X', ord $1 : $1}eg;
  return $result;
}

my $mbp = Mail::MboxParser->new ($ARGV[0]);
for (my $mnum = 1; my $msg = $mbp->next_message; $mnum++)
{
  my $headers = $msg->header;
  foreach my $hname (sort keys %$headers)
  {
    my @values = ($headers->{$hname});
    @values = @{$values[0]} if (ref $values[0]);
    foreach my $hval (@values)
    {
      print "message $mnum, name \"", quote ($hname),
        "\", value \"", quote ($hval), "\"\n";
    }
  }
  print "\n";
}
------------------------------------------------------------------------

produces:

------------------------------------------------------------------------
message 1, name "", value ""
message 1, name "date", value "Sat, 22 Dec 2007 14:41:15 +0100"
message 1, name "from", value "[EMAIL PROTECTED]"
message 1, name "header-name:%09header-body-that-begins-with-a-ta", value ""
message 1, name "header-name:header-body-that-does-not-begin-with-a-spac", value ""
------------------------------------------------------------------------

In the 4th and 5th lines, the header body is shortened by one
character and folded into its name.

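For reference, here is a minimal sketch of splitting a header field
at its first colon, which is how RFC 2822 section 2.2 delimits the
field name from the field body. It is not Mail::MboxParser's code.
The sub name split_header_field is hypothetical. Lowercasing the
name and returning it without its colon are assumptions that only
mirror the "date" and "from" entries in the output above. Stripping
leading white space from the body is likewise optional. Folded
(multi-line) headers are not handled.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper, not part of Mail::MboxParser.
# Splits one unfolded header line into (name, body) at the first colon.
# RFC 2822 section 2.2: a field name is made of printable US-ASCII
# characters (33..126) except the colon (58).
sub split_header_field
{
  my ($line) = @_;
  # No white space is required after the colon.
  return unless $line =~ /^([\x21-\x39\x3b-\x7e]+):(.*)$/s;
  my ($name, $body) = (lc $1, $2);   # lowercasing is an assumption
  $body =~ s/^[ \t]+//;              # optional: drop leading white space
  $body =~ s/\r?\n\z//;              # drop the line terminator, if any
  return ($name, $body);
}

foreach my $line ("Header-Name:header-body-that-does-not-begin-with-a-space",
                  "Header-Name:\theader-body-that-begins-with-a-tab")
{
  my ($name, $body) = split_header_field ($line);
  print "name \"$name\", value \"$body\"\n";
}
```

With this split, the last character of the body is never lost,
whatever follows the colon.
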
My understanding of RFC 2822 is that the 5th line should be:

------------------------------------------------------------------------
message 1, name "header-name:", value "header-body-that-does-not-begin-with-a-space"
------------------------------------------------------------------------

and the 4th line should be either:

------------------------------------------------------------------------
message 1, name "header-name:", value "%09header-body-that-begins-with-a-tab"
------------------------------------------------------------------------

or, to be consistent with the removal of leading white space
elsewhere:

------------------------------------------------------------------------
message 1, name "header-name:", value "header-body-that-begins-with-a-tab"
------------------------------------------------------------------------

This bug is more serious than the "normal" severity would
suggest. Headers where the colon is not followed by a space
occur often enough in the real world that a mailbox of any size
is almost guaranteed to contain some. In effect, this module is
nearly unusable.

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Locale: LANG=C, LC_CTYPE=en_US (charmap=ISO-8859-1)

Versions of packages libmail-mboxparser-perl depends on:
ii  libmime-perl                  5.420-0.1  Perl5 modules for MIME-compliant m
ii  perl                          5.8.8-7    Larry Wall's Practical Extraction
ii  perl-modules [libfile-temp-pe 5.8.8-7    Core Perl modules

Versions of packages libmail-mboxparser-perl recommends:
ii  libmail-mbox-messageparser-pe 1.4005-1   fast and simple mbox folder reader

-- no debconf information

-- 
André Majorel <http://www.teaser.fr/~amajorel/>
lists.debian.org, a spammer's delight.