On Mon, Jun 17, 2013 at 08:35:28PM +0200, gregor herrmann wrote:
> Control: tag -1 + patch
>
> On Thu, 06 Jun 2013 22:45:23 +0100, Dominic Hargreaves wrote:
>
> > Strings with code points over 0xFF may not be mapped into in-memory file handles
> > readline() on closed filehandle $in at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 255.
> > Use of uninitialized value in subroutine entry at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 258.
> > Use of uninitialized value in concatenation (.) or string at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 276.
> > Can't guess encoding of at /build/dom-libhtml-copy-perl_1.30-1-i386-fEvCSD/libhtml-copy-perl-1.30/blib/lib/HTML/Copy.pm line 276.
> > # Looks like you planned 16 tests but ran 6.
> > # Looks like your test exited with 255 just after 6.
> > t/parse.t ....
> > Dubious, test returned 255 (wstat 65280, 0xff00)
> > Failed 1/2 test programs. 0/7 subtests failed.
> > Failed 10/16 subtests
>
> "Strings with code points over 0xFF may not be mapped into in-memory file
> handles"
> happens t/parse.t, line 181:
>   open my $in, "<", \$src_html_utf8;
> (where $src_html_utf8 contains HTML with some nice characters (ああ) in
> it).
>
> perldiag says:
>
>   Strings with code points over 0xFF may not be mapped into in-memory file
>   handles
>
>   (W utf8) You tried to open a reference to a scalar for read or
>   append where the scalar contained code points over 0xFF.
>   In-memory files model on-disk files and can only contain bytes.
>
>
> Some searching indicates that strategically dropping some
> encode_utf8() in the code might help ... Let's try ...
>
> Ok, here we are:
>
> #v+
> diff --git a/t/parse.t b/t/parse.t
> index 1550268..15eb8c6 100644
> --- a/t/parse.t
> +++ b/t/parse.t
> @@ -6,6 +6,7 @@ use HTML::Copy;
>  use utf8;
>  use File::Spec::Functions;
>  #use Data::Dumper;
> +use Encode qw(encode_utf8 decode_utf8);
>  
>  use Test::More tests => 16;
>  
> @@ -109,7 +110,7 @@ $copy_html = do {
>  ok($copy_html eq $result_html_nocharset, "copy_to no charset shift_jis");
>  
>  ##== HTML with charset uft-8
> -my $src_html_utf8 = <<EOT;
> +my $src_html_utf8 = encode_utf8(<<EOT);
>  <!DOCTYPE html>
>  <html>
>  <head>
> @@ -126,7 +127,7 @@ my $src_html_utf8 = <<EOT;
>  </html>
>  EOT
>  
> -my $result_html_utf8 = <<EOT;
> +my $result_html_utf8 = encode_utf8(<<EOT);
>  <!DOCTYPE html>
>  <html>
>  <head>
> @@ -174,7 +175,7 @@ $copy_html = do {
>      read_and_unlink($destination, $p);
>  };
>  
> -ok($copy_html eq $result_html_utf8, "copy_to giviing a file handle");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing a file handle");
>  
>  ##=== copy_to gving file handles for input and output
>  $copy_html = do {
> @@ -187,7 +188,7 @@ $copy_html = do {
>      Encode::decode($p->encoding, $outdata);
>  };
>  
> -ok($copy_html eq $result_html_utf8, "copy_to giviing file handles for input and output");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing file handles for input and output");
>  
>  ##=== parse_to giving a file handle
>  $copy_html = do {
> @@ -196,7 +197,7 @@ $copy_html = do {
>      $p->parse_to($destination);
>  };
>  
> -ok($copy_html eq $result_html_utf8, "copy_to giviing file handles for input and output");
> +ok($copy_html eq decode_utf8($result_html_utf8), "copy_to giviing file handles for input and output");
>  
>  ##=== copy_to with directory destination
>  $copy_html = do {
> #v-
>
>
> I'm committing this now but some sanity check would be appreciated.

At a glance, this seems sane, but I guess upstream should be given a
chance to comment too (whether before or after you upload the fix to
Debian).

Cheers,
Dominic.
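
For reference, the pattern the patch relies on can be tried in isolation.
What follows is only a minimal sketch, not code from HTML::Copy or its
test suite: the variable names ($chars, $bytes, $ok) and the sample markup
are made up for illustration. It assumes a perl on which opening an
in-memory handle on a string with code points over 0xFF does not succeed,
which is what the "readline() on closed filehandle" message in the log
suggests.

```perl
use strict;
use warnings;
use utf8;                                  # literals below are character strings
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical sample data: a character string with code points above 0xFF.
my $chars = "<p>ああ</p>\n";

# In-memory files can only contain bytes (see the perldiag text quoted
# above), so encode to UTF-8 octets before taking the scalar reference.
my $bytes = encode_utf8($chars);

open my $in, '<', \$bytes or die "cannot open in-memory handle: $!";
my $line = <$in>;                          # reads octets, not characters
close $in;

# Decode again before comparing against the original character string.
my $ok = decode_utf8($line) eq $chars ? 1 : 0;
print $ok ? "round trip ok\n" : "round trip failed\n";
```

This mirrors the two halves of the diff: encode_utf8() on the heredoc that
feeds the in-memory handle, and decode_utf8() where the result is compared
with decoded output.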