Package: perl
Version: 5.8.8-6
Severity: normal

Please excuse the bug title; after working on this for something like 5 hours, I cannot think clearly enough to write a short title describing this very weird bug. Let the code speak for me. I have attached a testcase; untar it and run the "repro" program.

[EMAIL PROTECTED]:~/tmp/repor/testcase>./repro
a
b
Wide character in subroutine entry at /usr/bin/markdown line 360.
zsh: exit 255 ./repro

Now, edit the repro file. There are 4 comments suggesting changes; if you make any one of the changes, the wide character failure disappears. Notice that several of the changes cannot possibly affect anything, yet they do. For example, uncommenting the s/// line should be a null change because $mommy is otherwise utterly unused. But uncommenting that line "fixes" the problem. This smells deeply of a perl bug to me.

I boiled this test case down from several thousand lines of code, dealing with many changes like this that inexplicably hid the problem. I should probably do a similar reduction on markdown and possibly HTML::Scrubber, but it's getting late. Their versions are listed below.

Here's some analysis of what's going on inside markdown when it fails:

<paravoid> watch this:
<paravoid> print 'text is utf: ', utf8::is_utf8($text) ? 'yes' : 'no', "\n";
<paravoid> $text =~ s{
<paravoid>     (                     # save in $1
<paravoid>         ^                 # start of line (with /m)
<paravoid>         <($block_tags_a)  # start tag = $2
<paravoid>         \b                # word break
<paravoid>         (.*\n)*?          # any number of lines, minimally matching
<paravoid>         </\2>             # the matching end tag
<paravoid>         [ \t]*            # trailing spaces/tabs
<paravoid>     )
<paravoid> }{
<paravoid>     print '$1 is utf: ', utf8::is_utf8($1) ? 'yes' : 'no', "\n";
<paravoid>     my $key = md5_hex($1);
<paravoid>     $g_html_blocks{$key} = $1;
<paravoid>     "\n\n" . $key . "\n\n";
<paravoid> }egmx;
<paravoid> I added the two 'prints'
<paravoid> text is utf: no
<paravoid> $1 is utf: yes
<paravoid> that's freaking weird
<paravoid> the utf8 flag gets enabled after the regexp is run

Also note that paravoid had a version (much larger; a small modification to ikiwiki) that reproduced the bug without HTML::Scrubber being loaded.
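
The symptom on its own can be shown outside markdown. What follows is a minimal sketch, not part of the attached testcase, and it rests on an assumption: that the "Wide character in subroutine entry" message comes from md5_hex (Digest::MD5) being handed a string that has the utf8 flag on and contains a character above 255. The variable names and sample strings are made up for illustration. It does not reproduce the bug itself (the flag appearing on $1), only what happens once such a string reaches md5_hex.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical sample data, not taken from the testcase.
my $bytes = "caf\xc3\xa9";      # UTF-8 encoded bytes, utf8 flag off
my $chars = "\x{263A} smile";   # character string, code point above 255

print 'bytes is utf: ', utf8::is_utf8($bytes) ? 'yes' : 'no', "\n";
print 'chars is utf: ', utf8::is_utf8($chars) ? 'yes' : 'no', "\n";

# md5_hex is defined on bytes, so a plain byte string is fine.
my $ok = md5_hex($bytes);
print "digest of bytes: $ok\n";

# Feeding it a string with a wide character croaks; trap the error.
my $err = eval { md5_hex($chars); 1 } ? '' : $@;
print "error: $err";

# Encoding to UTF-8 bytes first avoids the croak.
my $safe = md5_hex(encode_utf8($chars));
print "digest after encode_utf8: $safe\n";
```

Encoding before hashing only works around the symptom; it says nothing about why $1 ends up flagged when $text is not.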
As far as I can guess, the HTML::Scrubber stuff doesn't really have any bearing on the bug and is just one more mysterious thing that hides the bug if it's removed.

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.17-1-686
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

Versions of packages perl depends on:
ii  libc6         2.3.6-15  GNU C Library: Shared libraries
ii  libdb4.4      4.4.20-6  Berkeley v4.4 Database Libraries [
ii  libgdbm3      1.8.3-3   GNU dbm database routines (runtime
ii  perl-base     5.8.8-6   The Pathologically Eclectic Rubbis
ii  perl-modules  5.8.8-6   Core Perl modules

Versions of packages perl recommends:
ii  perl-doc      5.8.8-6   Perl documentation

Other software:
ii  markdown        1.0.1-3  Text-to-HTML conversion tool
ii  libhtml-scrubb  0.08-2   Perl extension for scrubbing/sanitizing html

paravoid reproduced it using a similar test case on a system running sarge with:

<paravoid> ii  perl            5.8.4-8sarge4  Larry Wall's Practical Extraction and Report
<paravoid> ii  markdown        1.0.1-2        Text-to-HTML conversion tool
<paravoid> ii  libhtml-scrubb  0.08-1         Perl extension for scrubbing/sanitizing html

-- 
see shy jo