Bug#722305: discount fails to process Markdown in non-ASCII text correctly

Ivan Shmakov Mon, 09 Sep 2013 22:01:26 -0700

Package: libmarkdown2
Version: 2.1.6-2

        As packaged, Discount appears to fail to process Markdown
        *emphasis* around text fragments that contain non-ASCII
        characters.  Consider, e.g.:


$ cat < test.mdwn 
*Hello, world!*

*Привет, мир!*

*Hello, world!*  *Привет, мир!*  *Hello, world!*

$ markdown < test.mdwn 
<p><em>Hello, world!</em></p>

<p>*Привет, мир!*</p>

<p><em>Hello, world!</em>  *Привет, мир!<em>  </em>Hello, world!*</p>
$ 

        Note that the *emphasis* form consistently fails when used with
        text containing Cyrillic (and thus non-ASCII) characters.
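
        To rule out a mis-encoded input file, the bytes next to the
        asterisks can be inspected directly.  The following is only a
        sketch, assuming a POSIX shell with od(1) and tail(1); the
        file name cyrillic.mdwn is made up for illustration, and the
        octal escapes spell out the UTF-8 encoding of the second
        paragraph of test.mdwn above.

```shell
# Hypothetical file name; the content is "*Привет, мир!*" plus a
# newline, written as octal escapes so that the bytes are unambiguous.
printf '*\320\237\321\200\320\270\320\262\320\265\321\202, \320\274\320\270\321\200!*\n' > cyrillic.mdwn

# First two bytes: the opening asterisk and whatever follows it.
first=$(od -An -tx1 -N2 cyrillic.mdwn | tr -d ' \n')

# Last three bytes: whatever precedes the closing asterisk, the
# asterisk itself, and the trailing newline.
last=$(tail -c 3 cyrillic.mdwn | od -An -tx1 | tr -d ' \n')

echo "first two bytes:  $first"
echo "last three bytes: $last"
```

        For valid UTF-8 input, the opening asterisk (0x2a) is directly
        followed by a byte with the high bit set (the first byte of
        the Cyrillic letter), while the closing asterisk follows a
        plain ASCII exclamation mark, just as in the English case.
        This would suggest, though it does not prove, that only the
        opening delimiter is affected.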

        The Perl Text::Markdown::Discount interface fails similarly:

$ perl  -e 'use common::sense;
            use utf8;
            require Encode;
            require Encode::Locale;
            binmode (STDOUT, ":encoding(locale)");
            require Text::Markdown::Discount;
            my $text
                = ("*Hello, world!*\n\n"
                   . "*\x{41f}\x{440}\x{438}\x{432}\x{435}\x{442},"
                   . " \x{43c}\x{438}\x{440}!*\n\n");
            print (scalar (Text::Markdown::Discount::markdown ($text)));' 
<p><em>Hello, world!</em></p>

<p>*Привет, мир!*</p>
$ 

        I thus assume that the bug is in the libmarkdown2 library,
        underlying both markdown(1) (from the discount package) and
        Text::Markdown::Discount (libtext-markdown-discount-perl).

        Strangely enough, the **strong** form /is/ processed, as is the
        `code` one.

        FWIW, the locale appears to be set correctly:

$ locale 
LANG=ru_RU.UTF-8
LANGUAGE=
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
$ 
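
        Whether the locale matters at all can be checked by feeding
        the same input to markdown(1) under two different locales.
        Again, this is only a sketch: try_locale is a hypothetical
        helper, and no particular output is claimed here.  Identical
        output from both runs would merely suggest that the failure
        does not depend on the locale.

```shell
# try_locale LOCALE COMMAND [ARG...]
# Hypothetical helper: pipe the Cyrillic sample (UTF-8, written as
# octal escapes) into COMMAND with LC_ALL set to LOCALE.
try_locale () {
    loc=$1
    shift
    printf '*\320\237\321\200\320\270\320\262\320\265\321\202, \320\274\320\270\321\200!*\n' \
        | LC_ALL=$loc "$@"
}

# Only run the comparison when markdown(1) is actually installed.
if command -v markdown >/dev/null 2>&1; then
    try_locale C markdown
    try_locale ru_RU.UTF-8 markdown
fi
```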

-- 
FSF associate member #7257      http://sf-day.org/

