Package: lynx-cur
Version: 2.8.9dev1-2

From the utf-8(7) manpage: "The Unicode and UCS standards require that producers of UTF‐8 shall use the shortest form possible, for example, producing a two‐byte sequence with first byte 0xc0 is nonconforming. Unicode 3.1 has added the requirement that conforming programs must not accept non‐shortest forms in their input."
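
To illustrate what "non-shortest form" means, here is a minimal sketch (not lynx code) that flags sequences whose decoded code point is below the minimum for their length. The function name find_overlong is hypothetical, and the sketch deliberately ignores other kinds of malformed input such as surrogates or values above U+10FFFF.

```python
def find_overlong(data: bytes) -> list[int]:
    """Return byte offsets of overlong (non-shortest-form) UTF-8 sequences."""
    offsets = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        # Pick the sequence length, the smallest code point that legitimately
        # needs that length, and the payload bits of the lead byte.
        if b < 0x80:
            length, min_cp, cp = 1, 0x0, b
        elif 0xC0 <= b <= 0xDF:
            length, min_cp, cp = 2, 0x80, b & 0x1F
        elif 0xE0 <= b <= 0xEF:
            length, min_cp, cp = 3, 0x800, b & 0x0F
        elif 0xF0 <= b <= 0xF7:
            length, min_cp, cp = 4, 0x10000, b & 0x07
        else:
            i += 1  # stray continuation or invalid lead byte: out of scope here
            continue
        tail = data[i + 1 : i + length]
        if len(tail) != length - 1 or any(t & 0xC0 != 0x80 for t in tail):
            i += 1  # truncated or malformed sequence: out of scope here
            continue
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)
        if cp < min_cp:
            offsets.append(i)  # a shorter encoding exists, so this is overlong
        i += length
    return offsets
```

For example, find_overlong(b"\xc0\xaf") reports offset 0, because the two bytes decode to U+002F ("/"), which fits in a single byte. A conforming decoder is expected to reject such input; Python's own strict decoder does so, raising UnicodeDecodeError for b"\xc0\xaf".decode("utf-8").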

But lynx happily accepts such overlong sequences:

$ lynx -dump utf8.html
  If you see this, the parser accepts overlong UTF-8 sequences.
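
The contents of the attached utf8.html are not reproduced in this report. As a rough sketch of how a comparable test input could be produced, the hypothetical helper below encodes ASCII text as two-byte overlong sequences (lead byte 0xc0 or 0xc1). That the attachment was built this way is an assumption.

```python
def encode_overlong_2byte(text: str) -> bytes:
    """Encode ASCII text as nonconforming two-byte overlong UTF-8 sequences."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x80:
            raise ValueError("only ASCII input is supported in this sketch")
        out.append(0xC0 | (cp >> 6))    # lead byte: 0xc0 or 0xc1
        out.append(0x80 | (cp & 0x3F))  # continuation byte with the low 6 bits
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical test body; every byte written here is >= 0x80.
    message = "If you see this, the parser accepts overlong UTF-8 sequences."
    with open("overlong-test.txt", "wb") as fh:
        fh.write(encode_overlong_2byte(message))
```

A decoder that enforces the shortest-form rule should refuse to turn these bytes back into the ASCII message.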


-- System Information:
Debian Release: jessie/sid
 APT prefers unstable
 APT policy: (990, 'unstable'), (500, 'experimental')
Architecture: i386 (x86_64)
Foreign Architectures: amd64

Kernel: Linux 3.16-2-amd64 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages lynx-cur depends on:
ii  libbsd0            0.7.0-2
ii  libbz2-1.0         1.0.6-7
ii  libc6              2.19-11
ii  libgcrypt20        1.6.2-3
ii  libgnutls-deb0-28  3.3.8-2
ii  libidn11           1.29-1
ii  libncursesw5       5.9+20140913-1
ii  libtinfo5          5.9+20140913-1
ii  zlib1g             1:1.2.8.dfsg-2

--
Jakub Wilk

Attachment: utf8.html.gz
Description: application/gzip
