Control: found -1 iceweasel/30.0-2
On Wed, 24 Jul 2013 13:34:08 +0200 Carlo Stemberger wrote:

[...]
> Hi,
> I set the Fallback Character Encoding to UTF-8, but decoding doesn't
> work properly with this[1] page. No problem by using Chromium.

Hello,
I am also experiencing this issue.

Actually, the situation seems to be even worse with the version
currently in Debian testing (iceweasel/30.0-2).

I have the following locale settings:

  $ locale
  LANG=en_US.UTF-8
  LANGUAGE=
  LC_CTYPE="en_US.UTF-8"
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE="en_US.UTF-8"
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=

and I set Fallback Character Encoding to "Default for Current Locale"
(in Edit menu, Preferences, Content section, Advanced... dialog
window), which I understand should result in UTF-8 for my case!

Despite all this, I often see that pages which do not explicitly
declare the charset are displayed with "Western" character encoding
(I see this in the View menu, Character Encoding submenu).
I would instead expect to see them displayed with "Unicode"
encoding...

One example is the web archive for Debian mailing lists, such as:
https://lists.debian.org/debian-security-tracker/2014/07/maillist.html

Another example is the following minimal HTML file:

  $ cat hello.html
  <html>
  <head>
  <title>Hello!</title>
  </head>
  <body>
  <h1>Hello → to you!</h1>
  </body>
  </html>
  $ iceweasel -new-tab hello.html

which is incorrectly displayed with "Western" character encoding.
Manually setting "Unicode" encoding (View menu, Character Encoding
submenu) makes the arrow show up correctly.

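For illustration only, here is a minimal Python sketch of the symptom
described above. It is not part of the test files and does not reproduce
Iceweasel's detection logic; it only shows what the UTF-8 bytes of the
arrow look like when decoded with the wrong encoding. It assumes that
"Western" in the Character Encoding submenu corresponds to windows-1252
(cp1252 in Python); the helper name decode_as is made up for this sketch.

```python
# Sketch: decode the <h1> line of hello.html as UTF-8 and as "Western".
# Assumption: "Western" means windows-1252 (Python codec name: cp1252).

ARROW = "\u2192"  # the → character used in hello.html


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with the given encoding (hypothetical helper)."""
    return raw.decode(encoding)


# The bytes as they are stored in a UTF-8 encoded file.
raw = f"<h1>Hello {ARROW} to you!</h1>".encode("utf-8")

as_unicode = decode_as(raw, "utf-8")   # what "Unicode" should display
as_western = decode_as(raw, "cp1252")  # what a "Western" fallback displays

print(as_unicode)
print(as_western)
```

Under the stated assumption, the arrow is stored as the three bytes
E2 86 92, and a windows-1252 decoder turns each byte into a separate
character, so the heading shows three unrelated glyphs instead of one
arrow.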
Adding XML and DOCTYPE declarations does not seem to help:

  $ cat hello_strict.html
  <?xml version="1.0" encoding="utf-8"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
  <title>Hello!</title>
  </head>
  <body>
  <h1>Hello → to you!</h1>
  </body>
  </html>
  $ iceweasel --new-tab hello_strict.html

again incorrectly displayed with "Western" character encoding.

Adding the Content-Type meta declaration finally makes Iceweasel
recognize the actual encoding (UTF-8):

  $ cat hello_strict_expchar.html
  <?xml version="1.0" encoding="utf-8"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
  <title>Hello!</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  </head>
  <body>
  <h1>Hello → to you!</h1>
  </body>
  </html>
  $ iceweasel --new-tab hello_strict_expchar.html

but in this final case, I understand that the fallback mechanism
is not used at all (please correct me, if I am wrong).

Is there any progress on this bug?
Please fix it and/or forward the report upstream.

Thanks for your time!
Bye.

-- 
 http://www.inventati.org/frx/
 fsck is a four letter word...
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82 3925 3E1C 27E1 1F69 BFFE