[Mailman-Users] Charset chaos

2022-01-03 Thread Johannes Rohr

Dear all,

I am trying to understand how to fix this issue, sorry if it came up 
before. It must have, still I couldn't find a working solution through 
Google:


The default language of my server is de (German), which has several 
non-ASCII characters (umlauts & ß).


Lately the issue is that in the standard setting 
(DEFAULT_SERVER_LANGUAGE = 'de'), all the system messages are rendered 
properly (for instance on the list overview page: "Für Administratoren 
der Listen gibt es die Seite Übersichtsseite für Listenadministratoren 
zur Verwaltung der eigenen Liste."), but the UTF-8 of the list 
descriptions is rendered as two-character sequences: "Diskussion 
(Ã¶ffentlich)" instead of "Diskussion (Öffentlich)". The same is true 
for the footers of all messages delivered to subscribers: "FÃ¼r 
Eintragung, LÃ¶schung (...)" instead of "Für Eintragung, Löschung (...)".


Now, I tried to remedy this by adding "add_language('de', 'Deutsch', 
'utf-8', 'ltr')" to /etc/mailman/mm_cfg.py but after doing so, while the 
UTF-8 code is now rendered correctly as single characters, the 
translated system messages are now being garbled: "Unten finden Sie eine 
Aufstellung aller �ffentlichen Mailinglisten (...)"
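
Both symptoms described above can be reproduced outside Mailman. The following is a minimal Python 3 sketch, purely illustrative and not Mailman code: the first conversion shows UTF-8 bytes being read as iso-8859-1 (each umlaut turns into two characters), the second shows iso-8859-1 bytes being read as UTF-8 (the umlaut byte is invalid and is replaced by U+FFFD).

```python
# Illustration only: how the two kinds of garbling arise.
footer = "Für Eintragung, Löschung"

# UTF-8 bytes interpreted as ISO-8859-1: every umlaut becomes two characters.
utf8_as_latin1 = footer.encode("utf-8").decode("iso-8859-1")

# ISO-8859-1 bytes interpreted as UTF-8: the single umlaut byte is not valid
# UTF-8, so a lenient decoder substitutes the replacement character U+FFFD.
latin1_as_utf8 = "öffentlichen".encode("iso-8859-1").decode("utf-8", errors="replace")

print(utf8_as_latin1)
print(latin1_as_utf8)
```

So the page charset and the encoding of every string placed on the page have to agree; changing only one side moves the garbling from one set of strings to the other.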


How do I ensure that both the system messages and the list-specific 
texts are rendered correctly? The .po file for de seems 
to be in latin1 (iso-8859-1), not UTF-8. Is this on purpose? The Russian 
one is in proper UTF-8, but the French and Spanish ones are also in 
Latin1. The server is on Ubuntu 20.04, running mailman 2.1.29, which 
seems to be way behind upstream.


I kind of half fixed it locally by

1. recoding the po file from latin1 to utf-8,
2. re-running msgfmt,
3. re-running dpkg-reconfigure -plow mailman and selecting the default
   language
4. adding add_language('de', 'Deutsch', 'utf-8', 'ltr') to
   /etc/mailman/mm_cfg.py
5. restarting mailman
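
Step 1 can be sketched in Python 3 as follows. This is a hedged illustration, not a Mailman tool: the function name `recode_po` and the sample catalog are made up for the example, and it assumes the catalog header declares its charset in the usual gettext `Content-Type` line, which should be updated together with the bytes so that msgfmt sees a consistent file.

```python
import re

def recode_po(data: bytes, old: str = "iso-8859-1", new: str = "utf-8") -> bytes:
    """Re-encode .po file contents and update the charset named in the header."""
    text = data.decode(old)
    # The header entry contains a line such as
    #   "Content-Type: text/plain; charset=ISO-8859-1\n"
    # Only the first occurrence (the header) is rewritten.
    text = re.sub(
        r"(Content-Type:[^\n]*charset=)[-\w]+",
        r"\g<1>" + new,
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    return text.encode(new)

# Hypothetical miniature catalog standing in for the real German mailman.po.
sample = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Content-Type: text/plain; charset=ISO-8859-1\\n"\n'
    '\n'
    'msgid "public"\n'
    'msgstr "öffentlich"\n'
).encode("iso-8859-1")

converted = recode_po(sample)
print(converted.decode("utf-8"))
```

For a real catalog, the bytes would be read from and written back to the .po file (after making a backup), and msgfmt would then be re-run on the result as in step 2. Where the catalog lives depends on the installation.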

At first glance, everything seems to be in order now, except that the main 
welcome pages both for users and administrators are now stubbornly in 
English, while everything else is in German, now with proper umlauts. Is 
this an issue with the Debian/Ubuntu package? What is the canonical way 
of dealing with it?


Thanks a lot for your advice!

Johannes
--
Mailman-Users mailing list -- mailman-users@python.org
To unsubscribe send an email to mailman-users-le...@python.org
https://mail.python.org/mailman3/lists/mailman-users.python.org/
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: https://www.mail-archive.com/mailman-users@python.org/
   https://mail.python.org/archives/list/mailman-users@python.org/


[Mailman-Users] Re: Charset chaos

2022-01-03 Thread Mark Sapiro

On 1/3/22 4:12 PM, Johannes Rohr wrote:


> Lately the issue is that in the standard setting
> (DEFAULT_SERVER_LANGUAGE = 'de'), all the system messages are rendered
> properly (for instance on the list overview page: "Für Administratoren
> der Listen gibt es die Seite Übersichtsseite für Listenadministratoren
> zur Verwaltung der eigenen Liste."), but the UTF-8 of the list
> descriptions is rendered as two-character sequences: "Diskussion
> (Ã¶ffentlich)" instead of "Diskussion (Öffentlich)". The same is true
> for the footers of all messages delivered to subscribers: "FÃ¼r
> Eintragung, LÃ¶schung (...)" instead of "Für Eintragung, Löschung (...)".



Yes, that's because the charset for German is iso-8859-1, so those 
attributes need to be set as iso-8859-1 strings or ...


> Now, I tried to remedy this by adding "add_language('de', 'Deutsch',
> 'utf-8', 'ltr')" to /etc/mailman/mm_cfg.py but after doing so, while the
> UTF-8 code is now rendered correctly as single characters, the
> translated system messages are now being garbled: "Unten finden Sie eine
> Aufstellung aller �ffentlichen Mailinglisten (...)"
>
> How do I ensure that both the system messages and the list-specific
> texts are rendered correctly? The .po file for de seems
> to be in latin1 (iso-8859-1), not UTF-8. Is this on purpose? The Russian
> one is in proper UTF-8, but the French and Spanish ones are also in
> Latin1. The server is on Ubuntu 20.04, running mailman 2.1.29, which
> seems to be way behind upstream.



Mailman 2.1 is almost 20 years old and predates the common use of 
unicode and utf-8. That's why the German message catalog is iso-8859-1 
encoded. What you did below is the appropriate fix.


And yes, Debian/Ubuntu packages are old. See 
https://wiki.list.org/x/17891606 for what to do about that.




> I kind of half fixed it locally by
>
> 1. recoding the po file from latin1 to utf-8,
> 2. re-running msgfmt,
> 3. re-running dpkg-reconfigure -plow mailman and selecting the default
>    language
> 4. adding add_language('de', 'Deutsch', 'utf-8', 'ltr') to
>    /etc/mailman/mm_cfg.py
> 5. restarting mailman
>
> At first glance, everything seems to be in order now, except that the main
> welcome pages both for users and administrators are now stubbornly in
> English, while everything else is in German, now with proper umlauts. Is
> this an issue with the Debian/Ubuntu package? What is the canonical way
> of dealing with it?


If mm_cfg.py has

DEFAULT_SERVER_LANGUAGE = 'de'

the listinfo and admin overview pages should be rendered in German. If

dpkg-reconfigure -plow mailman

doesn't set that, it's a Debian/Ubuntu issue.
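
One way to check what the configuration actually contains after running dpkg-reconfigure is to inspect mm_cfg.py without importing it. The following Python 3 sketch is only an illustration: the helper name `default_server_language` and the sample configuration text are assumptions, and it relies on the file being simple enough to parse with Python 3's `ast` module.

```python
import ast

def default_server_language(cfg_source: str):
    """Return the last string assigned to DEFAULT_SERVER_LANGUAGE, or None."""
    value = None
    for node in ast.parse(cfg_source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            is_setting = (
                isinstance(target, ast.Name)
                and target.id == "DEFAULT_SERVER_LANGUAGE"
            )
            is_string = (
                isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)
            )
            if is_setting and is_string:
                value = node.value.value
    return value

# Hypothetical contents standing in for /etc/mailman/mm_cfg.py.
sample_cfg = """
from Defaults import *
DEFAULT_SERVER_LANGUAGE = 'de'
add_language('de', 'Deutsch', 'utf-8', 'ltr')
"""

print(default_server_language(sample_cfg))
```

If this reports None or 'en' for the real file, the setting was not written by the package tooling and can be added by hand.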

--
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Charset chaos

2022-01-03 Thread Stephen J. Turnbull

Mark Sapiro writes:

 > Mailman 2.1 is almost 20 years old and predates the common use of
 > unicode and utf-8. That's why the German message catalog is
 > iso-8859-1 encoded. What you did below is the appropriate fix.

Speaking to Johannes and those in his situation:

To be honest, in the past we could have done for you what you did for
yourself, but that would surely cause the opposite problem for a lot
of existing systems, not to forget annoying the translators.  For many
years, our European users were perfectly happy with iso-8859-1, and
many continued to use that by default in their text (or at least were
happy to do so for the relatively few changes they tended to make in
footers, list descriptions, and the like).  Nowadays of course most
systems default to UTF-8, there's little reason to ever use anything
different, and so you unfortunately ended up with a system with mixed
charsets (I wouldn't go so far as "chaos," I live in Japan and know
chaos in charsets intimately ;-).

Until Mailman 2 went EOL from a development point of view a few years
ago, we (well, I, but I expect the other devs will agree) didn't
think going "all UTF-8" was worth the chaos that would certainly have
ensued for existing Mailman 2 installations that hadn't been
reconfigured in a decade or so.  Sorry about being so inertial
about it, but I think it still was the right way to go.  Now of course
Mailman 2 is EOL, so it makes sense to support you individually (OK,
you supported yourself, and we just say "well done" ;-), but not to
make changes to Mailman 2.

Sincere regards,
Steve
