On Fri, Jan 10, 2020 at 8:38 PM Matemática A3K <matematica....@gmail.com> wrote:
> To recap the discussion so far on the mailing list, Trac and GitHub:
>
> The problem originally reported in #30439 was about mixed plural forms in
> the catalogs bundled with Django, which led to broken translations.
>
> On top of that came the unannounced changes to the plural forms of some
> locales, which broke users' translations.
>
> These problems occur when the catalogs are updated in Django and users
> upgrade their version.
>
> Michal proposed not merging and doing catalog look-ups per language.
> Claude proposed the "dict-merge" policy (keep a dict of merged catalogs
> keyed by their plural forms), which is a variant of Michal's.
>
> The "dict-merge" policy has the problem that updates to the plural forms
> in Django won't reach users' catalogs.
>
> Whether to merge the catalogs or not (the merging policy) is about how to
> encourage the user to take action, not the fix itself.
>
> The fix lies in warning about the situation and in the tools for
> addressing it.
>
> The proposed fix is:
> 1- Warn the user at the system check level and at run-time.
> 2- Use the "not-merge" policy to encourage action.
> 3- Provide the makemessages --comply-plural-forms tool so both Django and
> its users can conveniently keep consistent plural forms in their catalogs.
> 4- Provide the LOCALE_ROOT setting and the makemessages
> --collect-base-catalogs tool so users can conveniently make changes to
> their plural forms that persist across updates.
>
> The PR is cluttered with a lot of consistency fixes across all the
> catalogs bundled with Django and the catalogs used for tests.
>
> Once a warning is raised at merge time (whether the catalogs are then
> merged or not), most of the tests will fail because of that warning, so
> all bundled catalogs (and those used for tests) have to be aligned in
> order to have things working.
>
> Here are the files of the PR, filtered:
> https://github.com/django/django/pull/12280/files?file-filters%5B%5D=.py&file-filters%5B%5D=.txt
>
> To ensure the consistency of Django's catalogs, there is a test that fails
> if any catalog is unaligned. The next time Django's catalogs are updated,
> CI will check the consistency. If the check fails, whoever does the merge
> has to run "python -m django makemessages --comply-plural-forms" and
> "python -m django compilemessages" on the corresponding dir (django.conf,
> django.contrib.app or the test locale) to fix it. This way, the messages
> from the provider are retained and consistency is assured.
>
> There are valid reasons for users to customize their plural forms, e.g.
> fixing a broken one while a fix is on the way, or using another
> implementation of the standard. Having to modify the source distribution
> to customize is not ideal: it is not persistent across updates and may not
> be possible in some setups. This should be done locally, in the project
> tree. This is what the LOCALE_ROOT setting addresses.
>
> Claude already objected to this fix, though he did not provide reasons (
> https://github.com/django/django/pull/12280#issuecomment-571483273).
>
> Does anyone see any rationale, design or implementation problem in the
> fix? Any comment is welcome :)
>

To my surprise, yesterday when I started to work on this issue again - I
thought we had agreed on a review after the code was separated into commits -
I saw that the ticket had been unassigned from me. So, I will leave my
thoughts here.

I think the discussion hasn't been the best, in terms of both
constructiveness and fluidity, but as I had committed to getting this fixed,
I have continued working to get the best for it despite the differences.

I will write down my concerns about the accepted fix (the "dict-merge"
policy).

- Users may be left in an unsatisfactory situation

Plural forms only get into a user's catalog once, when it is created by
makemessages.
Once a catalog is "filled", it is not safe to just update its plural forms,
as that may require modifying its content. Given that broken (buggy) plural
forms have been distributed with Django, those catalogs will remain broken,
and unnoticed, under the "dict-merge" policy.

The only rationale that I can think of for justifying this is "if they don't
see anything broken and they are happy with it, it doesn't matter". This is
not acceptable to me: you should choose with a full understanding of the
situation. If you are happy with it, you may choose to stay in that
situation. If there is another justification, it hasn't been given any of
the several times that I asked for it.

The broken plural forms are mostly broken equations that will never evaluate
to some of their forms, which makes translating those forms wasted effort.
It is one thing to complete plural forms in order to adhere to a standard,
which will give a better expression in some cases - there it is the cost of
having the better expression for some situations, given the design of the
software - and another thing to have to do it in the future without any
purpose, because a broken equation made its way into the catalog. Under the
"don't do to others what you wouldn't like done to you" principle, this is
not acceptable to me either.

If a decision is made, the reasons for it should be stated. Otherwise, it
may lessen confidence in the project. The right thing to do here, to me,
would be to add at least a note saying something like "Buggy equations have
been distributed in the past; you should compare your catalog's plural forms
with the current main plural forms to see if there is any improvement".

This first concern was enough reason for me not to go on with this policy:
users won't get updates unless they look for them. Digging further into the
policy, I also found the following issues.

- It may lead to broken translations in some projects

Django translation support is built upon the GNU gettext toolset.
People who deal with translations have a workflow determined by it. The
expected gettext workflow is to manage plural forms in a centralized manner,
under the assumption that only the main plural forms will be considered. For
example,
https://github.com/django/django/blob/master/django/contrib/contenttypes/locale/ka/LC_MESSAGES/django.po#L19
has no pluralization enabled while other catalogs do.

Managing plural forms in a "de-centralized" way requires ensuring that every
catalog is "right" and "self-contained". Changing to it will require a
revision of the plural forms of all the catalogs involved, in order to check
that no catalog is affected (at the least, check that every catalog's plural
forms are consistent with its content). The change in this assumption may
lead to broken translations in some projects if it goes unnoticed.
Therefore, it should be stated in the docs.

- The order of precedence is not guaranteed

This was my main concern with Michal's original proposal; then I thought it
could be fixed by using something like an ordered dict. Digging deeper, that
produces the same results in some situations but not in all of them. The
first key would be Django's main plural forms; then all the catalogs with
different plural forms would be merged under their corresponding keys,
placed before the first key and taking precedence over the main one. If all
the user's catalogs have plural forms different from the main ones, they
will take precedence over it (as expected), but if there is one at the
highest order of precedence with the main plural forms, it will be merged
into the first key, taking precedence over Django's (expected) but not over
the rest (unexpected). Therefore, a note saying something like "Using
catalogs with different plural forms may lead to an unexpected order of
precedence" should be added if this behavior is deemed appropriate.

- -

Although this policy will bring plural forms customization out of the box
(i.e.
for convenience when including third-party message files or using your own
plural forms), it will have the undesired results previously described. The
same goals can be achieved without those "side effects" by the path I was
working on.

The main concerns raised about it were its length and that it changes the
way Django handles translation files fundamentally. It is indeed lengthy,
but mostly because of the tools, not because of the changes to the existing
Django code. Now that it has been separated into commits, this can be seen
more clearly:
https://github.com/django/django/pull/12357/commits/d63d9c4cdebae1b5fcb23988c9d757dcc71e124b
https://github.com/django/django/pull/12357/commits/795fdc353b56dc268f4f713290299e96117422f0

The plural forms consistency mode is just a run-time warning raised at merge
time and collected by a system check for the available languages, enabled by
a setting. I don't find it too long (besides the code for plural form
parsing, taken from the tool), and I don't find that it changes the way
Django handles translation files. Even under the "no-merge" policy - which I
proposed at the beginning - it didn't; it just added a "stop" to encourage
user action (which nobody objected to previously). Files were handled the
same way once the assumption was confirmed.

I added the setting disabled by default because of Tim's comment on another
thread (roughly, "raising a warning about a documented behavior may be rude
to developers who knew about it and chose to use it").

This mode will also act as the "release notes for plural forms": once Django
updates them, it will trigger the warning to make you check (changes in the
number of plurals may lead to broken translations; changes in the equation
should be bugfixes).

The LOCALE_ROOT setting doesn't change the way Django handles files; it only
allows customizing where the merging process starts. Since you set it
locally in your project, you will be able to customize plural forms there.
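
A minimal sketch of the consistency check described above: compare a
catalog's Plural-Forms header with the main one and warn at merge time when
they differ. The helper names and the purely textual comparison of equations
are assumptions made for illustration; this is not the code from the PR.

```python
import re
import warnings

# Hypothetical helpers for illustration; not part of Django or of the PR.
PLURAL_RE = re.compile(r"nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*(.+?)\s*;?\s*$")


def parse_plural_forms(header):
    """Return (nplurals, equation) from a Plural-Forms header value."""
    match = PLURAL_RE.search(header.strip())
    if match is None:
        raise ValueError(f"unparseable Plural-Forms header: {header!r}")
    nplurals = int(match.group(1))
    # Drop whitespace so cosmetic differences do not count as a mismatch.
    equation = re.sub(r"\s+", "", match.group(2))
    return nplurals, equation


def check_plural_forms(main_header, other_header, catalog_name):
    """Warn and return False when a catalog disagrees with the main one."""
    main = parse_plural_forms(main_header)
    other = parse_plural_forms(other_header)
    if main == other:
        return True
    if main[0] != other[0]:
        reason = f"nplurals differs ({other[0]} vs {main[0]})"
    else:
        reason = "same nplurals but a different equation"
    warnings.warn(f"{catalog_name}: {reason}", RuntimeWarning)
    return False
```

Because the comparison is textual, two equations that are mathematically
equivalent but written differently would still be reported, which errs on
the side of asking the user to check.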
The tools are lengthy, indeed, but not complicated in my biased opinion.

--collect-bundled runs xgettext over the source files and msgcat with the
bundled catalogs to deliver the result into LOCALE_ROOT:
https://github.com/django/django/pull/12357/commits/479fca71108dbb2c2565583b7a17784fdcadc9fd

--update-plural-forms allows automating plural expansions, trimming and
reordering [*]:
https://github.com/django/django/pull/12357/commits/c5510cf14963c51dbbc3bc454e72346309a1b862

[*] Michal's last comment kept me thinking, and I think another iteration is
needed.

The tools could also be left out.

That was my rationale :)

>
> On Mon, Jan 6, 2020 at 4:17 PM Matemática A3K <matematica....@gmail.com>
> wrote:
>
>>
>> On Fri, Dec 13, 2019 at 1:04 AM Matemática A3K <matematica....@gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Dec 6, 2019 at 2:14 AM Matemática A3K <matematica....@gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Thu, Dec 5, 2019 at 4:23 PM Maciek Olko <maciej.o...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I am wondering if Django shouldn't use Unicode Plural Rules as
>>>>> standard and promote it for third-party apps. Even if sometimes the
>>>>> number of forms is not applicable to certain cases, there may be
>>>>> cases when all of the forms will be needed.
>>>>>
>>>>
>>>> I agree for Django; for third-party translations, from what I understood
>>>> (CIIW), this may be a differentiator
>>>>
>>>
>>> Under this rationale, this would be the fix:
>>>
>>> .. _plural-forms:
>>>
>>> Plural Forms
>>> ~~~~~~~~~~~~
>>>
>>> Django does not support multiple plural forms in catalogs. As all
>>> translation catalogs are merged, only the plural form in the main Django
>>> po file (located in
>>> ``django/conf/locale/<lang_code>/LC_MESSAGES/django.po``) is considered.
>>>
>>> Plural forms in all other po files are ignored by the GNU gettext
>>> merging process.
>>>
>>> Therefore, you shouldn't use different plural forms in your project or
>>> application po files, as it may lead to unexpected results.
>>>
>>> To prevent inconsistencies and undesired behaviors in translations,
>>> Django will not merge any catalog that contains a different plural form
>>> than the main one and will issue a warning about the conflicting catalog.
>>>
>>> This conflict may arise mostly in two situations:
>>>
>>> * when the main plural form for a language is updated
>>>   in Django and your po files were generated with a previous one, or
>>> * when including third-party translations with a different plural form.
>>>
>>> Django follows the standards provided by `Unicode
>>> <http://www.unicode.org/cldr/charts/latest/supplemental/language_plural_rules.html>`_
>>> for plural equations for each language, and encourages aligning with them.
>>>
>>> If you generated your catalog with a previous version of Django, the
>>> standard may have changed or a bug may have been fixed.
>>>
>>> To align with the new version of the standard (or to address the bug),
>>> you may use
>>> :djadmin:`django-admin makemessages --comply-plural-form<makemessages>`
>>> to reorganize your catalog so it is aligned with it.
>>>
>>> The script will ask you to map each plural form in the new form to your
>>> catalog's forms. After this, the messages in your catalog will be
>>> reorganized accordingly. In the case of an increase in the number of
>>> plurals, it will fill those with the previously chosen forms so you avoid
>>> having translation results in the fallback language or the original
>>> string - this will produce the same results as before, though you may
>>> want to update those later for a better expression of the language if
>>> applicable.
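
As a rough illustration of the remapping step described in the quoted draft
(each plural slot of the new form is mapped to one of the catalog's existing
forms, and extra slots reuse previously chosen forms), here is a minimal
sketch. The function name and the mapping format are hypothetical and are
not taken from the PR.

```python
def remap_plural_msgstrs(old_msgstrs, mapping):
    """Build the msgstr list for the new plural form.

    mapping[i] is the index of the existing translation to reuse for new
    plural slot i, so several new slots may point at the same old form.
    """
    for old_index in mapping:
        if not 0 <= old_index < len(old_msgstrs):
            raise ValueError(f"no existing plural form at index {old_index}")
    return [old_msgstrs[old_index] for old_index in mapping]


# Going from 2 plurals to 4: the extra slots reuse the old plural form.
old = ["%d item", "%d items"]
new = remap_plural_msgstrs(old, [0, 1, 1, 1])
```

With this mapping, lookups give the same strings as before the change; the
reused slots can be refined later by a translator.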
>>>
>>> In the cases where you want to use a "bleeding-edge-yet-to-be-approved"
>>> plural form, define the :setting:`LOCALE_ROOT` and use
>>> :djadmin:`django-admin makemessages --collect-base-catalogs<makemessages>`.
>>> This will merge all the locale catalogs bundled with Django into your
>>> :setting:`LOCALE_ROOT` and use them as your main Django po file.
>>>
>>> .. versionchanged:: 3.1
>>>
>>>     Handling plural forms as described above was added.
>>>
>>> The problem that I see with updating plural forms in the package tree is
>>> that modifying the base package does not seem to be a good practice to me
>>> (independently of not being persistent across updates); you either copy,
>>> subclass, monkey patch, etc. So, I think this is a more proper solution.
>>>
>>> Any thoughts?
>>>
>>
>> Well, it seems that the Three Magic Kings have left us this
>> <https://github.com/django/django/pull/12280>
>>
>>>>
>>>>> Especially if implementing having different plural rules for various
>>>>> apps is not trivial.
>>>>>
>>>>> For the record, here is a section of Transifex documentation that
>>>>> describes their statement about plural rules and Unicode standard:
>>>>> https://docs.transifex.com/localization-tips-workflows/plurals-and-genders#how-pluralized-strings-are-handled-by-transifex
>>>>> .
>>>>>
>>>>> Regards,
>>>>> Maciej
>>>>>
>>>>> On Thu, Dec 5, 2019 at 08:00 Matemática A3K <matematica....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> While testing the "not-merging" policy, I got this:
>>>>>> https://pastebin.com/ihyAiYtc
>>>>>> Those warnings should get to the Translators teams if they are not
>>>>>> looking here, i.e.
>>>>>>
>>>>>> https://github.com/django/django/blob/master/django/contrib/sessions/locale/he/LC_MESSAGES/django.po
>>>>>> is still using a 2-plural form where the main form has 4 plurals. It
>>>>>> doesn't matter in this case because there are no translations with
>>>>>> plurals, but it triggers the warning because the pf hasn't been
>>>>>> updated. It should be updated if the Transifex front-end uses the pf
>>>>>> of a file to show the options for translating, and to be in line with
>>>>>> the main form.
>>>>>>
>>>>>>
>>>>>> On Thu, Dec 5, 2019 at 12:10 AM Matemática A3K <
>>>>>> matematica....@gmail.com> wrote:
>>>>>>
>>>>>>> On Wed, Dec 4, 2019 at 2:25 AM Claude Paroz <cla...@2xlibre.net>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Wednesday, December 4, 2019 at 03:41:51 UTC+1, Matemática A3K wrote:
>>>>>>>>>
>>>>>>>>> (...)
>>>>>>>>>
>>>>>>>>> But then I realized that there is a major caveat to this approach:
>>>>>>>>> updates to the plural equation won't reach users' catalogs, because
>>>>>>>>> their catalogs will be kept separate once the plural form differs.
>>>>>>>>> This would be the case that Shai pointed out on Django 2.2 with the
>>>>>>>>> incorrect plural equation for HE. People who generated their
>>>>>>>>> catalogs with makemessages on 2.2 will have a wrong plural equation
>>>>>>>>> and, once it is fixed in a new release, the fix won't reach their
>>>>>>>>> catalogs because they will be kept apart.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry, I'm not following you here. The catalog merge process is
>>>>>>>> happening in realtime when Django starts, so any po file update is
>>>>>>>> instantly reflected in the translation infrastructure. I don't see any
>>>>>>>> caveat here.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, but only for the default language; the rest are lazily loaded on
>>>>>>> demand. This is why I think it is better to do the check at the
>>>>>>> system level: if you have more than one language available, the
>>>>>>> others will only be merged once something triggers their loading
>>>>>>> (like an activate("LANGUAGE_CODE")), and only then would you start to
>>>>>>> see the warnings.
>>>>>>>
>>>>>>> As an example of the caveat of dict-merging, it would go like this:
>>>>>>>
>>>>>>> - Someone starts a translation for Hebrew with Django 2.2 via
>>>>>>> makemessages, which copies the main plural form to the new file, and
>>>>>>> fills in the translations.
>>>>>>> - The user upgrades to Django 2.2.x or later, which contains a fixed
>>>>>>> plural form for Hebrew.
>>>>>>> - Under the "dict-merge" policy, the Django translation catalog would
>>>>>>> have 2 entries: one for the main plural form (the new one) and
>>>>>>> another for the catalog generated in the past (the user's one).
>>>>>>>
>>>>>>> Strings in the user catalog would use a "worse" plural equation,
>>>>>>> while strings in the Django catalog would use the better one. Updates
>>>>>>> won't reach users' catalogs unless they explicitly update them,
>>>>>>> because once the user's po file is created, makemessages won't update
>>>>>>> the header; it will do a msgmerge (
>>>>>>> https://github.com/django/django/blob/master/django/core/management/commands/makemessages.py#L603
>>>>>>> ).
>>>>>>>
>>>>>>> That's why something like "makemessages --update-plural-form" and a
>>>>>>> warning about different plural forms in the catalog would be needed,
>>>>>>> even under "dict-merge".
>>>>>>>
>>>>>>> With the current "merge" policy, because only the main form is used,
>>>>>>> users would get the update, but they won't be able to have parts of
>>>>>>> the catalog under a different pf.
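
To make the caveat in the quoted example concrete, here is a small sketch of
a "dict-merge" style catalog in which messages are grouped by plural
equation and each group keeps using its own equation. The two equations and
the message data are illustrative only and are not copied from any Django
release; the function names are hypothetical; precedence between groups is
deliberately ignored. gettext.c2py is an undocumented helper in the Python
standard library that turns a C-style plural expression into a function.

```python
import gettext

# Illustrative equations only; not copied from any Django release.
MAIN_PLURAL = "(n == 1) ? 0 : (n == 2) ? 1 : (n > 10 && n % 10 == 0) ? 2 : 3"
USER_PLURAL = "(n != 1)"

# (plural equation, messages) pairs: the bundled catalog and a user catalog
# that was generated before the main equation changed.
catalogs = [
    (MAIN_PLURAL, {"day": ["d0", "d1", "d2", "d3"]}),
    (USER_PLURAL, {"item": ["i0", "i1"]}),
]


def dict_merge(catalogs):
    """Group messages by plural equation instead of merging everything."""
    merged = {}
    for plural, messages in catalogs:
        merged.setdefault(plural, {}).update(messages)
    return merged


def lookup_plural(merged, msgid, n):
    """Find msgid in a group and index it with that group's own equation."""
    for plural, messages in merged.items():
        if msgid in messages:
            return messages[msgid][gettext.c2py(plural)(n)]
    return msgid


merged = dict_merge(catalogs)
```

Here lookup_plural(merged, "day", 20) is resolved with the main equation,
while lookup_plural(merged, "item", 20) keeps being resolved with the old
one: the user's strings never see the updated equation unless the user's
catalog header is changed explicitly.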
>>>>>>>
>>>>>>> The way to have a plural form different from the main one with the
>>>>>>> current code base would be to have a custom variant of the language
>>>>>>> (which implements "the spirit of dict-merge": independent catalogs
>>>>>>> with an order of precedence). It's not ideal: you will lose the
>>>>>>> ability to use the browser's locale config to display the language
>>>>>>> out of the box unless you add extra code to handle it, something like
>>>>>>> "if locale == 'he': use 'he_SP'" (no other caveat comes to my mind).
>>>>>>> But it would be the way of having it right now without losing the
>>>>>>> updates.
>>>>>>>
>>>>>>>>
>>>>>>>> Claude
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Django developers (Contributions to Django itself)" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to django-developers+unsubscr...@googlegroups.com.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/django-developers/5f2bf52a-997c-467d-b927-f2959507d32e%40googlegroups.com
>>>>>>>> .