René,

On Tuesday, July 15, 2025 11:38:37 PM Mountain Standard Time René Engelhard wrote:
> >Country or region specific dictionaries should only exist if they actually
> >contain distinct country or region specific information. So, for example, if
> >upstream shipped a nl_BE.dic that was different than the main nl.dic, then
> >that file should be shipped in Debian. In this case, the upstream project
> >does not produce any country or region specific dictionaries, but rather only
> >one language dictionary, which they name nl.dic.
>
> That would be ideal, yes.
>
> That is just not how it works... It worked the current way since the 2000s.
> No reason to immediately change it now.

I lived for two years in Perú and speak fairly decent Spanish. When I go to
enable Spanish spell checking in LibreOffice, I see the following list of
options:

    Spanish (Argentina)
    Spanish (Bolivia)
    Spanish (Chile)
    Spanish (Colombia)
    Spanish (Costa Rica)
    Spanish (Dom. Rep.)
    Spanish (Ecuador)
    Spanish (El Salvador)
    Spanish (Guatemala)
    Spanish (Honduras)
    Spanish (Mexico)
    Spanish (Nicaragua)
    Spanish (Panama)
    Spanish (Paraguay)
    Spanish (Peru)
    Spanish (Puerto Rico)
    Spanish (Spain)
    Spanish (Uruguay)
    Spanish (Venezuela)

Before I had looked into this closely, I assumed this meant that Debian was
shipping distinct dictionaries for each of these countries. There is some
regional variation in Spanish. Argentina, for example, uses a verb tense that
is not common elsewhere. Mexico, especially Mexico City, uses a lot of slang
that approaches an official custom vocabulary. In other cases, there is very
little variation. I think anyone would be hard pressed to describe any
dictionary differences between Peru and Ecuador or Bolivia.

But the truth is that this entire list is a farce. There is not a single
difference in LibreOffice between selecting any of them. They are all just
symlinks to es_ES.dic (Spain). So, even if I select Argentina because I want
Argentina-specific spell checking, I don’t get it. I consider this to be false
advertising.

In the case of Dutch, the upstream project does not produce any
country-specific Hunspell dictionaries. They ship one dictionary named nl.dic.

https://github.com/OpenTaal/opentaal-hunspell/

They do not ship four separate dictionaries for the Netherlands (nl_NL),
Belgium (nl_BE), Aruba (nl_AW), and Suriname (nl_SR), which were the four
symlinks shipped previously. Interestingly, this is not a complete list of
possible Dutch country/region specific codes, just like the above Spanish list
is not a complete list of all the possible codes.
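
For anyone who wants to check the symlink situation on their own system, here
is a minimal sketch in Python. It assumes the Hunspell dictionaries are
installed under /usr/share/hunspell (the usual location on Debian, but adjust
as needed), and the function name group_dictionaries is just something I made
up for illustration. It groups every .dic name by the real file it resolves
to, so a set of aliases that all point at one dictionary shows up as a single
group.

```python
import os
from collections import defaultdict


def group_dictionaries(directory="/usr/share/hunspell"):
    """Map each real .dic file path to the sorted names that resolve to it.

    The default directory is an assumption; pass another path if your
    dictionaries live elsewhere.
    """
    groups = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".dic"):
            continue
        path = os.path.join(directory, name)
        # realpath follows symlinks, so every alias collapses onto its target.
        groups[os.path.realpath(path)].append(name)
    return dict(groups)


if __name__ == "__main__":
    directory = "/usr/share/hunspell"  # assumed location
    if os.path.isdir(directory):
        for target, names in sorted(group_dictionaries(directory).items()):
            print(f"{target}: {' '.join(names)}")
    else:
        print(f"{directory} not found")
```

Any group with more than one name in it is one dictionary being presented
under several country codes.
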
For Dutch, at least those in the following link are possible:

https://localizely.com/language-code/nl/

In the case of Dutch, LibreOffice only recognizes two language codes, nl_NL
and nl_BE, which makes the previous shipping of nl_AW and nl_SR in Debian
superfluous.

The change I have already made to the package, and which has already been
unblocked to migrate to testing, is to ship one language-specific code: nl_NL.
This preserves the ability of Dutch users of LibreOffice to enable Dutch spell
checking. LibreOffice’s GUI lists this language as Dutch (Netherlands). This
is not completely accurate, as there is nothing about the dictionary we are
shipping that has specific information about the Netherlands in it. So, I
consider doing so a temporary workaround. However, it is more accurate than
shipping two symlinks, one for nl_NL and the other for nl_BE. That would
falsely advertise that two separate dictionaries are provided.

The solution to this problem is for LibreOffice to correctly enumerate the
Hunspell dictionaries present on the system, including any dictionaries
without country/region specific codes. As suggested, I will file an upstream
LibreOffice bug and reference this bug and #1109355.

-- 
Soren Stoutner
so...@debian.org