René,

On Tuesday, July 15, 2025 11:38:37 PM Mountain Standard Time René Engelhard 
wrote:
> >Country or region specific dictionaries should only exist if they actually
> >contain distinct country or region specific information.  So, for example,
> >if upstream shipped a nl_BE.dic that was different than the main nl.dic,
> >then that file should be shipped in Debian.  In this case, the upstream
> >project does not produce any country or region specific dictionaries, but
> >rather only one language dictionary, which they name nl.dic.
> 
> That would be ideal, yes.
> 
> That is just not how it works... It worked the current way since the 2000s.
> No reason to immediately change it now.

I lived for two years in Perú and speak fairly decent Spanish.  When I go to 
enable Spanish spell checking in LibreOffice, I see the following list of 
options:

Spanish (Argentina)
Spanish (Bolivia)
Spanish (Chile)
Spanish (Colombia)
Spanish (Costa Rica)
Spanish (Dom. Rep.)
Spanish (Ecuador)
Spanish (El Salvador)
Spanish (Guatemala)
Spanish (Honduras)
Spanish (Mexico)
Spanish (Nicaragua)
Spanish (Panama)
Spanish (Paraguay)
Spanish (Peru)
Spanish (Puerto Rico)
Spanish (Spain)
Spanish (Uruguay)
Spanish (Venezuela)

Before I had looked into this closely, I assumed this meant that Debian was 
shipping distinct dictionaries for each of these countries.  There is some 
regional variation in Spanish.  Argentina, for example, uses a verb tense that 
is not common elsewhere.  Mexico, especially Mexico City, uses a lot of slang 
that approaches an official custom vocabulary.

In other cases, there is very little variation.  I think anyone would be hard 
pressed to describe any dictionary differences between Peru and Ecuador or 
Bolivia.

But the truth is that this entire list is a farce.  There is not a single 
difference in LibreOffice between selecting any of them.  They are all just 
symlinks to es_ES.dic (Spain).  So, even if I select Argentina because I want 
Argentina-specific spell checking, I don’t get it.
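
One way to check this on a given system is to resolve every .dic file to its 
real path and group the names that end up at the same file.  The following is 
only a minimal sketch: the directory /usr/share/hunspell is an assumption about 
where the dictionaries are installed, and group_by_target is a hypothetical 
helper name, not part of any package.

```python
import os
from collections import defaultdict


def group_by_target(dict_dir):
    """Map each real .dic file to the sorted list of names that resolve to it."""
    groups = defaultdict(list)
    for name in os.listdir(dict_dir):
        if name.endswith(".dic"):
            # realpath follows symlinks, so every alias lands on the same key.
            real = os.path.realpath(os.path.join(dict_dir, name))
            groups[real].append(name)
    return {real: sorted(names) for real, names in groups.items()}


if __name__ == "__main__":
    # Assumed install location; adjust if the dictionaries live elsewhere.
    dict_dir = "/usr/share/hunspell"
    if os.path.isdir(dict_dir):
        for real, names in sorted(group_by_target(dict_dir).items()):
            print(f"{os.path.basename(real)}: {', '.join(names)}")
```

Any output line that lists more than one name means those names share a single 
dictionary file.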

I consider this to be false advertising.

In the case of Dutch, the upstream project does not produce any 
country-specific Hunspell dictionaries.  They ship one dictionary named nl.dic.

https://github.com/OpenTaal/opentaal-hunspell/

They do not ship four separate dictionaries for the Netherlands (nl_NL), 
Belgium (nl_BE), Aruba (nl_AW), and Suriname (nl_SR), which were the four 
symlinks shipped previously.  Interestingly, this is not a complete list of 
possible Dutch country/region specific codes, just like the above Spanish list 
is not a complete list of all the possible codes.  For Dutch, at least those in 
the following link are possible:

https://localizely.com/language-code/nl/

In the case of Dutch, LibreOffice only recognizes two language codes, nl_NL 
and nl_BE, which makes the previous shipping of nl_AW and nl_SR in Debian 
superfluous.

The change I have already made to the package, and which has already been 
unblocked to migrate to testing, is to ship one language specific code:  
nl_NL.  This preserves the ability of Dutch users of LibreOffice to enable 
Dutch spell checking.  In LibreOffice’s GUI, it lists this language as Dutch 
(Netherlands).  This is not completely accurate as there is nothing about the 
dictionary we are shipping that has specific information about the Netherlands 
in it.  So, I consider doing so a temporary workaround.  However, it is more 
accurate than shipping two symlinks, one for nl_NL and the other for nl_BE.  
That would falsely advertise that two separate dictionaries are provided.

The solution to this problem is for LibreOffice to correctly enumerate the 
Hunspell dictionaries present on the system, including any dictionaries 
without country/region specific codes.  As suggested, I will file an upstream 
LibreOffice bug and reference this bug and #1109355.
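
To illustrate the kind of enumeration I mean, here is a rough sketch that lists 
the dictionaries in a directory and accepts names both with and without a 
country/region code.  This is not how LibreOffice discovers dictionaries; the 
function name list_dictionaries and the simple naming pattern are assumptions 
made for the example.

```python
import os
import re

# Matches "nl" or "nl_NL".  Names with extra suffixes are skipped by this sketch.
NAME_RE = re.compile(r"^([a-z]{2,3})(?:_([A-Z]{2}))?$")


def list_dictionaries(dict_dir):
    """Return (language, region) pairs; region is None for names like nl.dic."""
    found = []
    for name in sorted(os.listdir(dict_dir)):
        stem, ext = os.path.splitext(name)
        if ext != ".dic":
            continue
        # Hunspell needs a matching .aff file next to the .dic file.
        if not os.path.exists(os.path.join(dict_dir, stem + ".aff")):
            continue
        match = NAME_RE.match(stem)
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

With this approach a plain nl.dic would be reported as ("nl", None), so it 
could be offered as "Dutch" without claiming to be specific to any country.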

-- 
Soren Stoutner
so...@debian.org
