Recently I was browsing some bugs in "Core::Spelling checker" and much
to my surprise found four bugs where people complained about wrong or
missing words in the en-US dictionary. There were two bugs where people
complained about words in the German and the French dictionaries.
The German and French bugs were eventually closed as "wontfix" and
"invalid" and referred back to the respective dictionary maintainers.
For French there is a very good approach: the French dictionaries are
maintained via this site: http://www.dicollecte.org/ and imported for
distribution with the French version of Firefox. The situation for
German is not as good, but there is a maintainer whose work is then
turned into an add-on (in fact, sadly, two competing ones).
I was extremely surprised that Mozilla maintains its own version of the
en-US dictionary; you can see the change history here:
https://hg.mozilla.org/mozilla-central/log/tip/extensions/spellcheck/locales/en-US/hunspell/en-US.dic
Basically Ekanan Ketunuti does merges from upstream providers (SCOWL)
but also adds words individually and Ehsan reviews each change.
I think this situation is less than ideal. Firstly, I don't think we
should spend time on individual additions, and secondly, this process
creates quite a few unwanted discrepancies (to avoid the word "mess").
For example the en-US dictionary add-on available at AMO contains many
accented words loaned from other languages, like "Bogotá" or "cliché"
(both with Wikipedia entries), which the Mozilla dictionary is missing.
Also, subtle differences creep in: for example, the add-on dictionary
has "(in/un)feasible" and "(un)feasibly", whereas the Mozilla version
only has "(un)feasible" and "feasibly" (no prefix). A bug is necessary
to correct this. Thirdly, the add-on dictionary contains 13% more words
than the Mozilla-maintained dictionary, and I think in dictionaries,
bigger is better. For example, of the words starting with "zu", the
Mozilla dictionary only knows "zucchini", whereas the add-on dictionary
also knows "Zulu" and others. I'd hate to think that we'd need to file
7265 bugs to add all the missing words.
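
For anyone who wants to reproduce this kind of comparison, here is a
minimal sketch of how two Hunspell .dic files could be diffed. It is not
the script behind the numbers above. It assumes the common layout (an
entry count on the first line, then one "stem/FLAGS" entry per line with
single-character flags), ignores escaped slashes, and compares stems
only, so it does not expand affixes into full word forms. The function
names and the sample entries (including their flags) are made up for
illustration.

```python
def read_stems(lines):
    """Map each stem to the set of affix flags attached to it.

    Simplified reader: assumes single-character flags and drops any
    tab-separated morphological fields after the entry.
    """
    entries = {}
    for i, raw in enumerate(lines):
        line = raw.strip().split("\t")[0]
        if not line:
            continue
        if i == 0 and line.isdigit():
            continue  # first line holds the (approximate) entry count
        stem, _, flags = line.partition("/")
        # A stem may appear more than once; merge its flags.
        entries.setdefault(stem, set()).update(flags)
    return entries


def compare(ours, theirs):
    """Return (stems only in theirs, shared stems whose flags differ)."""
    missing = sorted(set(theirs) - set(ours))
    flag_diffs = sorted(
        stem for stem in set(ours) & set(theirs) if ours[stem] != theirs[stem]
    )
    return missing, flag_diffs


if __name__ == "__main__":
    # Hypothetical excerpts; real input would be read from the two .dic files.
    mozilla = read_stems(["3", "feasible/U", "feasibly", "zucchini/MS"])
    addon = read_stems(
        ["4", "feasible/IU", "feasibly/U", "zucchini/MS", "Zulu/MS"]
    )
    missing, flag_diffs = compare(mozilla, addon)
    print("missing stems:", missing)
    print("stems with different flags:", flag_diffs)
```

The second list is the interesting one for cases like "(in/un)feasible":
the stem is present in both dictionaries, but the affix flags differ.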
Is there a better way to do this? I think this is tedious business and
Mozilla should get out of it.
Jorg K.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform