Recently I was browsing some bugs in "Core::Spelling checker" and, much to my surprise, found four bugs where people complained about wrong or missing words in the en-US dictionary. There were two more bugs where people complained about words in the German and French dictionaries.

The German and French bugs were eventually closed as "wontfix" and "invalid" and referred back to the respective dictionary maintainers. For French there is a very good approach: the French dictionaries are maintained via this site: http://www.dicollecte.org/ and imported for distribution with the French version of Firefox. The situation for German is not as good, but there is a maintainer whose work is then turned into an add-on (in fact, sadly, two competing ones).

I was extremely surprised that Mozilla maintains its own version of the en-US dictionary. You can see the change history here: https://hg.mozilla.org/mozilla-central/log/tip/extensions/spellcheck/locales/en-US/hunspell/en-US.dic

Basically, Ekanan Ketunuti merges changes from the upstream provider (SCOWL) but also adds words individually, and Ehsan reviews each change.

I think this situation is less than ideal. Firstly, I don't think we should spend time on individual additions. Secondly, this process creates quite a few unwanted variations (to avoid using the word "mess"). For example, the en-US dictionary add-on available at AMO contains many accented words borrowed from other languages, like "Bogotá" or "cliché" (both with Wikipedia entries), which the Mozilla dictionary is missing. Subtle differences are created as well: the add-on dictionary has "(in/un)feasible" and "(un)feasibly", whereas the Mozilla version only has "(un)feasible" and "feasibly" (no prefix). A bug is necessary to correct this. Thirdly, the add-on dictionary contains 13% more words than the Mozilla-maintained dictionary, and I think in dictionaries, bigger is better. For example, among words starting with "zu", the Mozilla dictionary only knows "zucchini", whereas the add-on dictionary also knows "Zulu" and others. I'd hate to think that we'd need to create 7265 bugs to add all the missing words.
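
To make this kind of comparison concrete, here is a minimal sketch in Python of how one might diff two Hunspell .dic files. It is only an illustration under several assumptions: each entry has the plain "stem/FLAGS" form with single-character flags, there are no escaped slashes or morphological fields, and the helper names (parse_dic, compare) and the sample entries and flag letters are hypothetical. It compares stems and flags only and does not expand affix rules from the .aff file, so it will not reproduce the word counts quoted above.

```python
def parse_dic(text: str) -> dict[str, str]:
    """Map each stem in a Hunspell .dic file to its flag string ('' if none).

    Assumes plain "stem/FLAGS" entries with single-character flags.
    """
    lines = text.splitlines()
    # The first line of a .dic file is normally an approximate entry count.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stem, _, flags = line.partition("/")
        entries[stem] = flags
    return entries


def compare(ours: dict[str, str], theirs: dict[str, str]):
    """Return (stems only in `theirs`, shared stems whose flag sets differ)."""
    missing = sorted(theirs.keys() - ours.keys())
    flag_diffs = sorted(
        stem for stem in ours.keys() & theirs.keys()
        if set(ours[stem]) != set(theirs[stem])
    )
    return missing, flag_diffs


# Hypothetical sample data, not taken from either real dictionary.
mozilla = parse_dic("3\nzucchini/MS\nfeasible/U\nfeasibly\n")
addon = parse_dic("4\nZulu/MS\nzucchini/SM\nfeasible/IU\nfeasibly/U\n")
missing, flag_diffs = compare(mozilla, addon)
print("missing stems:", missing)
print("stems with different flags:", flag_diffs)
```

In practice one would read the two real .dic files from disk and pass their contents to parse_dic. A faithful word count would additionally require expanding each stem against the matching .aff file.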

Is there a better way to do this? I think this is a tedious business and Mozilla should get out of it.

Jorg K.


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
