On 2020-06-15 17:38, Hèctor Alòs i Font wrote:
Here are several practical examples. I tried to select them for their variety, so the result is more a wish list than something structured.

Let's begin with "je la baise". Depending on the context, this may be "I kiss her" or "I fuck her". The context can tell us whether we are in a formal or a colloquial register. Anaphora resolution can also help here: if the pronoun refers to "hand", it can only be "kiss"; if it refers to a person, the doubt persists.

Another kind of problem is the Arpitan words "chamô" ("camel"; plural "chamôs", "camels") and "chamôs" ("chamois"; unchanged in the plural). So yesterday, translating into French, I got chamois in a Bible text from Exodus xD I solved it by deciding in a CG rule that all "chamôs" (with nothing around it in the singular) are camels. (Similar cases in French: fil/fils, foi/fois, cour/cours.)

In French there are plenty of words with different meanings depending on the gender: livre, page, tour, etc. The problem is that the immediate surrounding context often does not disambiguate: des livres, les pages, de tour, etc. A similar but slightly different case is the word pairs homicide mf/homicide m, féminicide mf/féminicide m, parricide mf/parricide m, etc.: the one with the gender "mf" is a person and the other is the action.

Other problems come up in lexical selection. For instance, as a rule, the Catalan preposition "de" is translated as "de" in French, but if the following word is a material, "en" must be selected (de fusta > en bois). So in the Catalan2French lrx file we have a list of materials, just as we have a list of countries, a list of musical instruments, a list of animals, etc. I dream of a monolingual dictionary where we could get this kind of information. It makes no sense to maintain these lists separately for the many language pairs that use Catalan. This information should be in apertium-cat and not in every apertium-cat-xxx lrx file.

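
To make the "de" > "en" idea concrete, here is a minimal Python sketch of how a selection step might consult semantic labels stored once on the monolingual side. It is only an illustration: the dictionary layout, the label names and the function `select_de` are all hypothetical, and this is not how lrx rules or apertium-cat are actually implemented.

```python
# Hypothetical monolingual lexicon for Catalan: lemma -> set of semantic labels.
# The layout and the label names are assumptions made up for this sketch.
CAT_SEMANTIC_LABELS = {
    "fusta": {"material"},
    "ferro": {"material"},
    "França": {"country"},
    "guitarra": {"musical_instrument"},
}


def select_de(next_lemma, labels=CAT_SEMANTIC_LABELS):
    """Pick the French translation of Catalan "de" from the next word's labels."""
    # "en" is selected only when the following word is labelled as a material.
    if "material" in labels.get(next_lemma, set()):
        return "en"
    # Default translation, also used for words missing from the lexicon.
    return "de"


print(select_de("fusta"))   # followed by a material
print(select_de("França"))  # followed by a country
```

With the labels kept in one place, every pair that has Catalan as its source language could reuse the same lookup instead of carrying its own copy of the list.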
Moreover, if we had words not only with different kinds of semantic labels, but also marked as synonyms, maybe it would be possible to give a translation using a word labeled as a synonym (if it has a translation) instead of "unknown".
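
A rough Python sketch of that synonym fallback, under the assumption that the monolingual data exposes a simple lemma-to-synonyms mapping. The names `BIDIX`, `SYNONYMS` and `translate` and the sample entries are invented for illustration and do not correspond to any existing Apertium file or API.

```python
# Hypothetical data for this sketch only.
BIDIX = {"cotxe": "voiture"}          # bilingual dictionary: Catalan -> French
SYNONYMS = {"automòbil": ["cotxe"]}   # monolingual synonym links for Catalan


def translate(lemma, bidix=BIDIX, synonyms=SYNONYMS):
    """Translate a lemma, falling back to a synonym that has a translation."""
    if lemma in bidix:
        return bidix[lemma]
    # No direct entry: try each synonym in order and use the first one
    # that the bilingual dictionary can translate.
    for synonym in synonyms.get(lemma, []):
        if synonym in bidix:
            return bidix[synonym]
    # Still unknown: mark the word with a leading asterisk.
    return "*" + lemma


print(translate("automòbil"))
```

The fallback is tried only after the direct lookup fails, so existing translations are never overridden by a synonym.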

These are excellent examples. I'm just about to go out, but I will address them when I get back. Thanks for the ideas. Note that my suggestion was to include this information in the monolingual packages.

Fran
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
