On 2018-01-08 11:31, Marc Riera Irigoyen wrote:
Hello all,

I've noticed some weird behaviour when generating English output in
the English-Catalan pair. When translating "àrtic" to English, for
example, the adjective "Arctic" is found, but right at the end of the
pipeline, in the English generator, "Arctic<adj>" becomes
Arctic/Arctic and is sent as is in the output text.

The entry in the English monodix is "arctic" without caps, but this was
the case in the old en-ca pair and it worked. Moreover, I've altered
the bidix to simulate the same situation when generating to Catalan
and it works as expected, so it looks like it's specific to English.

Does anyone know what's going on here?


<e lm="Arctic"> <par n="Aa"/><i>rctic</i><par n="expensive__adj"/></e>
<e lm="arctic">          <i>arctic</i><par n="expensive__adj"/></e>

The English monodix has these two entries for the same adjective. My proposal is to keep the first one and delete the second one.

Adjectives and nouns that must always be capitalised should use the first style of entry. This avoids generation issues when case changes made in transfer cause an open-class word to come out in the wrong orthographic case.
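
To illustrate why the duplicate entry matters, here is a toy sketch in Python. It is not how lt-proc is implemented; it only assumes that the generator matches lemmas without regard to case and then copies the capitalisation of the input lemma onto the surface form. Under that assumption, both entries match "Arctic<adj>" and both produce the same form, which would explain the "Arctic/Arctic" output. The names generate and apply_case and the tuple layout are hypothetical.

```python
# Toy model only: NOT the real lt-proc implementation.
# Each generator entry is a (lemma, tags, surface form) tuple.

def apply_case(pattern: str, word: str) -> str:
    """Copy the first-letter capitalisation of `pattern` onto `word`."""
    if pattern[:1].isupper():
        return word[:1].upper() + word[1:]
    return word

def generate(entries, lemma, tags):
    """Return every matching surface form, joined with '/'."""
    forms = []
    for entry_lemma, entry_tags, surface in entries:
        # Assumption: lemma lookup ignores case.
        if entry_lemma.lower() == lemma.lower() and entry_tags == tags:
            forms.append(apply_case(lemma, surface))
    return "/".join(forms)

# The two monodix entries from above, reduced to their adjective reading.
both = [("Arctic", "<adj>", "Arctic"), ("arctic", "<adj>", "arctic")]
only_first = both[:1]

print(generate(both, "Arctic", "<adj>"))        # Arctic/Arctic
print(generate(only_first, "Arctic", "<adj>"))  # Arctic
```

With only the first entry left, the toy generator has a single match, so the ambiguity disappears.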

Fran

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
