> Cool, I've been working with some language pairs I know, and I have a few
> questions on CLI usage and stuff:
>
> Commonly when I test things I get like:
>
> > Corpus 1 of 5: deu-fin-pending
> > 11/27 (40.74%) tests pass (11/11 (100.0%) match gold)
>
> so I start up cli and see:
>
> deu-fin 1 of 1
> INPUT:
> Haus
> EXPECTED OUTPUT:
> KOTI
> ACTUAL OUTPUT:
> koti
> IDEAL OUTPUTS:
> talo
>
> If it turns out to be a bug in the (dix/t*x) code, I'm not sure what
> command to use to skip it and see the next error?
There is now 'skip' (or 'k' for short) for that.
>
> Another question: in a lot of the expected files there seem to be
> all-caps words for fin-* pairs, and I am not sure how this has happened.
> I am guessing my apertium is older and some ICU changes have affected
> the output, perhaps in some code I've copy-pasted into all fin-* pairs
> without understanding it.
>
For that example in particular (and likely others as well), the t1x
output code says
<chunk name="NP" case="caseFirstWord">
which gives the chunk an all-caps lemma (caseFirstWord is a
non-existent variable, so it has no effect). The default behavior of
postchunk is then to copy the chunk's case to the words, so
^NP<N><FOOFOO>{^koti<n><sg><nom>$}$
becomes
^KOTI<n><sg><nom>$^.<punct>$
I think this might be a case of inadvertently relying on a bug in
the old transfer case-handling functions, and I'm not quite sure what
the appropriate solution is.
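
As a rough illustration of the case-copying step described above, here is a
minimal Python sketch. It is only a simplified model of the behavior, not the
actual postchunk code; case_of and copy_case are hypothetical helper names
made up for this example.

```python
def case_of(lemma):
    """Classify a lemma's case pattern (simplified model, an assumption):
    'AA' = all caps, 'Aa' = initial capital, 'aa' = anything else."""
    letters = [c for c in lemma if c.isalpha()]
    if len(letters) > 1 and all(c.isupper() for c in letters):
        return "AA"
    if lemma[:1].isupper():
        return "Aa"
    return "aa"


def copy_case(chunk_lemma, word_lemmas):
    """Copy the case pattern of the chunk lemma onto the words inside it."""
    pattern = case_of(chunk_lemma)
    if pattern == "AA":
        # All-caps chunk lemma: every word is upper-cased.
        return [w.upper() for w in word_lemmas]
    if pattern == "Aa" and word_lemmas:
        # Initial capital: only the first word gets a capital letter.
        first = word_lemmas[0]
        return [first[:1].upper() + first[1:]] + list(word_lemmas[1:])
    # Lowercase chunk lemma: words are left as they are.
    return list(word_lemmas)


print(copy_case("NP", ["koti"]))  # all-caps chunk lemma
print(copy_case("np", ["koti"]))  # lowercase chunk lemma
```

In this model the all-caps chunk lemma "NP" turns "koti" into "KOTI", while a
lowercase chunk lemma leaves the word untouched.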
Daniel
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff