Did you find any solution for the merging lines? I ran into this problem translating the Europarl corpus (spa>cat). It didn't happen before, but it happens now with the current nightly version. It turns out that the sentences where the merging occurs (or starts?) contain a 'soft hyphen' character (U+00AD). After removing this character (in fact, it should be replaced by an em dash), there is no merging.
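
For anyone who wants to check their own input, here is a minimal Python sketch of the workaround described above: it reports which lines contain U+00AD and strips or replaces the character before the text is sent to the translator. The function names and the sample sentences are made up for illustration; this is not part of Apertium, and it only preprocesses the corpus rather than fixing the underlying behavior.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD, invisible in most editors


def find_soft_hyphen_lines(lines):
    """Return the 1-based numbers of the lines that contain U+00AD."""
    return [n for n, line in enumerate(lines, start=1) if SOFT_HYPHEN in line]


def replace_soft_hyphens(text, replacement=""):
    """Replace every U+00AD in text; the default simply deletes it."""
    return text.replace(SOFT_HYPHEN, replacement)


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from Europarl.
    sample = ["Primera frase.", "Segunda\u00adfrase.", "Tercera frase."]
    print(find_soft_hyphen_lines(sample))
    print([replace_soft_hyphens(line) for line in sample])
```

Passing a second argument, for example replace_soft_hyphens(line, "\u2014"), substitutes a dash instead of deleting the character.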

Another change in behavior I have noticed concerns words containing characters that are not in the language's alphabet.

Before it was: Kwaśniewski > *Kwaś*niewski
Now it is: Kwaśniewski > *Kwaśniewski

The new behavior is preferable.

Jaume Ortolà
