Hi Jonathan, Fran,

Thanks for looking at this, I really appreciate it :)
Of those two options, I think the first would be better. If I understood correctly, the first is pure segmentation, while the second inserts an additional д.
Segmenting поезде as поез>де (1st option) would let us recover the original word easily from the segmented version, simply by removing the boundary. Segmenting it as поезд>де (2nd option) would not, since we might wrongly recover the original word as поездде.
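To make the point concrete, here is a minimal Python sketch of the round trip I have in mind. The function name `desegment` and the assumption that ">" is the only boundary symbol are mine, just for illustration; this is not how the Apertium tools do it.

```python
# Hypothetical helper: recover the surface form by deleting the
# morpheme boundary marker from a segmented string.
BOUNDARY = ">"  # assumed to be the only boundary symbol


def desegment(segmented: str) -> str:
    """Remove every boundary marker and return the remaining string."""
    return segmented.replace(BOUNDARY, "")


surface = "поезде"
option_1 = "поез>де"   # pure segmentation of the surface form
option_2 = "поезд>де"  # keeps the full lemma, so it adds an extra д

# Option 1 round-trips to the surface form; option 2 does not.
print(desegment(option_1) == surface)
print(desegment(option_2) == surface)
```

With the first option the comparison holds; with the second, removing the boundary yields поездде instead of поезде.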
best,
a.

On 06/03/2019 23:00, Francis Tyers wrote:
On 2019-03-06 21:51, Jonathan Washington wrote:

Hi Antonio,

I have something mostly working, but have a few questions about what specifically you're after. I guess to start things out, for поезд<n><loc>:поезд>{D}{A}:поезде, do you prefer поез>де or поезд>де, or something else?

To clarify:

поезд "train" is the lemma
-{D}{A} is the morpheme for locative
-де is the morph
поез is a substring

There is a rule that the final -д of a word deletes in certain cases if the following morpheme starts with {D} (or something like that). So the "exact string" would give you a non-word (поез), but if you want each part to be an actual word you'd need поезд, and then you wouldn't get an exact string match when removing the morpheme boundaries.

Fran
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
