Re: [Apertium-stuff] morph segmentation with Apertium

Antonio Toral Thu, 07 Mar 2019 01:24:01 -0800

Hi Jonathan, Fran,

Thanks for looking at this, I really appreciate it :)

Of those two options, I think the first would be better. If I got it right, the 1st is pure segmentation while the 2nd inserts an additional д.

Segmenting поезде as поез>де (1st option) would allow us to recover the original word easily from the segmented version. Segmenting as поезд>де (2nd option) would not, as we may wrongly recover the original word as поездде.
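To make the round-trip point concrete, here is a minimal sketch in Python. It assumes ">" is the only boundary marker and that the original word is recovered simply by deleting it; join_segments is a hypothetical helper name, not part of Apertium.

```python
def join_segments(segmented: str, boundary: str = ">") -> str:
    """Recover a word by deleting the morph boundary markers (assumed to be '>')."""
    return segmented.replace(boundary, "")


original = "поезде"

# 1st option: pure segmentation of the surface string, so the round trip is exact.
print(join_segments("поез>де") == original)

# 2nd option: the lemma keeps its final д, so joining yields поездде instead.
print(join_segments("поезд>де") == original)
```

With the 1st option the comparison holds; with the 2nd it does not, which is the recovery problem described above.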



best,

a.

On 06/03/2019 23:00, Francis Tyers wrote:

El 2019-03-06 21:51, Jonathan Washington escribió:

Hi Antonio,

I have something mostly working, but have a few questions about what
specifically you're after.

I guess to start things out, for
поезд<n><loc>:поезд>{D}{A}:поезде, do you prefer
поез>де or поезд>де, or something else?


To clarify:

поезд "train" is the lemma,
-{D}{A} is the morpheme for locative,
-де is the morph
поез is a substring

There is a rule that the final -д of a word deletes in certain cases
if the following morpheme starts with {D} (or something like that).

So the "exact string" segmentation would give you a non-word (поез). If
you want each part to be an actual word you'd need поезд, but then
removing the morpheme boundaries would no longer give you an exact
match with the original string.
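The two choices can be sketched as follows. This is only an illustration in Python, not how the transducer does it: it assumes a single suffix whose surface morph is already known, and segment_both_ways is a hypothetical name.

```python
def segment_both_ways(lemma: str, morph: str, surface: str, boundary: str = ">"):
    """Return (exact-string segmentation, lemma-based segmentation).

    Assumes one suffix whose surface morph is a literal suffix of the surface form.
    """
    if not surface.endswith(morph):
        raise ValueError("surface form does not end with the given morph")
    # Exact string: whatever precedes the morph in the surface form (may be a non-word).
    stem = surface[: len(surface) - len(morph)]
    return stem + boundary + morph, lemma + boundary + morph


exact, lemma_based = segment_both_ways("поезд", "де", "поезде")
print(exact)        # поез>де
print(lemma_based)  # поезд>де
```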

Fran



_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
