On Tue, 17 Jan 2017 19:48:31 +0300,
mansur <[email protected]> wrote:

> Hello, Tommi!
> 
> 1) Unfortunately I couldn't get this script to work, because I am not
> so good at Apertium and HFST commands and their syntax :)

OK, there was a bug in the script as well. I have now tested it on the
real thing, and some example output is here:
<http://paste2.org/7YjOayfI>. (For the archives, the script I used here
is:

$ for lemma in абзый абруй абсолют абый ; do
    echo "$lemma"
    echo "$lemma" | sed -e 's/./\0 /g' | sed -e 's/$/ %<n%> ?*/' |
      hfst-regexp2fst -o temp.hfst
    hfst-compose temp.hfst .deps/tat.RL.hfst -o gen.hfst
    hfst-fst2strings gen.hfst | cut -f 2
  done

just in case paste2 disappears and someone finds this message through a
web search or something.)
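
To make the sed step of that loop easier to follow, here is a minimal
sketch of just the part that turns a lemma into the regular expression
text handed to hfst-regexp2fst. The function name lemma_to_regex is a
made-up helper for illustration and is not part of HFST or Apertium. It
uses '&' instead of '\0' for the whole match, which should behave the
same in GNU sed and is the portable spelling. I assume a UTF-8 locale
when it is used on Cyrillic lemmas, so that sed treats each letter as
one character.

```shell
#!/bin/sh
# Hypothetical helper: build the regex text for one lemma.
lemma_to_regex() {
    # 1) Put a space after every character, so each letter is a
    #    separate symbol in the regular expression.
    # 2) Append the noun tag and "?*" (any sequence of further symbols).
    printf '%s\n' "$1" | sed -e 's/./& /g' -e 's/$/ %<n%> ?*/'
}

# ASCII example so the result does not depend on the locale.
lemma_to_regex "abc"
```

For "abc" this prints "a b c  %<n%> ?*" (with two spaces before the
tag); the loop pipes the corresponding text for each Tatar lemma into
hfst-regexp2fst.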

> 2) Terabytes of word-forms? Wow, that is quite much :)

It is indeed, and it can still sometimes be used as an argument against
the word-form list / database approach to morphology. By the way, the
experiment above generated 9 megabytes of word-forms from 4 noun
lexemes. Maybe they aren't what would generally be wanted for "all
word-forms", but it is likely that apertium-tat won't be much worse for
the full lexicon in the end.



-- 
Doktor Tommi A Pirinen, Computational Linguist,
<https://flammie.github.io/purplemonkeydishwasher/>, Universität
Hamburg, Hamburger Zentrum für Sprachkorpora <http://hzsk.de>. CLARIN-D
Entwickler.  President of ACL SIGUR SIG for Uralic languages
<http://gtweb.uit.no/sigur/>.
I tend to follow inline-posting style in desktop e-mail messages.



_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
