Hello,

I've been trying to develop HFST lexc and twol files for the Uzbek language by looking at how similar languages (Tatar, Kazakh, etc.) have done it. The rules for those languages are very complex, at least for someone who doesn't know where to start reading. I usually pick a word and then work backwards through the rule chain to make sense of it, but the chain gets so long that I start forgetting where it began. So copying existing solutions and modifying them didn't appeal to me. Instead, I started by adding simple rules and then expanding them for each use case. You can see my progress at [1] and [2]. (My previous work using the DIX format got so out of hand that I gave up developing it.)
As I keep adding and changing rules to fit new use cases, I realize that I may be breaking old ones. That's why I'd like to create test cases first, so that I can then change the rules without worrying that I've broken previous work. Are there any tools that you use for this?

[1] https://github.com/bmansurov/apertium-uzb/blob/master/apertium-uzb.uzb_Cyrl.lexc
[2] https://github.com/bmansurov/apertium-uzb/blob/master/apertium-uzb.uzb_Cyrl.twol

--
Bahodir
