Hi,

thanks for your comments. See below.

Yours,
Per Tunedal

On Thu, Jan 3, 2013, at 12:29, Francis Tyers wrote:
> > Hi again,
> > I've learnt some Java by making tools for adding new entries to the
> > dictionaries for a language pair.
> > The intended audience is Swedes and Danes wanting to contribute to the
> > pair Swedish-Danish (sv-da).
> > --snip--
> > 1. Make a frequency list of collected words for some domain
> > (only new words will be in the list)
--snip--
> > todo:
> > - It would be very nice to exclude all forms of the words already
> > present,
> > from the frequency list (now only the lemmas are excluded).
--snip--
>
> Regarding the "excluding all forms":
>
> 1) You could just pass the frequency list through apertium-destxt |
> lt-proc | apertium-retxt before reading it with the program, and then
> only include lines with '*' in.

The intended audience will be Windows users. Using apertium is not an
option, I still haven't got a working Windows installation.

>
> 2) You could check the list after loading using some functionality from
> lttoolbox-java. There is probably an equivalent to biltransWithQueue or
> something like this.

I'm not familiar with biltransWithQueue: what is it supposed to do?
Yes, lttoolbox-java might do the trick. I'm not familiar with it, though.
I thought it might be possible to borrow some functionality from e.g.
apertium-caffeine. However, I don't want to use the released versions of
the dictionary files, but rather the most recent versions from SVN.

>
> Fran
>
> --snip--
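
For reference, the filtering step in suggestion 1 could look roughly like
this in Java. It is only a sketch under some assumptions: the frequency
list has already been analysed with lt-proc somewhere else, each line holds
a count followed by one analysed token, and unknown words carry the '*'
mark right after the slash. The class name, the method name and the sample
lines (including the simplified tags) are made up for illustration and are
not taken from any Apertium tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnknownWordFilter {

    // Keeps only the lines where the analyser marked the token as unknown.
    // Assumption: an unknown word appears as ^word/*word$ in the line.
    static List<String> keepUnknown(List<String> analysedLines) {
        List<String> unknown = new ArrayList<String>();
        for (String line : analysedLines) {
            if (line.contains("/*")) {
                unknown.add(line);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Invented sample data: "<count> <analysed token>".
        List<String> lines = Arrays.asList(
                "14 ^hund/hund<n>$",   // known word, will be dropped
                "7 ^blorf/*blorf$");   // unknown word, will be kept
        for (String line : keepUnknown(lines)) {
            System.out.println(line);
        }
    }
}
```

This only covers the last part of the pipeline; it does not solve the
problem of running the analyser on Windows.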
