> Hi again,
> I've learnt some Java by making tools for adding new entries to the
> dictionaries for a language pair.
> The intended audience is Swedes and Danes wanting to contribute to the
> pair Swedish-Danish (sv-da).
>
> The text and comments etc are mainly in Swedish, but I think anyone can
> figure out how to use the programs.
> You'll find the programs here:
> http://www.tunedal.nu/download/AddToDix/
>
> The idea is to support a work process:
>
> 1. Make a frequency list of collected words for some domain
>    (only new words will be in the list)
> 2. Make a monodix-file for the source language
>    (only new words can be added)
> 3. Add the translation to a bidix-file (the monodix is read by the
>    program)
> 4. Send the created files to the language developer for the pair.
> 5. Let the developer check the contribution and add the new words.
>
> A side effect is that it will be unnecessary to sort the dictionaries!
> You can be sure that you don't add some word already present and new
> words can be pasted anywhere in the dictionary files.
>
> The programs are in an early version and might contain errors, but they
> can produce useful output. I have skipped all the tricky cases.
> I apologize for the messy code, I began writing it before I learnt
> object orientation.
>
> todo:
> - It would be very nice to exclude all forms of the words already
>   present, from the frequency list (now only the lemmas are excluded).
> - It would be nice to have the dictionaries updated from the repository,
>   rather than distributed with the programs
> - Internationalisation: translation files for the programs
> - An installer
> - A nice GUI
>
> I appreciate any comments, suggestions and help with this.
>
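For readers following along, the "only new words will be in the list" part of step 1 might look roughly like the sketch below. This is not the code from AddToDix; the class and method names are made up for illustration, and it assumes the text has already been split into tokens and the lemmas of the existing dictionary have been loaded into a set. Like the current tools, it excludes lemmas only, not inflected forms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of AddToDix.
class NewWordFrequency {

    // Count every token that is not already a known lemma.
    static Map<String, Integer> countNewWords(List<String> tokens, Set<String> knownLemmas) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            if (token.isEmpty() || knownLemmas.contains(token)) {
                continue; // skip empty tokens and words already in the dictionary
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Order entries by descending count, then alphabetically for ties.
    static List<Map.Entry<String, Integer>> sortedByFrequency(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Made-up sample data: "hus" stands in for a lemma already in the monodix.
        List<String> tokens = Arrays.asList("hus", "bil", "bil", "katt");
        Set<String> known = new HashSet<>(Arrays.asList("hus"));
        for (Map.Entry<String, Integer> e : sortedByFrequency(countNewWords(tokens, known))) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

Because the lookup is an exact match against lemmas, an inflected form of a known word would still end up in the list, which is the limitation the first todo item describes.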
Regarding "excluding all forms", there are two options:

1) You could pass the frequency list through

   apertium-destxt | lt-proc | apertium-retxt

   before reading it with the program, and then keep only the lines
   that contain '*'.

2) You could check the list after loading it, using some functionality
   from lttoolbox-java. There is probably an equivalent to
   biltransWithQueue or something like it.

Fran

_______________________________________________
Apertium-stuff mailing list
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
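A minimal sketch of the filtering step in option 1, assuming the analysed frequency list arrives on standard input with one entry per line and that unknown words are the ones marked with '*', as described in the reply. The class and method names are hypothetical, and this only does the "keep lines with '*'" part; running the Apertium tools themselves is left to the shell pipeline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of AddToDix or lttoolbox-java.
class UnknownWordFilter {

    // Keep only the lines in which the analyser left a '*' marker,
    // i.e. lines containing a form it did not recognise.
    static List<String> keepUnknown(List<String> analysedLines) {
        List<String> unknown = new ArrayList<>();
        for (String line : analysedLines) {
            if (line.indexOf('*') >= 0) {
                unknown.add(line);
            }
        }
        return unknown;
    }

    public static void main(String[] args) throws IOException {
        // Read the analysed list from stdin so the class can sit at the end of a pipe.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }
        for (String kept : keepUnknown(lines)) {
            System.out.println(kept);
        }
    }
}
```

Note that a line whose text contains a literal '*' would also be kept, so the input is assumed not to contain asterisks of its own.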
