> Hi again,
> I've learnt some Java by making tools for adding new entries to the
> dictionaries for a language pair.
> The intended audience is Swedes and Danes wanting to contribute to the
> pair Swedish-Danish (sv-da).
>
> The text and comments etc are mainly in Swedish, but I think anyone can
> figure out how to use the programs.
> You'll find the programs here:
> http://www.tunedal.nu/download/AddToDix/
>
> The idea is to support the following workflow:
>
> 1. Make a frequency list of collected words for some domain
> (only new words will be in the list)
> 2. Make a monodix-file for the source language
> (only new words can be added)
> 3. Add the translation to a bidix-file (the monodix is read by the
> program)
> 4. Send the created files to the language developer for the pair.
> 5. Let the developer check the contribution and add the new words.
>
> A side effect is that it becomes unnecessary to sort the dictionaries!
> You can be sure that you are not adding a word that is already present,
> and new words can be pasted anywhere in the dictionary files.
>
> The programs are at an early stage and might contain errors, but they
> can produce useful output. I have skipped all the tricky cases.
> I apologize for the messy code; I began writing it before I learnt
> object orientation.
>
> todo:
> - It would be very nice to exclude all forms of the words already
> present from the frequency list (now only the lemmas are excluded).
> - It would be nice to have the dictionaries updated from the repository,
> rather than distributed with the programs
> - Internationalisation: translation files for the programs
> - An installer
> - A nice GUI
>
> I appreciate any comments, suggestions and help with this.
>

Regarding the "excluding all forms":

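1) You could just pass the frequency list through apertium-destxt |
lt-proc | apertium-retxt before reading it with the program (lt-proc
needs the compiled analyser for the source language as its argument),
and then keep only the lines containing '*', which is how lt-proc marks
words it cannot analyse.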

2) You could check the list after loading it, using some functionality
from lttoolbox-java. There is probably an equivalent to
biltransWithQueue, or something like it.
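
For option 1, the filtering step on the Java side could look roughly
like the sketch below. It is only an illustration, not code from
AddToDix: the class name UnknownWordFilter and the method keepUnknown
are made up, and it assumes that each line of the analysed frequency
list holds one word and that lt-proc writes an unknown word as
^form/*form$, so the test looks for "/*" rather than a bare '*'.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps only the lines that the analyser could not analyse.
class UnknownWordFilter {

    // Assumption: lt-proc writes an unknown word as ^form/*form$,
    // so "/*" only appears on lines that contain an unknown word.
    static final String UNKNOWN_MARKER = "/*";

    // Returns the lines that contain the unknown-word marker, in input order.
    static List<String> keepUnknown(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (line.contains(UNKNOWN_MARKER)) {
                result.add(line);
            }
        }
        return result;
    }

    // Reads the analysed frequency list from standard input
    // and prints only the lines with unknown words.
    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        for (String line : keepUnknown(lines)) {
            System.out.println(line);
        }
    }
}
```

The output of the pipeline would be piped into this class. Since the
surviving lines still carry the analyser's markup, the surface form
would have to be extracted again before it is shown to the user.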

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
