It's my pleasure to announce the beta release of my Google Summer of
Code project, the completion of the Java port of the Apertium runtime.
By "beta" I mean that we don't think there are any remaining *major*
bugs, but we would like others to poke and prod at it to see if they
can make any show up. A 20,000-sentence corpus was passed through the
Java runtime using the en-eo (English to Esperanto) mode to help test
for any remaining showstopper bugs.

Jacob timed running the same 100 sentences from a corpus through the
apertium-eo-en language pair in the en-eo direction (English to
Esperanto) in both the Java and C++ runtimes to test performance, and
got the following results (apertium-j, the first one, is the Java
runtime):

$ head -100 ../apertium-eo-en/corpa/en.crp.txt | time apertium-j -d ../apertium-eo-en en-eo
 1.01user 0.11system 0:01.15elapsed 97%CPU (0avgtext+0avgdata 162000maxresident)

$ head -100 ../apertium-eo-en/corpa/en.crp.txt | time apertium -d ../apertium-eo-en en-eo
 0.72user 0.06system 0:00.57elapsed 136%CPU (0avgtext+0avgdata 52800maxresident)

Translating the 100 sentences took 0.72 seconds of user CPU time with
standard Apertium and 1.01 seconds with the Java port, so by that
measure the Java port is only about 1.4 times slower than the standard
C++ Apertium. In wall-clock terms (0.57 versus 1.15 seconds elapsed)
the gap is closer to a factor of 2. Either way this came as a
surprise, as we had not put forth any extra effort to make our code
fast or efficient.
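
For anyone who wants to repeat the comparison on their own corpus, here
is a minimal sketch of how the ratios can be computed from the two
summary lines printed by time. It assumes the default GNU time output
format shown above; the class and method names (TimeRatio, userSeconds,
elapsedSeconds) are made up for this example and are not part of
Apertium or lttoolbox-java.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimeRatio {
    // Assumes the default GNU time summary, e.g.
    // "1.01user 0.11system 0:01.15elapsed 97%CPU ..."
    // Elapsed time is [hours:]minutes:seconds.
    private static final Pattern SUMMARY = Pattern.compile(
        "([0-9.]+)user\\s+([0-9.]+)system\\s+(?:([0-9]+):)?([0-9]+):([0-9.]+)elapsed");

    private static Matcher match(String line) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a GNU time summary: " + line);
        }
        return m;
    }

    /** User CPU time in seconds. */
    public static double userSeconds(String line) {
        return Double.parseDouble(match(line).group(1));
    }

    /** Wall-clock (elapsed) time in seconds. */
    public static double elapsedSeconds(String line) {
        Matcher m = match(line);
        double hours = m.group(3) == null ? 0 : Double.parseDouble(m.group(3));
        double minutes = Double.parseDouble(m.group(4));
        double seconds = Double.parseDouble(m.group(5));
        return hours * 3600 + minutes * 60 + seconds;
    }

    public static void main(String[] args) {
        // The two summary lines from the runs above.
        String java = "1.01user 0.11system 0:01.15elapsed 97%CPU";
        String cpp = "0.72user 0.06system 0:00.57elapsed 136%CPU";

        System.out.println(String.format(Locale.ROOT, "user CPU ratio:   %.2f",
            userSeconds(java) / userSeconds(cpp)));
        System.out.println(String.format(Locale.ROOT, "wall-clock ratio: %.2f",
            elapsedSeconds(java) / elapsedSeconds(cpp)));
    }
}
```

A single run of 100 sentences is a small sample, so the numbers are
best read as a rough indication rather than a benchmark.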

Some caveats:
1) You'll need to check it out from SVN trunk to test it:
https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/lttoolbox-java
2) You'll need JDK 1.6 or later (a JDK, not just a JRE), as the Java
runtime dynamically compiles the XML (.t*x) files to Java bytecode if
the compiled classes don't already exist in either the same directory
as the .bin files being used or the system temp directory. You will
also need to build lttoolbox.jar yourself first.
3) Right now many exceptions will dump a stack trace to your console.
These are useful for tracking down bugs. ;)
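
To illustrate why caveat 2 asks for a full JDK, here is a minimal sketch
of compiling Java source at run time with the standard javax.tools API,
which has been available since Java 1.6. Whether lttoolbox-java uses
this exact API is an assumption on my part, not something stated above;
the class name DynamicCompileSketch, the generated class GeneratedRule,
and the temp directory prefix are all made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class DynamicCompileSketch {
    /** True when a compiler is available, i.e. we run on a JDK and not a bare JRE. */
    public static boolean hasSystemCompiler() {
        return ToolProvider.getSystemJavaCompiler() != null;
    }

    /**
     * Writes the given source to a fresh temp directory, compiles it there,
     * and returns the path of the resulting .class file.
     */
    public static Path compileInTempDir(String className, String source) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No Java compiler found; use a JDK rather than a JRE");
        }
        Path dir = Files.createTempDirectory("transfer-sketch");
        Path sourceFile = dir.resolve(className + ".java");
        Files.write(sourceFile, source.getBytes(StandardCharsets.UTF_8));

        // run() returns 0 on success; null streams mean stdin/stdout/stderr.
        int status = compiler.run(null, null, null, "-d", dir.toString(), sourceFile.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }
        return dir.resolve(className + ".class");
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Compiler available: " + hasSystemCompiler());
        // Hypothetical stand-in for code generated from a .t*x file.
        Path classFile = compileInTempDir("GeneratedRule",
            "public class GeneratedRule { public int id() { return 1; } }");
        System.out.println("Wrote " + classFile);
    }
}
```

On a bare JRE, getSystemJavaCompiler() returns null, which is the
situation the JDK requirement is meant to avoid.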

-- Stephen

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff