It's my pleasure to announce the beta release of my Google Summer of Code project, the completion of the Java port of the Apertium runtime. (Beta as in: we don't think there are any remaining *major* bugs, but we would like others to poke and prod at it to see if they can make any bugs show up.) A 20,000-sentence corpus was passed through the Java runtime using the en-eo (English to Esperanto) language pair to help test for any remaining showstopper bugs.
Jacob timed the same 100 sentences from a corpus through the en-eo direction (English to Esperanto) of the eo-en language pair in both the Java and C++ runtimes to test performance, and got the following results (apertium-j, the first one, is the Java runtime):

$ head -100 ../apertium-eo-en/corpa/en.crp.txt | time apertium-j -d ../apertium-eo-en en-eo
1.01user 0.11system 0:01.15elapsed 97%CPU (0avgtext+0avgdata 162000maxresident)

$ head -100 ../apertium-eo-en/corpa/en.crp.txt | time apertium -d ../apertium-eo-en en-eo
0.72user 0.06system 0:00.57elapsed 136%CPU (0avgtext+0avgdata 52800maxresident)

Translating the 100 sentences took 0.72 seconds of user CPU time with standard Apertium and 1.01 seconds with the Java port. By that measure the Java port takes only about 1.4 times as long as the standard C++ Apertium (about 2 times as long by elapsed time, 1.15 versus 0.57 seconds), which came as a surprise, as we had not put any extra effort into making our code fast or efficient.

Some caveats.

1) You'll need to download it from SVN trunk to test it. (https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/lttoolbox-java)

2) You'll need JDK 1.6 or later, because the Java runtime dynamically compiles the XML (.t*x) files to Java bytecode when the compiled versions don't already exist in either the same directory as the .bin files being used or the system temp directory. This also requires that lttoolbox.jar has been built, which you will need to do yourself.

3) Right now many exceptions will dump a traceback to your console. These are useful for tracking down bugs. ;)

-- Stephen
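P.S. On caveat 2, in case it helps to see why a JDK rather than a plain JRE is needed: the sketch below shows the general "compile only if the class file is missing" pattern using the standard javax.tools API that ships with JDK 1.6 and later. It is not the code from lttoolbox-java. The class name DynamicCompileSketch, the helper compileIfMissing, and the generated HypotheticalTransfer source are all made up for illustration, and the sketch uses java.nio.file for brevity, so as written it assumes a newer JDK than 1.6.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class DynamicCompileSketch {

    /**
     * Compiles sourceFile into outputDir unless className.class is already
     * present there, and returns the path of the class file.
     * (Hypothetical helper; not part of lttoolbox-java.)
     */
    static Path compileIfMissing(Path sourceFile, Path outputDir, String className)
            throws IOException {
        Path classFile = outputDir.resolve(className + ".class");
        if (Files.exists(classFile)) {
            return classFile; // already compiled on an earlier run, reuse it
        }

        // On a JRE without the compiler this returns null,
        // which is why a full JDK is required.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException(
                    "No system Java compiler found: a JDK is needed, not just a JRE");
        }

        // Passing null for the streams means System.in/out/err are used.
        int status = compiler.run(null, null, null,
                "-d", outputDir.toString(), sourceFile.toString());
        if (status != 0) {
            throw new IOException("Compilation failed with status " + status);
        }
        return classFile;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for Java source generated from a .t*x file (made up).
        Path dir = Files.createTempDirectory("transfer-sketch");
        Path src = dir.resolve("HypotheticalTransfer.java");
        String code = "public class HypotheticalTransfer {"
                + " public static String id() { return \"ok\"; } }";
        Files.write(src, code.getBytes(StandardCharsets.UTF_8));

        Path cls = compileIfMissing(src, dir, "HypotheticalTransfer");
        System.out.println("class file present: " + Files.exists(cls));
    }
}
```

Calling compileIfMissing a second time with the same arguments returns straight away without invoking the compiler, which is the point of looking for an existing class file first.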
