Hi all, My proposed GSoC project is "Robust tokenization in lttoolbox"[1]. Over the past few days I have studied the source files of *lt-proc*, the lexical analysis tool of lttoolbox, in more detail. I now have a preliminary idea of how lttoolbox implements tokenization.
My idea is written up in the README of my repo[2]. I would be very grateful for any feedback.

[1] http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Robust_tokenisation
[2] https://github.com/GavinWz/Apertium/blob/master/README.md#my-idea-of-tokenization-flow-path
_______________________________________________ Apertium-stuff mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/apertium-stuff
