All of this can be worked out during Incubation, and I think we have the right folks here who can help get it set up.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Henry Saputra <henry.sapu...@gmail.com>
Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>
Date: Wednesday, January 20, 2016 at 10:33 AM
To: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: Re: [DISCUSS] Apache Joshua Incubator Proposal - Machine Translation Toolkit

>This is a bit tricky and I suppose we could leave behind the GPL/LGPL
>dependencies that are used for model building when generating releases
>under ASF license.
>I hope that will work.
>
>- Henry
>
>On Wed, Jan 20, 2016 at 7:51 AM, Matt Post <p...@cs.jhu.edu> wrote:
>
>> I imagine so. Model building is very technical and resource intensive and
>> something only a few people will want or need to do. Working on and running
>> the decoder (#2) should be the much more common use case, and with the
>> (included, Apache-licensed) Berkeley LM, that can be done without the need
>> for any external dependencies.
>>
>>
>> > On Jan 20, 2016, at 10:46 AM, Alex Harui <aha...@adobe.com> wrote:
>> >
>> > External is good news. I'm not sure how much leeway there is in the
>> > following quote from [1], but what percentage of your users are currently
>> > using an all-ASF-compatible set of projects?
>> >
>> > The question to ask yourself in this situation is:
>> > * "Will the majority of users want to use my
>> > product without adding the optional components?"
>> >
>> > -Alex
>> >
>> > [1] http://www.apache.org/legal/resolved.html
>> >
>> >
>> > On 1/20/16, 7:17 AM, "Matt Post" <p...@cs.jhu.edu> wrote:
>> >
>> >> The dependencies can be split into two kinds: ones required for building
>> >> new models, and ones needed by the decoder to translate new sentences
>> >> with a pre-built model (i.e., black-box translation with the language
>> >> packs).
>> >>
>> >> 1. For building new models, you need a way to align the words between
>> >> sentences in parallel text. Both the aligners used by Joshua (GIZA++ and
>> >> the Berkeley aligner) are GPL of some form. These can be implemented as
>> >> external dependencies, or can be replaced with another aligner, like
>> >> fast_align (https://github.com/clab/fast_align), which is
>> >> Apache-licensed. There are many other options, in fact. So this should
>> >> not be a worry.
>> >>
>> >> 2. For doing black-box translation, one needs to represent the language
>> >> model, which is very large. The best tool for this is KenLM
>> >> (github.com/kpu/kenlm), which is LGPL 2.1. There is also BerkeleyLM,
>> >> which is just as good for practical purposes and is Apache-licensed.
>> >> KenLM is C++ and is loaded via the JNI, whereas BerkeleyLM is written in
>> >> Java. I have moved to including BerkeleyLM in language packs, because I
>> >> can then include the Joshua-runtime, and people can translate without
>> >> even having to compile anything.
>> >>
>> >> So in short, there are no hard dependencies on unfavorably-licensed
>> >> external projects.
>> >>
>> >> matt
>> >>
>> >>
>> >>> On Jan 20, 2016, at 10:08 AM, Mattmann, Chris A (3980)
>> >>> <chris.a.mattm...@jpl.nasa.gov> wrote:
>> >>>
>> >>> Hey Hen,
>> >>>
>> >>> Matt Post who I believe is monitoring this list and who has
>> >>> been one of the key Joshua developers and I have discussed this
>> >>> and we believe that potentially GPL/LGPL dependencies can:
>> >>>
>> >>> 1. be replaced with category-A or category-B alternatives. Matt
>> >>> mentioned one already to me which has slipped my mind.
>> >>> 2. be made in such a way that they are external tools and the
>> >>> bindings exist in Joshua to call those external tools (aka runtime
>> >>> deps akin to depending on a C compiler, etc.)
>> >>>
>> >>> Cheers,
>> >>> Chris
>> >>>
>> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>> Chris Mattmann, Ph.D.
>> >>> Chief Architect
>> >>> Instrument Software and Science Data Systems Section (398)
>> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> >>> Office: 168-519, Mailstop: 168-527
>> >>> Email: chris.a.mattm...@nasa.gov
>> >>> WWW: http://sunset.usc.edu/~mattmann/
>> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>> Adjunct Associate Professor, Computer Science Department
>> >>> University of Southern California, Los Angeles, CA 90089 USA
>> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org