(Re-sent because the CC address in my last post was incorrect.)

Hi NOKUBI,

Thank you for working on this. Although it may sound boring or even
frustrating, the data used to train machine learning models, as well as
pre-trained models, should be handled carefully.
Your copyright file is incomplete:

  https://bitbucket.org/tsuchm/pkg-sentencepiece/src/master/debian/copyright

Some files in the data/ directory are not Apache-2.0 licensed:

  https://github.com/google/sentencepiece/blob/master/data/botchan.txt#L1-L12
  https://github.com/google/sentencepiece/blob/master/data/Scripts.txt#L1-L13

I am also wondering whether the Japanese poetry book is free (I don't
speak Japanese, but from the "Chinese characters" in the text I guess
it's a poetry book):

  https://raw.githubusercontent.com/google/sentencepiece/master/data/wagahaiwa_nekodearu.txt

Its publisher is 青空文庫 (Aozora Bunko). Please confirm the copyright
information for this book and its DFSG compliance.

When a source package contains DFSG-incompatible material, a common
practice in Debian is to strip those components from the original
tarball and append +dfsg to the upstream version string. However,
data-driven applications could become useless once the training data is
removed... This is an awkward conflict in practice between the free
software world and the academic machine learning (computational
linguistics) community.

Besides, the packaging of TensorFlow is stalled, as it's difficult to
tame its 4.5 million lines of code without a usable build system. Users
(including myself) have long had to depend (somewhat) on third-party
ecosystems, and will until the day Google rethinks distribution
integration (basically hopeless).

Apart from the science team, you are also welcome to join the deep
learning team (it's an informal team):

  https://salsa.debian.org/deeplearning-team

On 2019-10-03 02:37, NOKUBI Takatsugu wrote:
> On Wed, 02 Oct 2019 14:52:23 +0900,
> Kentaro Hayashi wrote:
>> * Vcs : https://salsa.debian.org/debian/sentencepiece
>
> It contains tensorflow binding, so I think it will be good to belong
> with Debian Science Team.
>
> I, hayashi-san, and tsuchiya-san sent requests to join the team.
> tsuchiya-san also maintained it himself, so I'll merge them into
> the salsa repository.
>
> https://bitbucket.org/tsuchm/pkg-sentencepiece/src/master/
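
A rough illustration of the +dfsg repacking mentioned above, as a minimal
Python sketch. The function names, the excluded path, and the version
number are hypothetical placeholders, not taken from the sentencepiece
package. In practice this is usually done with the Files-Excluded field
in debian/copyright, which uscan honours when repacking, rather than
with hand-written code.

```python
import io
import tarfile


def dfsg_version(upstream_version):
    """Append the +dfsg suffix to an upstream version, unless already present."""
    if "+dfsg" in upstream_version:
        return upstream_version
    return upstream_version + "+dfsg"


def strip_members(src, dst, excluded):
    """Copy members of tarball `src` into `dst`, skipping excluded paths.

    `excluded` holds paths relative to the tarball's top-level directory
    (hypothetical examples only). Returns the list of removed paths.
    """
    removed = []
    for member in src.getmembers():
        # Drop the top-level directory, e.g. "pkg-1.2.3/data/x" -> "data/x".
        relative = member.name.split("/", 1)[1] if "/" in member.name else ""
        if relative in excluded:
            removed.append(relative)
            continue
        fileobj = src.extractfile(member) if member.isfile() else None
        dst.addfile(member, fileobj)
    return removed


def make_demo_tarball():
    """Build a tiny in-memory tarball with placeholder file names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in [("pkg-1.2.3/data/nonfree.txt", b"x"),
                           ("pkg-1.2.3/src/main.c", b"yy")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    repacked = io.BytesIO()
    with tarfile.open(fileobj=make_demo_tarball(), mode="r") as src, \
            tarfile.open(fileobj=repacked, mode="w") as dst:
        print("removed:", strip_members(src, dst, {"data/nonfree.txt"}))
    print("new version:", dfsg_version("1.2.3"))
```

The sketch only shows the mechanics. Whether a given file actually has
to be removed still depends on the licence review discussed above.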