Package: wnpp
Severity: wishlist
Owner: proycon <proy...@anaproy.nl>

* Package name    : colibri-core
  Version         : 2.1.3
  Upstream Author : Maarten van Gompel <proy...@anaproy.nl>
* URL             : https://proycon.github.io/colibri-core/
* License         : GPL-3
  Programming Lang: C++, Python
  Description     : Colibri Core is a Natural Language Processing tool and library to quickly and efficiently count and extract patterns from large corpus data.

Colibri Core is software consisting of command line tools as well as
programming libraries for C++ and Python to quickly and efficiently count and
extract patterns from large corpus data, to extract various statistics on the
extracted patterns, and to compute relations between the extracted patterns.
The employed notion of pattern or construction encompasses n-grams, skipgrams,
and flexgrams. Though n-gram extraction may seem fairly trivial at first,
simple approaches place an unnecessarily high demand on memory resources,
which often becomes prohibitive when they are applied to large corpora.
Colibri Core tries to minimise these time and space requirements in several
ways, and provides a foundation for other tools to build on.

The package is to be maintained in the Debian Science packaging team.
Hopefully it can be sponsored by Joost van Baal-Ilić? Extra help is always
welcome.

-- 
Maarten van Gompel
Centre for Language Studies
Radboud Universiteit Nijmegen

proy...@anaproy.nl
http://proycon.anaproy.nl
http://github.com/proycon

GnuPG key: 0x1A31555C
XMPP: proy...@anaproy.nl
Telegram: proycon
IRC: proycon (freenode)
Twitter: https://twitter.com/proycon
Bitcoin: 1BRptZsKQtqRGSZ5qKbX2azbfiygHxJPsd