Hi,

On Tue, Aug 11, 2009 at 22:19, Mark Bennett <mbenn...@ideaeng.com> wrote:

> Carrot2 has several pluggable algorithms to choose from, though I have no
> evidence that they're "better" than Lucene's.  Where TF/IDF is sort of a
> one step algebraic calculation, some clustering algorithms use iterative
> approaches, etc.


I'm not sure if I completely follow the way in which you'd like to use
Carrot2 for scoring -- would you cluster the whole index? Carrot2 was
designed to be a post-retrieval clustering algorithm and optimized to
cluster small sets of documents (up to ~1000) in real time. All processing
is performed in-memory, which limits Carrot2's applicability to really large
sets of documents.
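
To make the post-retrieval idea concrete, here is a rough Python sketch of the pattern only. It does not use Carrot2's API or any of its algorithms: cluster_hits, MAX_DOCS, and the "most widely shared term" grouping rule are all made up for illustration, and the hits are assumed to be plain title/snippet strings already returned by a search.

```python
from collections import Counter, defaultdict

# Rough cap taken from the "up to ~1000" figure above; an illustrative
# assumption, not a hard limit enforced by Carrot2.
MAX_DOCS = 1000

STOPWORDS = {"a", "an", "the", "of", "and", "for", "in", "to", "with"}


def tokenize(text):
    """Lowercase, split on whitespace, drop a few common stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def cluster_hits(hits, max_docs=MAX_DOCS):
    """Toy post-retrieval clustering (hypothetical, not a Carrot2 algorithm).

    Each hit is filed under the term it shares with the most other hits.
    """
    # Only the top hits of one query are clustered, never the whole index.
    hits = hits[:max_docs]
    token_sets = [set(tokenize(h)) for h in hits]

    # Document frequency of each term, counted within the retrieved set only.
    df = Counter(t for tokens in token_sets for t in tokens)

    clusters = defaultdict(list)
    for hit, tokens in zip(hits, token_sets):
        shared = [t for t in tokens if df[t] > 1]
        # Highest document frequency wins; ties are broken alphabetically
        # so the result is deterministic.
        label = min(shared, key=lambda t: (-df[t], t)) if shared else "other"
        clusters[label].append(hit)
    return dict(clusters)
```

Everything here lives in memory and is rebuilt for each query, which is cheap for a few hundred hits and is the reason the same approach does not stretch to an entire index.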

S.
