>
> Hmm, I saw the comment in ClusteringDocumentList.java of Carrot2:
>
> /*
>  * If you know what query generated the documents you're about to cluster, pass
>  * the query to the algorithm, which will usually increase clustering quality.
>  */
> attributes.put(AttributeNames.QUERY, "data mining");
>
> So I'm worried about clustering quality when Carrot2 gets the string
> "MatchAllDocsQuery".


The query is just a hint; without it you should still be able to get
decent clusters (at least for English; we have not tested Carrot2 much with
Japanese).
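
If you would rather not hand a placeholder string to the algorithm as the
hint, one option is to leave the query attribute out whenever there is no real
user query. Below is a minimal sketch of that idea, not Carrot2 code: the
class QueryHint, the method buildAttributes, and the key QUERY_KEY are
hypothetical names, with QUERY_KEY standing in for AttributeNames.QUERY so the
sketch compiles without Carrot2 on the classpath. It also assumes the
match-all query arrives as the literal string "MatchAllDocsQuery", as in your
message.

```java
import java.util.HashMap;
import java.util.Map;

public class QueryHint {
    // Hypothetical stand-in for AttributeNames.QUERY (assumption, see above).
    static final String QUERY_KEY = "query";

    /** Builds the attribute map, adding the query hint only for a real query. */
    static Map<String, Object> buildAttributes(String query) {
        Map<String, Object> attributes = new HashMap<>();
        // Assumption: a match-all query shows up as the literal "MatchAllDocsQuery".
        boolean usable = query != null
                && !query.trim().isEmpty()
                && !"MatchAllDocsQuery".equals(query.trim());
        if (usable) {
            attributes.put(QUERY_KEY, query.trim());
        }
        // With no usable query the map stays empty and no hint is passed.
        return attributes;
    }

    public static void main(String[] args) {
        System.out.println(buildAttributes("data mining"));       // {query=data mining}
        System.out.println(buildAttributes("MatchAllDocsQuery")); // {}
    }
}
```

In real code you would put the result under AttributeNames.QUERY, as in the
comment you quoted, and add the documents and any other attributes as usual.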

Cheers,

Staszek
