Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> However, I find that clustering is exceedingly slow after I index this 1GB
> of data. It took almost 30 seconds to return the cluster results when I set
> it to cluster the top 1000 records, and it still takes more than 3 seconds
> when I set it to cluster the top 100 records.

Your clustering uses Carrot2, which fetches the top documents and performs
real-time clustering on them - that process is (nearly) independent of index
size. The relevant numbers here are top 1000 and top 100, not 1GB. What we do
not know is whether the bottleneck is the fetching of the top 1000 documents
(the Solr part) or the clustering itself (the Carrot2 part).
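
One rough way to tell the two apart is to send the same request twice, once
with clustering=false and once with clustering=true, and compare the timings.
The sketch below is only an illustration in Python: the URL, the handler name
and the helper names (build_url, timed_fetch, compare) are assumptions for the
example, not something taken from your setup.

```python
import json
import time
import urllib.parse
import urllib.request


def build_url(base_url, query, rows, clustering):
    """Build a Solr request URL with the clustering component on or off."""
    params = {
        "q": query,
        "rows": rows,  # number of top documents fetched (and clustered)
        "wt": "json",
        "clustering": "true" if clustering else "false",
    }
    return base_url + "?" + urllib.parse.urlencode(params)


def timed_fetch(url, opener=urllib.request.urlopen):
    """Return (wall-clock milliseconds, QTime reported by Solr) for one request."""
    start = time.perf_counter()
    with opener(url) as resp:
        body = json.load(resp)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, body.get("responseHeader", {}).get("QTime")


def compare(base_url, query, rows, opener=urllib.request.urlopen):
    """Time the same request without and with clustering and report the gap."""
    fetch_ms, _ = timed_fetch(build_url(base_url, query, rows, False), opener)
    both_ms, _ = timed_fetch(build_url(base_url, query, rows, True), opener)
    return {
        "fetch_only_ms": fetch_ms,
        "with_clustering_ms": both_ms,
        "clustering_overhead_ms": both_ms - fetch_ms,
    }


# Hypothetical usage; adjust host, collection and handler to your installation:
# print(compare("http://localhost:8983/solr/collection1/clustering", "*:*", 1000))
```

If fetch_only_ms is already close to the total, the time goes into retrieving
the documents; if most of it shows up as clustering_overhead_ms, Carrot2 is
the slow part. Run it a few times, since the first request will also pay for
cold caches.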

- Toke Eskildsen
