Re: questions about Clustering

Grant Ingersoll Sat, 23 May 2009 04:22:16 -0700


On May 22, 2009, at 11:41 PM, Koji Sekiguchi wrote:

I'm thinking of using the clustering function (SOLR-769) for my project.

I have a couple of questions:

1. If q=*:* is requested, Carrot2 will receive "MatchAllDocsQuery"
via the attributes. Is that OK?

Yes, it only clusters on the Doc List, not the Doc Set (in other words, it's your rows that matter).
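
As a rough illustration of "it's your rows that matter", here is a minimal sketch that builds a request URL for a match-all query with an explicit rows value. The base URL, the /select handler path, and the assumption that the handler has the clustering component enabled via clustering=true are all illustrative; check them against your own solrconfig.xml.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class ClusteringRequest {

    // Builds a select URL; baseUrl and the "/select" path are assumptions
    // about a local setup, not something taken from this thread.
    static String buildUrl(String baseUrl, String query, int rows) {
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&rows=" + rows          // only these rows (the Doc List) get clustered
                    + "&clustering=true";      // assumed switch for the clustering component
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // q=*:* matches every document, but only the first 100 are clustered.
        System.out.println(buildUrl("http://localhost:8983/solr", "*:*", 100));
    }
}
```

With q=*:* the Doc Set is the whole index, so the number of documents handed to Carrot2 is controlled by rows alone.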

2. I'd like to use it in a non-English environment, e.g. Japanese.

I've implemented Carrot2JapaneseAnalyzer (w/ Payload/ITokenType)
for this purpose.
It worked well with the ClusteringDocumentList example, but didn't
work with CarrotClusteringEngine.

What I did was insert the following lines (+) into
CarrotClusteringEngine:

attributes.put(AttributeNames.QUERY, query.toString());
+ attributes.put(AttributeUtils.getKey(Tokenizer.class, "analyzer"),
+ Carrot2JapaneseAnalyzer.class);

There are no runtime errors, but Carrot2 didn't use my analyzer;
it just ignored it and used ExtendedWhitespaceAnalyzer (confirmed via
debugger).
Is it a classloader problem? I placed my jar in ${solr.solr.home}/lib.

Hmmm, I'm not sure if the Carrot guys are on this list (they are on dev). Can you share a simple example on the JIRA issue and we can discuss there?
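
In the meantime, one generic way to gather evidence for or against the classloader theory is to print which loader defined each class involved. This is only a sketch using the standard Class.getClassLoader() call; the ClassLoaderChain helper is a made-up name, and it does not claim anything about how Carrot2 resolves its analyzer.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderChain {

    // Returns the names of the loader that defined the class and of all its
    // parents, ending with "bootstrap" (which the JVM represents as null).
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<String>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // Inside Solr you would pass the classes in question instead,
        // e.g. Carrot2JapaneseAnalyzer.class and the Carrot2 Tokenizer class.
        System.out.println(chainOf(ClassLoaderChain.class));
        System.out.println(chainOf(String.class));
    }
}
```

If the analyzer and the Carrot2 classes report different chains, that would be worth mentioning on the JIRA issue.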



--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
