Thanks, I'm new to the clustering libraries. I finally made this connection when I started browsing through the Carrot2 source. I had pulled down a smaller MM document collection from our test environment. It was not ideal, as it was mostly structured, but it was small. I foolishly thought I could cluster on the text copy field before realizing that it was index-only. Doh!

Our documents are indexed in SolrCloud, but stored in HBase. I want to allow users to page through Solr hits, but I would like to cluster on all (or at least several thousand) of the top search hits. Now I'm puzzling over how to cluster efficiently over possibly several thousand Solr hits when the documents are in HBase. I thought of an HBase coprocessor, but Carrot2 isn't designed for distributed computation. Mahout, in the Hadoop M/R context, seems slow and heavy-handed for this scale; maybe I just need to dig deeper into their library. Or I could just be missing something fundamental? :)

-----Original Message-----
From: "Stanislaw Osinski" <stanislaw.osin...@carrotsearch.com>
Sent: Friday, October 18, 2013 5:04am
To: solr-user@lucene.apache.org
Subject: Re: solrconfig.xml carrot2 params

Hi,

Out of curiosity -- what would you like to achieve by changing Tokenizer.documentFields? If you want to have clustering applied to more than one document field, you can provide a comma-separated list of fields in the carrot.title and/or carrot.snippet parameters.

Thanks,
Staszek

--
Stanislaw Osinski, stanislaw.osin...@carrotsearch.com
http://carrotsearch.com

On Thu, Oct 17, 2013 at 11:49 PM, youknow...@heroicefforts.net <youknow...@heroicefforts.net> wrote:

> Would someone help me out with the syntax for setting
> Tokenizer.documentFields in the ClusteringComponent engine definition in
> solrconfig.xml? Carrot2 is expecting a Collection of Strings. There's no
> schema definition for this XML file and a big TODO on the Wiki wrt init
> params. Every permutation I have tried results in an error stating:
> Cannot set java.util.Collection field ... to java.lang.String.
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
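
For anyone finding this thread later: the comma-separated field list suggested above for carrot.title and carrot.snippet could be passed per request roughly as in the sketch below. It only builds the request URL and does not contact Solr. The handler path, the field names (title, subject, body) and the helper name build_clustering_url are hypothetical, and it assumes the clustering handler accepts these parameters at request time rather than only as defaults in solrconfig.xml.

```python
from urllib.parse import urlencode


def build_clustering_url(base_url, query, title_fields, snippet_fields, rows=100):
    """Build a Solr clustering request URL (hypothetical helper, no network I/O).

    title_fields and snippet_fields are joined into the comma-separated
    lists that carrot.title and carrot.snippet are said to accept.
    """
    params = {
        "q": query,
        "rows": rows,            # number of top hits handed to the clustering engine
        "clustering": "true",    # enable the clustering component for this request
        "carrot.title": ",".join(title_fields),
        "carrot.snippet": ",".join(snippet_fields),
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed handler path and field names; adjust to your own solrconfig.xml and schema.
    url = build_clustering_url(
        "http://localhost:8983/solr/collection1/clustering",
        "hbase",
        title_fields=["title", "subject"],
        snippet_fields=["body"],
        rows=1000,
    )
    print(url)
```

The fields named here would need to be stored in Solr, since an index-only copy field gives the clustering component no text to work with.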