Re: SOLR-769 clustering

2009-09-09 Thread Wang Guangchen
Hi Staszek, Thank you very much for your advice. My problem has been solved. It was caused by the regexp in stoplabels.en. I didn't realize that a regular expression is required in order to filter out the words. I have added the regexp to my stoplabels.en and it works like a charm. -GC On Wed
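
A minimal sketch of why a bare word in stoplabels.en may not filter a longer label. It assumes (not confirmed in this thread) that each line of the file is treated as a regular expression that must match the whole candidate label. The helper `is_stop_label` and the example pattern are hypothetical, and Python's `re` module only approximates the Java regex syntax that Carrot2 would use.

```python
import re

def is_stop_label(label, patterns):
    """Return True if any pattern matches the entire label (assumed semantics)."""
    return any(re.fullmatch(p, label) for p in patterns)

label = "Carbon Atoms in the Group"

# A bare word only matches a label that consists of exactly that word.
plain_word = ["in"]

# Hypothetical pattern: any label containing "in" as a separate word.
regexps = [r".*\bin\b.*"]

print(is_stop_label(label, plain_word))  # bare word: no whole-label match
print(is_stop_label(label, regexps))     # regexp: whole label matches
```

Under that assumption, the entry "in" would only remove a label that is exactly "in", which would explain why the phrase above kept showing up until a regexp was added.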

Re: SOLR-769 clustering

2009-09-08 Thread Stanislaw Osinski
Hi, It seems like the problem can be on two layers: 1) getting the right contents of stop* files for Carrot2, 2) making sure Solr picks up the changes. > I tried your quick and dirty hack too. It didn't work either. Phrases like "Carbon Atoms in the Group" with "in" still appear in my clustering labe

Re: SOLR-769 clustering

2009-09-08 Thread Wang Guangchen
Hi Staszek, I tried your quick and dirty hack too. It didn't work either. Phrases like "Carbon Atoms in the Group" with "in" still appear in my clustering labels. What I did is: 1. use the "jar uf carrot2-mini.jar stoplabels.en" command to replace the stoplabels.en file. 2. apply clustering patch. re-co
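
One way to check whether an edited stoplabels.en really ended up inside a jar is to read the archive directly, since a jar is a zip file. This is only a sketch: `find_resource` is a hypothetical helper, and the demo builds a throwaway archive instead of touching a real carrot2-mini.jar.

```python
import os
import tempfile
import zipfile

def find_resource(jar_path, name):
    """Return {entry path: text} for every jar entry whose file name equals `name`."""
    found = {}
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            if entry.rsplit("/", 1)[-1] == name:
                found[entry] = jar.read(entry).decode("utf-8", errors="replace")
    return found

# Demo on a throwaway archive; point jar_path at your own jar instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "demo.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr("stoplabels.en", ".* in .*\n")
    for entry, text in find_resource(demo_jar, "stoplabels.en").items():
        print(entry, "->", text.splitlines())
```

If the file shows up more than once, or with the old contents, the problem is likely in the first layer (the contents of the stop* files) and not in how Solr loads them.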

Re: SOLR-769 clustering

2009-09-08 Thread Wang Guangchen
On Tue, Sep 8, 2009 at 9:56 PM, Grant Ingersoll wrote: > On Sep 8, 2009, at 5:11 AM, Wang Guangchen wrote: >> Hi Staszek, >> I tried to apply the stoplabels following the instructions you gave in the Solr clustering wiki, but it didn't work. >> I am running the patched Solr on Tomcat.

Re: SOLR-769 clustering

2009-09-08 Thread Stanislaw Osinski
Hi there, > I tried to apply the stoplabels following the instructions you gave in the Solr clustering wiki, but it didn't work. > I am running the patched Solr on Tomcat. So, to enable the stop labels, I added "-cp " to my system's CATALINA_OPTS. I tried to change the file name from stoplabels

Re: SOLR-769 clustering

2009-09-08 Thread Grant Ingersoll
On Sep 8, 2009, at 5:11 AM, Wang Guangchen wrote: Hi Staszek, I tried to apply the stoplabels following the instructions you gave in the Solr clustering wiki, but it didn't work. I am running the patched Solr on Tomcat. So, to enable the stop labels, I added "-cp " to my system's CATALINA

Re: SOLR-769 clustering

2009-09-08 Thread Wang Guangchen
Hi Staszek, I tried to apply the stoplabels following the instructions you gave in the Solr clustering wiki, but it didn't work. I am running the patched Solr on Tomcat. So, to enable the stop labels, I added "-cp " to my system's CATALINA_OPTS. I tried to change the file name from stoplabels.txt to

Re: SOLR-769 clustering

2009-04-24 Thread Stanislaw Osinski
> > How would we enable people via SOLR-769 to do this? Good point, Grant! To apply the modified stopwords.* and stoplabels.* files to Solr, simply make them available in the classpath. For the example Solr runner scripts that would be something like: java -cp -Dsolr.solr.home=./clustering/solr

Re: SOLR-769 clustering

2009-04-22 Thread Grant Ingersoll
On Apr 22, 2009, at 5:03 PM, Stanislaw Osinski wrote: Hi Antonio, So I think the way to go would be to tune the clustering algorithm's stop words / stop label dictionaries to exclude the labels you don't like. I can't guarantee you can get decent clusters with this technique, but it's

Re: SOLR-769 clustering

2009-04-22 Thread Stanislaw Osinski
Hi Antonio, > To answer your question about the minimum term: I am working with "joke text", very short in length, so the clusters are not so meaningful. I mean lots of adverbs and nouns. I thought increasing it might give me fewer clusters but a bit more meaningful ones (maybe not). Clusteri

Re: SOLR-769 clustering

2009-04-22 Thread Antonio Eggberg
(maybe not). --- On Wed 2009-04-22, Grant Ingersoll wrote: > From: Grant Ingersoll > Subject: Re: SOLR-769 clustering > To: solr-user@lucene.apache.org > Date: Wednesday, 22 April 2009 14:44 > > On Apr 21, 2009, at 3:46 AM, Antonio Eggberg wrote: > > Hello:

Re: SOLR-769 clustering

2009-04-22 Thread Grant Ingersoll
On Apr 21, 2009, at 3:46 AM, Antonio Eggberg wrote: Hello: I have got the clustering working, i.e. SOLR-769. I am wondering why there is a field called "body"; does it have a special purpose? multiValued="true"/> That's just used in the test schema and there isn't any need for you to

Re: SOLR-769 clustering

2009-04-21 Thread Stanislaw Osinski
Hi Antonio, - is there any way to have a minimum number of labels per cluster? The current search results clustering algorithms (from Carrot2) by design generate one label per cluster, so there is no way to force them to create more. What is the reason you'd like to have more labels per cluster? I