Re: Clustering from analyzed text instead of raw input

2010-03-05 Thread Stanislaw Osinski
> I'll give a try to stopwords treatment, but the problem is that we perform
> POS tagging and then use payloads to keep only nouns and adjectives, and we
> thought that it could be interesting to perform clustering only with these
> elements, to avoid senseless words.

POS tagging could help a
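
For illustration only, the idea quoted above (cluster on nouns and adjectives rather than on the raw text) might be sketched as below. This is not Solr or Carrot2 code: the tag names, the `keep_nouns_and_adjectives` helper, and the sample tokens are all hypothetical, and the sketch assumes the tokens have already been POS-tagged elsewhere.

```python
# Hypothetical sketch: filter already-tagged tokens down to nouns and
# adjectives before handing the text to a clustering step.
# The tag names ("NOUN", "ADJ", ...) and the sample data are assumptions.

KEEP_TAGS = {"NOUN", "ADJ"}


def keep_nouns_and_adjectives(tagged_tokens):
    """Return only the tokens whose POS tag is in KEEP_TAGS, in order."""
    return [token for token, tag in tagged_tokens if tag in KEEP_TAGS]


# (token, tag) pairs as a POS tagger might produce them.
tagged = [
    ("the", "DET"),
    ("fast", "ADJ"),
    ("clustering", "NOUN"),
    ("of", "ADP"),
    ("search", "NOUN"),
    ("results", "NOUN"),
]

filtered = keep_nouns_and_adjectives(tagged)

# The filtered tokens would be the clustering input; the raw text can be
# kept alongside so that any labels shown to users stay readable.
print(" ".join(filtered))
```

Keeping the raw text next to the filtered tokens is a design choice in this sketch, made because filtered text on its own tends to read poorly as a label.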

Re: Clustering from analyzed text instead of raw input

2010-03-03 Thread JCodina
> uster labels are phrases taken from the input text; if you remove stopwords and
> stem everything, the phrases will become unreadable).
>
> Cheers,
>
> Staszek

--
View this message in context: http://old.nabble.com/Clustering-from-anlayzed-text-instead-of-raw-input-tp27765780p27769034.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Clustering from analyzed text instead of raw input

2010-03-03 Thread Stanislaw Osinski
Hi Joan,

> I'm trying to use Carrot2 (now I started with the workbench) and I can
> cluster any field, but the text used for clustering is the original raw
> text, the one that was indexed, without any of the processing performed by
> the tokenizer or filters.
> So I get stop words.

The easiest

Clustering from analyzed text instead of raw input

2010-03-03 Thread JCodina
this message in context: http://old.nabble.com/Clustering-from-anlayzed-text-instead-of-raw-input-tp27765780p27765780.html Sent from the Solr - User mailing list archive at Nabble.com.