Re: Clustering from analyzed text instead of raw input

2010-03-05 Thread Stanislaw Osinski
> I'll give the stopword treatment a try, but the problem is that we perform
> POS tagging and then use payloads to keep only nouns and adjectives, and we
> thought it could be interesting to perform clustering only with these
> elements, to avoid senseless words.
POS tagging could help a

Re: Clustering from analyzed text instead of raw input

2010-03-03 Thread JCodina
Thanks Staszek, I'll give the stopword treatment a try, but the problem is that we perform POS tagging and then use payloads to keep only nouns and adjectives, and we thought it could be interesting to perform clustering only with these elements, to avoid senseless words. Of course is a proble
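The idea described above (keep only nouns and adjectives, then cluster on what remains) can be sketched roughly as follows. This is only an illustration under stated assumptions: the tokens are already POS-tagged, the tag names `NOUN` and `ADJ` and the helper `keep_nouns_and_adjectives` are hypothetical, and nothing here uses the Solr, payload, or Carrot2 APIs discussed in the thread.

```python
# Hypothetical coarse tag names; a real tagger may use a different tag set.
KEEP_TAGS = {"NOUN", "ADJ"}


def keep_nouns_and_adjectives(tagged_tokens, keep_tags=KEEP_TAGS):
    """Return a space-joined string of the tokens whose tag is in keep_tags.

    tagged_tokens is an iterable of (word, tag) pairs. In the thread the
    tags are stored as payloads at index time; here they are passed in
    directly for illustration.
    """
    return " ".join(word for word, tag in tagged_tokens if tag in keep_tags)


# Hand-tagged example sentence (the tags are assumptions, not tagger output).
tagged = [
    ("the", "DET"),
    ("fast", "ADJ"),
    ("clustering", "NOUN"),
    ("of", "ADP"),
    ("search", "NOUN"),
    ("results", "NOUN"),
    ("is", "VERB"),
    ("useful", "ADJ"),
]

filtered = keep_nouns_and_adjectives(tagged)
print(filtered)
```

The filtered string, rather than the raw field value, would then be what gets handed to the clustering step; how to wire that into the actual indexing and clustering pipeline is the open question of this thread.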

Re: Clustering from analyzed text instead of raw input

2010-03-03 Thread Stanislaw Osinski
Hi Joan,
> I'm trying to use carrot2 (now I started with the workbench) and I can
> cluster any field, but, the text used for clustering is the original raw
> text, the one that was indexed, without any of the processing performed by
> the tokenizer or filters.
> So I get stop words.
The easiest