First, you'll get a lot of insight by defining something simple and looking
at the analysis page in the Solr admin. That's a very valuable page.

To your question:
CommonGrams are "shingles" that form between stopwords and their
neighboring words. For instance, with "this" and "is" as stopwords,
"this is some text" gets analyzed into
this, this_is, is, is_some, some, text. Note that two words are only
combined when at least one of them is a stopword, which is why there is
no some_text.
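
As a rough illustration of that behavior, here is a small Python sketch.
It is not Solr's implementation: the function name common_grams, the "_"
separator argument, and the stopword set {"this", "is"} are assumptions
made for this example. It keeps every token and adds a joined pair
whenever either word of an adjacent pair is in the stopword set.

```python
def common_grams(tokens, common_words, sep="_"):
    """Keep every token; also emit a joined pair when either neighbor is a common word."""
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Assumed rule: pair the two tokens only if at least one is a stopword.
            if tok in common_words or nxt in common_words:
                out.append(tok + sep + nxt)
    return out


if __name__ == "__main__":
    # Hypothetical stopword list, chosen to match the example phrase.
    stopwords = {"this", "is"}
    print(common_grams("this is some text".split(), stopwords))
    # ['this', 'this_is', 'is', 'is_some', 'some', 'text']
```

The analysis page remains the authoritative check; the sketch only shows
the shape of the output.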

NGrams are formed from the letters inside each token. It's too long to post
the whole thing, but with gram sizes 1 to 2 the above phrase gets analyzed as
t, h, i, s, th, hi, is, i, s, is, s, o, m, e, so, om, me...... NGram splits a
single token into grams, whereas CommonGrams essentially combines adjacent
tokens when stopwords are involved.
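
To make the contrast concrete, here is a minimal Python sketch of
character n-grams per token. It is only an illustration, not Solr's code:
the names ngrams and ngram_filter and the min_gram/max_gram parameters are
assumptions for this example, and the order in which a real filter emits
grams may differ between versions.

```python
def ngrams(token, min_gram=1, max_gram=2):
    """Return all character n-grams of one token, shorter grams first."""
    out = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n across the token.
        for start in range(len(token) - n + 1):
            out.append(token[start:start + n])
    return out


def ngram_filter(tokens, min_gram=1, max_gram=2):
    """Apply ngrams() to each token independently; grams never span two tokens."""
    return [g for tok in tokens for g in ngrams(tok, min_gram, max_gram)]


if __name__ == "__main__":
    print(ngram_filter(["this", "is"]))
    # ['t', 'h', 'i', 's', 'th', 'hi', 'is', 'i', 's', 'is']
```

Each token is processed on its own, so no gram ever crosses a word
boundary, which is the opposite of what CommonGrams does.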

Have you looked at "shingles"? See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
Best
Erick


On Thu, Feb 3, 2011 at 10:15 AM, openvictor Open <openvic...@gmail.com> wrote:

> Thank you, I will do that and hopefully it will be handy!
>
> But can someone explain the difference between CommonGramsFilterFactory and
> NGramFilterFactory? (Maybe the solution is there.)
>
> Thank you all,
> best regards
>
> 2011/2/3 Grijesh <pintu.grij...@gmail.com>
>
> >
> > Use analysis.jsp to see what is happening at index time and query time with
> > your input data. You can use highlighting to see whether a match was found.
> >
> > -----
> > Thanx:
> > Grijesh
> > http://lucidimagination.com
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Using-terms-and-N-gram-tp2410938p2411244.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>
