A short but really good explanation can be found here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
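
In case it helps, here is a minimal sketch of the bi-gram idea Norberto
describes below. It is plain Python, not Solr's tokenizer, and the names
char_ngrams and matches as well as the sample text are made up for
illustration. It only shows how text gets cut into overlapping blocks of n
characters, and how the same cut applied to the query lets it line up with
the indexed terms.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(text) < n:
        # Too short to form a full n-gram: keep the text as a single term.
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def matches(document, query, n=2):
    """Toy match: every n-gram of the query must also occur in the document."""
    return set(char_ngrams(query, n)) <= set(char_ngrams(document, n))


# Hypothetical sample text (not taken from the original message).
print(char_ngrams("全文检索"))      # bi-grams of a four-character string
print(matches("全文检索", "检索"))  # query analysed the same way as the document
```

A real index also tracks term positions and scoring, so treat this only as a
picture of how the terms are produced, not of how Solr ranks or matches them.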

2008/8/18 finy finy <[EMAIL PROTECTED]>

> thanks for your help.
>
> could you give me your gmail talk address or msn?
>
>
> 2008/8/19, Norberto Meijome <[EMAIL PROTECTED]>:
> >
> > On Mon, 18 Aug 2008 23:07:19 +0800
> > "finy finy" <[EMAIL PROTECTED]> wrote:
> >
> > > because i use chinese character, for example "ibm_______________"
> > > solr will parse it into a term "ibm" and a phrase "_________ ______"
> > > can i use solr to query with a term "ibm" and a term "_________" and a
> > > term "______"?
> >
> > Hi finy,
> > you should look into n-gram tokenizers. Not sure if it is documented in the
> > wiki, but it has been discussed in the mailing list quite a few times.
> >
> > in short, an n-gram tokenizer breaks your input into blocks of characters
> > of size n, which are then used to compare in the index. I think for Chinese,
> > bi-gram is the favoured approach.
> >
> > good luck,
> > B
> > _________________________
> > {Beto|Norberto|Numard} Meijome
> >
> > I used to hate weddings; all the Grandmas would poke me and
> > say, "You're next sonny!" They stopped doing that when i
> > started to do it to them at funerals.
> >
> > I speak for myself, not my employer. Contents may be hot. Slippery when
> > wet. Reading disclaimers makes you go blind. Writing them is worse. You have
> > been Warned.
> >
>



-- 
Alexander Ramos Jardim
