A short but really useful explanation can be found here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
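
To make the bi-gram idea from the quoted message below a bit more concrete, here is a rough sketch in plain Python. It is not Solr code and not how Solr implements it internally; the function name `char_ngrams` and the sample string are made up for illustration only. In Solr itself you would configure this through the analyzer on the field type rather than write it by hand.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of `text`, in order.

    Hypothetical helper for illustration only; it is not part of Solr.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    if len(text) < n:
        # Input shorter than n: keep it as one token (or nothing if empty).
        return [text] if text else []
    # Slide a window of width n over the text, one character at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Sample input chosen for illustration; it is not the text from the thread.
tokens = char_ngrams("全文检索", 2)
print(tokens)
```

With n = 2 each token overlaps the next by one character, so a query tokenized the same way can match any two-character run of the indexed text.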

2008/8/18 finy finy <[EMAIL PROTECTED]>

> thanks for your help.
>
> could you give me your gmail talk address or msn?
>
>
> 2008/8/19, Norberto Meijome <[EMAIL PROTECTED]>:
> >
> > On Mon, 18 Aug 2008 23:07:19 +0800
> > "finy finy" <[EMAIL PROTECTED]> wrote:
> >
> > > because i use chinese character, for example "ibm_______________"
> > > solr will parse it into a term "ibm" and a phraze "_________ ______"
> > > can i use solr to query with a term "ibm" and a term "_________" and a
> > > term "______"?
> >
> > Hi finy,
> > you should look into n-gram tokenizers. Not sure if it is documented in
> > the wiki, but it has been discussed in the mailing list quite a few times.
> >
> > in short, an n-gram tokenizer breaks your input into blocks of characters
> > of size n , which are then used to compare in the index. I think for
> > Chinese , bi-gram is the favoured approach.
> >
> > good luck,
> > B
> > _________________________
> > {Beto|Norberto|Numard} Meijome
> >
> > I used to hate weddings; all the Grandmas would poke me and
> > say, "You're next sonny!" They stopped doing that when i
> > started to do it to them at funerals.
> >
> > I speak for myself, not my employer. Contents may be hot. Slippery when
> > wet. Reading disclaimers makes you go blind. Writing them is worse. You
> > have been Warned.
>

--
Alexander Ramos Jardim