subject:"Re\: Solr\/Lucene Tokenizers \- cannot get the behavior I need"

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-17 Thread Shawn Heisey

On 11/16/2012 12:30 PM, Shawn Heisey wrote: I am extremely interested in the Unicode behavior of ICUTokenizer, but I cannot disable the punctuation-splitting behavior and let WDF handle it properly, which causes recall problems. There is no filter that I can run after tokenization, either. Lo

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-17 Thread Shawn Heisey

On 11/16/2012 12:52 PM, Shawn Heisey wrote: On 11/16/2012 12:36 PM, Jack Krupansky wrote: Generally, you don't need the preserveOriginal attribute for WDF. Generate both the word parts and the concatenated terms, and queries should work fine without the original. The separated terms will be in

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-16 Thread Shawn Heisey

On 11/16/2012 12:36 PM, Jack Krupansky wrote: Generally, you don't need the preserveOriginal attribute for WDF. Generate both the word parts and the concatenated terms, and queries should work fine without the original. The separated terms will be indexed as a sequence, and the split/separated

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-16 Thread Jack Krupansky

Generally, you don't need the preserveOriginal attribute for WDF. Generate both the word parts and the concatenated terms, and queries should work fine without the original. The separated terms will be indexed as a sequence, and the split/separated terms will generate a phrase query that matches
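As a rough illustration of the advice in this snippet (generate both the word parts and the concatenated term, without preserving the original), here is a toy Python sketch. It is not Lucene's WordDelimiterFilter: the function name `word_delimiter` is hypothetical, its keyword arguments only mirror the WDF options generateWordParts, catenateWords and preserveOriginal, and it assumes splitting on punctuation alone, ignoring the case-change and letter/digit splitting the real filter also performs.

```python
import re


def word_delimiter(token, generate_word_parts=True, catenate_words=True,
                   preserve_original=False):
    """Toy approximation of word-delimiter splitting (hypothetical helper).

    Splits only on runs of non-word characters and underscores; the real
    filter has more rules, which are deliberately left out here.
    """
    parts = [p for p in re.split(r"[\W_]+", token) if p]
    out = []
    if preserve_original:
        out.append(token)          # keep the untouched input token
    if generate_word_parts:
        out.extend(parts)          # e.g. "wi", "fi"
    if catenate_words and len(parts) > 1:
        out.append("".join(parts))  # e.g. "wifi"
    # Drop duplicates while keeping first-seen order.
    seen = set()
    return [t for t in out if not (t in seen or seen.add(t))]


if __name__ == "__main__":
    print(word_delimiter("wi-fi"))
    print(word_delimiter("wi-fi", preserve_original=True))
```

In this sketch, a hyphenated input yields its separate parts plus the joined form, so a query for either form has something to match even when the original token is not kept.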

4 matches
