On 11/16/2012 12:30 PM, Shawn Heisey wrote:
I am extremely interested in the Unicode behavior of ICUTokenizer, but
I cannot disable the punctuation-splitting behavior and let WDF handle
it properly, which causes recall problems. There is no filter that I
can run after tokenization, either. Lo
On 11/16/2012 12:52 PM, Shawn Heisey wrote:
On 11/16/2012 12:36 PM, Jack Krupansky wrote:
Generally, you don't need the preserveOriginal attribute for WDF.
Generate both the word parts and the concatenated terms, and queries
should work fine without the original. The separated terms will be
in
Generally, you don't need the preserveOriginal attribute for WDF. Generate
both the word parts and the concatenated terms, and queries should work fine
without the original. The separated terms will be indexed as a sequence, and
the split/separated terms will generate a phrase query that matches
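
To illustrate the point above about indexing both the word parts and the concatenated term, here is a minimal Python sketch. It is a toy simulation only, not Lucene's or Solr's actual WordDelimiterFilter code: the function names (split_parts, index_terms, query_matches) are hypothetical, splitting happens on ASCII non-alphanumerics only, and lowercasing is assumed to be done by a separate filter in a real analysis chain.

```python
import re


def split_parts(token):
    # Split on any run of non-alphanumeric ASCII characters.
    # This is a simplification of the real word-delimiter rules.
    return [p.lower() for p in re.split(r"[^0-9A-Za-z]+", token) if p]


def index_terms(token):
    """Terms emitted with word parts and catenation on, original off."""
    parts = split_parts(token)
    terms = list(parts)
    if len(parts) > 1:
        # The concatenated form is indexed alongside the separate parts.
        terms.append("".join(parts))
    return terms


def query_matches(query, indexed_token):
    """True if the query parts match as a phrase or as the catenated term."""
    indexed_parts = split_parts(indexed_token)
    query_parts = split_parts(query)
    if not query_parts:
        return False
    n = len(query_parts)
    # Phrase match: the query parts appear in order, adjacent to each other.
    phrase_hit = any(
        indexed_parts[i:i + n] == query_parts
        for i in range(len(indexed_parts) - n + 1)
    )
    # Single-term match against the concatenated form.
    catenated_hit = "".join(query_parts) in index_terms(indexed_token)
    return phrase_hit or catenated_hit


if __name__ == "__main__":
    print(index_terms("Wi-Fi"))
    for q in ("wi-fi", "wifi", "wi fi", "fi-wi"):
        print(q, query_matches(q, "Wi-Fi"))
```

In this toy model the queries "wi-fi", "wifi", and "wi fi" all match a document containing "Wi-Fi" even though the original token was never indexed, which is the behavior described above. Whether this holds for a given field still depends on the real analyzer chain and query parser settings.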