Re: Word Delimiter issue

2012-07-31 Thread Michael Della Bitta
It looks like this duplicate token isn't filtered out because WordDelimiter generates it after StopFilter has already run. In any case, a search on "david david" against this field does find documents with values like "David's" as well as "David, David, David..."

Re: Word Delimiter issue

2012-07-31 Thread Michael Della Bitta
Yes, that had occurred to me too, but I never saw the original query from the developer who was having the trouble, just the text and the strange analysis. I'll confer with him to make sure there's actually something to work on here.

Re: Word Delimiter issue

2012-07-31 Thread Jack Krupansky
I agree that it would make more sense for the catenated word ("johnsons") to be at the same position as the leading word ("johnson"). But what are some example queries that would "fail" given this behavior? "johnson and johnson" would not falsely match since you have position increment enable