Re: Antw: Re: Behaviour of punctuation marks in phrase queries

2019-05-17 Thread Michael Gibney
After further reflection, I think that upgrading to 8.1 (LUCENE-8730) would actually not help in this case. It doesn't matter whether "a.b." or "ab" would be indexed or evaluated first; they'd both have implied positionLength 1 (as read from the index at query time), and would both be evaluated bef

Re: Antw: Re: Behaviour of punctuation marks in phrase queries

2019-05-17 Thread Michael Gibney
The SpanNearQuery in association with "a.b." input and WDGF is expected behavior, since WDGF causes the query to search ("ab")|("a" "b"), as 1 or 2 tokens, respectively. The "a. b." input (whitespace-separated) is tokenized simply as "a" "b" (2 tokens) so sticks with the more straightforward Phrase
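
The tokenization described in this message can be illustrated with a toy model. This is only a sketch, not Lucene's WordDelimiterGraphFilter: it assumes whitespace tokenization, lowercasing, splitting on non-alphanumeric characters, and a catenate-all alternative, and the names `split_token` and `query_paths` are hypothetical helpers invented for the illustration.

```python
import re
from itertools import product


def split_token(token):
    """Return the alternative sub-token sequences for one whitespace token.

    Toy assumption: split on any run of non-alphanumeric characters and,
    when that yields more than one part, also offer the catenated form.
    """
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    if not parts:
        return []
    alternatives = [parts]
    if len(parts) > 1:
        # The catenated form is a single token spanning all the parts.
        alternatives.insert(0, ["".join(parts)])
    return alternatives


def query_paths(text):
    """Enumerate every token path a phrase query over `text` would cover."""
    per_token = [
        alts
        for alts in (split_token(t.lower()) for t in text.split())
        if alts
    ]
    # One path per combination of per-token alternatives.
    return [sum(combo, []) for combo in product(*per_token)]


if __name__ == "__main__":
    print(query_paths("a.b."))   # two paths: 1 token or 2 tokens
    print(query_paths("a. b."))  # one path: 2 tokens
```

With more than one path, a plain phrase query cannot express the alternatives, which is consistent with the parser falling back to span queries for "a.b." while "a. b." stays a simple PhraseQuery.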

Re: Antw: Re: Behaviour of punctuation marks in phrase queries

2019-05-17 Thread Erick Erickson
I’ll leave that explanation to someone who understands query parsers ;)

> On May 17, 2019, at 7:57 AM, Doris Peter wrote:
>
> Thanks a lot! I tried the debug parameter, which shows interesting differences:
>
> "debug": {
>   "rawquerystring": "all_places_txt:\"Neuburg a. d. Donau\"",

Antw: Re: Behaviour of punctuation marks in phrase queries

2019-05-17 Thread Doris Peter
Thanks a lot! I tried the debug parameter, which shows interesting differences:

"debug": {
  "rawquerystring": "all_places_txt:\"Neuburg a. d. Donau\"",
  "querystring": "all_places_txt:\"Neuburg a. d. Donau\"",
  "parsedquery": "PhraseQuery(all_places_txt:\"neuburg a d donau\")",
  "parse

Re: Behaviour of punctuation marks in phrase queries

2019-05-17 Thread Erick Erickson
Three things:
1> WordDelimiterGraphFilterFactory requires FlattenGraphFilterFactory after it in the index config
2> It is usually unnecessary to have the exact same parameters at both query and index time for WDGFF. If you’ve split parts up at index time then mashed them all back together, you