Thanks a lot! I tried the debug parameter, which shows interesting differences:

"debug": {
  "rawquerystring": "all_places_txt:\"Neuburg a. d. Donau\"",
  "querystring": "all_places_txt:\"Neuburg a. d. Donau\"",
  "parsedquery": "PhraseQuery(all_places_txt:\"neuburg a d donau\")",
  "parsedquery_toString": "all_places_txt:\"neuburg a d donau\"",
  "QParser": "LuceneQParser"
}

"debug": {
  "rawquerystring": "all_places_txt:\"Neuburg a.d. Donau\"",
  "querystring": "all_places_txt:\"Neuburg a.d. Donau\"",
  "parsedquery": "SpanNearQuery(spanNear([all_places_txt:neuburg, spanOr([all_places_txt:ad, spanNear([all_places_txt:a, all_places_txt:d], 0, true)]), all_places_txt:donau], 0, true))",
  "parsedquery_toString": "spanNear([all_places_txt:neuburg, spanOr([all_places_txt:ad, spanNear([all_places_txt:a, all_places_txt:d], 0, true)]), all_places_txt:donau], 0, true)",
  "QParser": "LuceneQParser"
}

Something seems to go wrong here, as the parsedquery contains the SpanNearQuery instead of a PhraseQuery.

>>> Erick Erickson <erickerick...@gmail.com> 5/17/2019 4:27 PM >>>
Three things:

1> WordDelimiterGraphFilterFactory requires FlattenGraphFilterFactory after it in the index config

2> It is usually unnecessary to have the exact same parameters at both query and index time for WDGFF. If you’ve split parts up at index time then mashed them all back together, you can usually only split them up at query time.

3> try adding &debug=query to the query and see what the results show for the parsed query. That usually gives you a clue what is really happening .vs. what you think is happening.

Best,
Erick

> On May 17, 2019, at 12:59 AM, Doris Peter <doris.pe...@bsb-muenchen.de> wrote:
>
> Hello,
>
> We use Solr 7.6.0 to build our index, and I have got a Question about
> Phrase Queries:
>
> We use the following configuration in schema.xml:
>
> <!-- Text Standard -->
> <fieldType name="text" class="solr.TextField"
>            positionIncrementGap="1000" sortMissingLast="true"
>            autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-FoldToASCII.txt"/>
>     <filter class="solr.CJKBigramFilterFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             protected="protectedword.txt"
>             preserveOriginal="0" splitOnNumerics="1"
>             splitOnCaseChange="0"
>             catenateWords="1" catenateNumbers="1" catenateAll="1"
>             generateWordParts="1" generateNumberParts="1"
>             stemEnglishPossessive="1"
>             types="wdfftypes.txt" />
>     <filter class="solr.LengthFilterFactory" min="1"
>             max="2147483647"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-FoldToASCII.txt"/>
>     <filter class="solr.CJKBigramFilterFactory"/>
>     <filter class="solr.WordDelimiterGraphFilterFactory"
>             protected="protectedword.txt"
>             preserveOriginal="0" splitOnNumerics="1"
>             splitOnCaseChange="0"
>             catenateWords="1" catenateNumbers="1" catenateAll="1"
>             generateWordParts="1" generateNumberParts="1"
>             stemEnglishPossessive="1"
>             types="wdfftypes.txt" />
>     <filter class="solr.LengthFilterFactory" min="1"
>             max="2147483647"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> If we search for a phrase like "Moosburg a.d. Isar" we don't get a
> match, though it's definitely in our Index.
> If we search for "Moosburg a. d. Isar" with a blank between "a."
> and "d." we get a match.
>
> This also happens for other non-word characters, like ' or , for
> example.
>
> The strange thing about it is, that the Solr Analysis-Tool reports
> a match for the first version, but when we send a Solr Query, we get no
> result Documents.
>
> Has anyone got an idea, what this could be?
>
> Thank you very much in advance,
>
> Doris Peter
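
For reference, here is a rough sketch of how Erick's points 1> and 2> might look when applied to the "text" field type quoted above. It is untested. The only additions are the solr.FlattenGraphFilterFactory line in the index analyzer and the catenate* attributes set to "0" in the query analyzer; everything else is copied from the posted schema.xml. Switching off all three catenate options at query time is an assumption about what "only split them up at query time" means in practice, so the right combination may differ for this data.

```xml
<!-- Sketch only: the "text" field type from the quoted schema.xml with two
     untested changes, each marked CHANGED. All other settings are as posted. -->
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="1000" sortMissingLast="true"
           autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <filter class="solr.CJKBigramFilterFactory"/>
    <filter class="solr.WordDelimiterGraphFilterFactory"
            protected="protectedword.txt"
            preserveOriginal="0" splitOnNumerics="1"
            splitOnCaseChange="0"
            catenateWords="1" catenateNumbers="1" catenateAll="1"
            generateWordParts="1" generateNumberParts="1"
            stemEnglishPossessive="1"
            types="wdfftypes.txt"/>
    <!-- CHANGED (point 1): flatten the token graph produced by the
         word delimiter filter, since the indexer cannot consume a graph -->
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="2147483647"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <filter class="solr.CJKBigramFilterFactory"/>
    <!-- CHANGED (point 2, assumption): split only, no catenation, so that
         "a.d." becomes the plain tokens "a" and "d" at query time -->
    <filter class="solr.WordDelimiterGraphFilterFactory"
            protected="protectedword.txt"
            preserveOriginal="0" splitOnNumerics="1"
            splitOnCaseChange="0"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            generateWordParts="1" generateNumberParts="1"
            stemEnglishPossessive="1"
            types="wdfftypes.txt"/>
    <filter class="solr.LengthFilterFactory" min="1" max="2147483647"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A change to the index analyzer only affects documents indexed afterwards, so the existing documents would need to be reindexed before comparing results. Rerunning both queries with &debug=query should then show whether the "a.d." variant still parses to a SpanNearQuery.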