Hi!

I am having a weird issue with a search string not producing a match
where it should. I can reproduce it with both 3.4 and 3.5.

"Where it should" means that I am getting a hit in the "Analysis" tool
in the admin panel, but not in a query via /select.

Now when I try

   select?q=Am+Heidstamm&...

I get zero results back. But, when I quote the string

  select?q=%22Am+Heidstamm%22&...

I get several hits.
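
In case someone wants to reproduce the two requests, here is a minimal sketch in Python that builds both URLs. The host, port and path in SOLR_SELECT are assumptions (a default local install), and build_query_url is just a helper name made up for this example. It sets debugQuery=true so the parsed query appears in the response.

```python
from urllib.parse import urlencode

# Assumption: a default local Solr install; adjust host, port and path as needed.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_query_url(q, phrase=False):
    """Return a /select URL for q, optionally wrapped in double quotes."""
    if phrase:
        q = '"%s"' % q
    # debugQuery=true asks Solr to include the parsed query in the response.
    params = {"q": q, "debugQuery": "true", "wt": "json", "rows": 10}
    # urlencode encodes spaces as "+" and double quotes as "%22".
    return SOLR_SELECT + "?" + urlencode(params)


unquoted_url = build_query_url("Am Heidstamm")
quoted_url = build_query_url("Am Heidstamm", phrase=True)
print(unquoted_url)
print(quoted_url)
```

Fetching both URLs and comparing the debug sections should show how each variant is parsed against the field.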

BTW, the token "am" is filtered out in the field "text", since it's in
the stopword list.
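
To illustrate what I mean by "filtered out", here is a simplified stand-in in Python for the stop filter step. It is not Solr's code: the STOPWORDS set is hypothetical (only "am" is known to be in my list), and it ignores the other filters in the chain. It keeps the original token positions, as I understand enablePositionIncrements="true" to do.

```python
# Hypothetical stand-in for stopwords_de.txt; only "am" is known from this post.
STOPWORDS = {"am"}


def stop_filter(tokens, stopwords=STOPWORDS):
    """Drop stopwords (case-insensitively) but keep original 1-based positions."""
    kept = []
    for position, token in enumerate(tokens, start=1):
        if token.lower() not in stopwords:
            kept.append((position, token))
    return kept


result = stop_filter(["Am", "Heidstamm"])
print(result)  # [(2, 'Heidstamm')]
```

So only "Heidstamm" survives, at position 2, with a gap where "Am" used to be.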

Any ideas on how this can be explained?

My defaultSearchField is "text". The field gets its content via
several copyField statements.

The configuration for text is as follows:

   <field name="text" type="text_de" indexed="true" stored="false" multiValued="true" />

The configuration for type text_de is this:

    <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
        <analyzer>
            <!-- protect slashes from tokenizer by replacing with something unique -->
            <charFilter class="solr.PatternReplaceCharFilterFactory"
                        pattern="([A-Z]+)/([0-9]+)/([0-9]+)" replacement="$1ḧ$2ḧ$3" />
            <charFilter class="solr.PatternReplaceCharFilterFactory"
                        pattern="([0-9]+)/([0-9]+)" replacement="$1ḧ$2" />
            <!-- protect paragraph symbol from tokenizer -->
            <charFilter class="solr.PatternReplaceCharFilterFactory"
                        pattern="§\s*([0-9]+)" replacement="ǚ$1" />
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="1"
                    preserveOriginal="1" splitOnCaseChange="1" />
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                    words="stopwords_de.txt" enablePositionIncrements="true" />
            <filter class="solr.LowerCaseFilterFactory" />
            <filter class="solr.GermanMinimalStemFilterFactory" />
            <!-- get slashes back in -->
            <filter class="solr.PatternReplaceFilterFactory" pattern="ḧ" replacement="/" />
            <!-- get paragraph symbols back in -->
            <filter class="solr.PatternReplaceFilterFactory" pattern="ǚ" replacement="§" />
        </analyzer>
    </fieldType>


Log output for the unquoted phrase:

INFO: [] webapp=/solr path=/select
params={facet=true&sort=score+desc&fl=sitzung,gremium,betreff,datum,timestamp,score,aktenzeichen,typ,id,anhang&debugQuery=true&start=0&q=Am+Heidstamm&hl.fl=betreff&wt=json&fq=&hl=true&rows=10}
hits=0 status=0 QTime=29

... and for the quoted one:

INFO: [] webapp=/solr path=/select
params={facet=true&sort=score+desc&fl=sitzung,gremium,betreff,datum,timestamp,score,aktenzeichen,typ,id,anhang&start=0&q="Am+Heidstamm"&hl.fl=betreff&wt=standard&fq=&hl=true&rows=10&version=2.2}
hits=14 status=0 QTime=244


Thanks!
