Re: Scoring partial match in title field higher than exact match in description field

Walter Underwood Tue, 31 Mar 2020 09:10:44 -0700

1. Do not use stopwords. Ever. Especially with names.
2. Do not use stemming with names. “Bill Gates” is not the same as “Bill Gate”. 
You aren’t, but I thought I’d include that.
3. Mixing partial word matches and full word matches in the same search is 
likely to give odd results. 
4. Your requirements don’t say anything about phrase matches, which are very 
important.
5. What synonyms are you using? Are they tuned for names, like “william, bill”? 
If not, remove them.
6. Why are you using WordDelimiterGraphFilterFactory? Do you want to split 
“DeForest” into “De Forest”?
7. What are you using payloads for?


You can use weights in the edismax parser to give different fields different 
importance in matching. I would also put weights on phrase matches. I usually 
double the weights for phrases because the native weighting for phrases in 
Solr isn’t enough.

Something like this:

<str name="qf">name^4 name_ngram^2 infotext</str>
<str name="pf">name^8 name_ngram^4 infotext^2</str>
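
As a rough sketch of where those parameters could live, assuming they go in the defaults of a SearchHandler in solrconfig.xml (the handler name /select is an assumption; the field names are the ones used above):

```xml
<!-- Sketch only: the handler name is assumed, adjust to your own config. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the extended dismax parser so qf and pf are honored. -->
    <str name="defType">edismax</str>
    <!-- Per-field weights for individual term matches. -->
    <str name="qf">name^4 name_ngram^2 infotext</str>
    <!-- Phrase-match weights, doubled relative to qf. -->
    <str name="pf">name^8 name_ngram^4 infotext^2</str>
  </lst>
</requestHandler>
```

The same values can also be passed as request parameters (defType, qf, pf) instead of handler defaults.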

Get rid of:

* StopFilterFactory
* SynonymFilterFactory
* WordDelimiterGraphFilterFactory

With the remaining filters, you’ll never have duplicates, so you can also get 
rid of RemoveDuplicatesTokenFilterFactory if you want.
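
As a sketch only, here is what the text_general_partial field type might look like with the three filters listed above removed and the duplicate-removal filter dropped as well. Everything else is kept from the original definition, including the payload filter, which is still an open question:

```xml
<!-- Sketch: original text_general_partial minus the stopword, synonym,
     word delimiter, and remove-duplicates filters. -->
<fieldType name="text_general_partial" class="solr.TextField"
           positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Kept from the original; drop it if payloads are not actually used. -->
    <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="$"
            encoder="float"/>
    <!-- Index-time prefixes of 3 to 15 characters for partial matching. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3"
            maxGramSize="15"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="$"
            encoder="float"/>
    <!-- No n-gram filter at query time, as in the original definition. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A change to index-time analysis like this only takes effect for documents indexed after the change, so a full reindex would be needed.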

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 31, 2020, at 6:00 AM, nileshwagh24 <waghniles...@gmail.com> wrote:
> 
> I need to score a partial match in the NAME field higher than an exact match
> in the INFOTEXT field. Actually, I need to sort my Solr results based on the
> following five conditions:
> 
> 1. First, results with a whole word match on the first or second word in the
> NAME go on top.
> 
> 2. Then, results with a whole word match elsewhere in the NAME.
> 
> 3. Then, results with a partial word match anywhere in the NAME.
> 
> 4. Then, results with a whole word match in the INFOTEXT.
> 
> 5. Finally, results with a partial word match in the INFOTEXT.
> 
> I have added descriptionExact field for INFOTEXT and titleExact field for
> NAME.
> 
> For this I have added four fields like
> titleExact,titlePartial,descriptionExact,DescriptionPartial and boosted
> titleExact score higher than all others.
> 
> But I am not getting the results I expect.
> 
> Following are the field type definitions for the exact and partial fields:
> 
> 
> <fieldType name="text_general" class="solr.TextField"
> positionIncrementGap="100" multiValued="true">
>    <analyzer type="index">
>      <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt"/>
>      <tokenizer class="solr.KeywordTokenizerFactory"/>
>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>      <filter class="solr.LowerCaseFilterFactory"/>
>      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>    <analyzer type="query">
>      <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt"/>
>      <tokenizer class="solr.KeywordTokenizerFactory"/>
>      <filter class="solr.SynonymFilterFactory" expand="true"
> ignoreCase="true" synonyms="synonyms.txt"/>
>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>      <filter class="solr.LowerCaseFilterFactory"/>
>      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>  </fieldType>
>  <fieldType name="text_general_partial" class="solr.TextField"
> positionIncrementGap="100" multiValued="true">
>    <analyzer type="index">
>      <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt"/>
>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="$"
> encoder="float"/>
> <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true"
> synonyms="synonyms.txt"/>
>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>      <filter class="solr.WordDelimiterGraphFilterFactory"
> catenateNumbers="1" generateNumberParts="1" protected="protwords.txt"
> splitOnCaseChange="1" generateWordParts="0" preserveOriginal="1"
> catenateAll="0" catenateWords="1"/>
> <filter class="solr.EdgeNGramFilterFactory" maxGramSize="15"
> minGramSize="3"/>
>      <filter class="solr.LowerCaseFilterFactory"/>
>      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>    <analyzer type="query">
>      <charFilter class="solr.MappingCharFilterFactory"
> mapping="mapping-ISOLatin1Accent.txt"/>
>      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.DelimitedPayloadTokenFilterFactory" delimiter="$"
> encoder="float"/>
>      <filter class="solr.SynonymFilterFactory" expand="true"
> ignoreCase="true" synonyms="synonyms.txt"/>
>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>      <filter class="solr.WordDelimiterGraphFilterFactory"
> catenateNumbers="1" generateNumberParts="1" protected="protwords.txt"
> splitOnCaseChange="1" generateWordParts="0" preserveOriginal="1"
> catenateAll="0" catenateWords="1"/>
>      <filter class="solr.LowerCaseFilterFactory"/>
>      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>  </fieldType>
> 
> Any help will be appreciated
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
