This came up a lot when I was matching movie and song names. If
that is your project, also try boosting matches on the first word of
the title and matches in shorter titles. Also try bigrams of
stopwords: "Call of the Wild" becomes "call", "of-the", "wild".

The bigram trick also helps when you are trying to find official
documents and people have block-copied large chunks of boilerplate
into them.
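
To make the stopword bigram idea concrete, here is a rough sketch in
Python rather than as a Solr filter. The stopword list, the function
name, and the choice to leave a lone stopword as its own token are all
assumptions for illustration. It is not how any particular Solr filter
behaves.

```python
# Hypothetical stopword list; a real setup would load stopwords_en.txt or similar.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def stopword_bigrams(title):
    """Lowercase a title and join each pair of adjacent stopwords into one token."""
    words = title.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        # Two stopwords in a row become a single hyphenated token.
        if i + 1 < len(words) and words[i] in STOPWORDS and words[i + 1] in STOPWORDS:
            tokens.append(words[i] + "-" + words[i + 1])
            i += 2
        else:
            # Assumption: content words and lone stopwords pass through unchanged.
            tokens.append(words[i])
            i += 1
    return tokens


print(stopword_bigrams("Call of the Wild"))  # ['call', 'of-the', 'wild']
```

The joined token is much rarer than either stopword alone, so a match
on it says more about the title than matching "of" and "the"
separately.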

On Fri, Apr 13, 2012 at 2:04 AM, Kissue Kissue <kissue...@gmail.com> wrote:
> Thanks a lot. I had already implemented Walter's solution and was wondering
> if this was the right way to deal with it. This has now given me the
> confidence to go with the solution.
>
> Many thanks.
>
> On Fri, Apr 13, 2012 at 1:04 AM, Erick Erickson 
> <erickerick...@gmail.com> wrote:
>
>> GAH! I had my head in "make this happen in one field" when I wrote my
>> response, without being explicit. Of course Walter's solution is pretty
>> much the standard way to deal with this.
>>
>> Best
>> Erick
>>
>> On Thu, Apr 12, 2012 at 5:38 PM, Walter Underwood <wun...@wunderwood.org>
>> wrote:
>> > It is easy. Create two fields, text_exact and text_stem. Don't use the
>> stemmer in the first chain, do use the stemmer in the second. Give the
>> text_exact a bigger weight than text_stem.
>> >
>> > wunder
>> >
>> > On Apr 12, 2012, at 4:34 PM, Erick Erickson wrote:
>> >
>> >> No, I don't think there's an OOB way to make this happen. It's
>> >> a recurring theme, "make exact matches score higher than
>> >> stemmed matches".
>> >>
>> >> Best
>> >> Erick
>> >>
>> >> On Thu, Apr 12, 2012 at 5:18 AM, Kissue Kissue <kissue...@gmail.com>
>> wrote:
>> >>> Hi,
>> >>>
>> >>> I have a field in my index called itemDesc which i am applying
>> >>> EnglishMinimalStemFilterFactory to. So if i index a value to this field
>> >>> containing "Edges", the EnglishMinimalStemFilterFactory applies
>> stemming
>> >>> and "Edges" becomes "Edge". Now when i search for "Edges", documents
>> with
>> >>> "Edge" score better than documents with the actual search word -
>> "Edges".
>> >>> Is there a way i can make documents with the actual search word in this
>> >>> case "Edges" score better than document with "Edge"?
>> >>>
>> >>> I am using Solr 3.5. My field definition is shown below:
>> >>>
>> >>> <fieldType name="text_en" class="solr.TextField"
>> positionIncrementGap="100">
>> >>>      <analyzer type="index">
>> >>>        <tokenizer class="solr.StandardTokenizerFactory"/>
>> >>>               <filter class="solr.SynonymFilterFactory"
>> >>> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>> >>>             <filter class="solr.StopFilterFactory"
>> >>>                ignoreCase="true"
>> >>>                words="stopwords_en.txt"
>> >>>                enablePositionIncrements="true"
>> >>>                />
>> >>>             <filter class="solr.LowerCaseFilterFactory"/>
>> >>>    <filter class="solr.EnglishPossessiveFilterFactory"/>
>> >>>        <filter class="solr.EnglishMinimalStemFilterFactory"/>
>> >>>      </analyzer>
>> >>>      <analyzer type="query">
>> >>>        <tokenizer class="solr.StandardTokenizerFactory"/>
>> >>>        <filter class="solr.SynonymFilterFactory"
>> synonyms="synonyms.txt"
>> >>> ignoreCase="true" expand="true"/>
>> >>>        <filter class="solr.StopFilterFactory"
>> >>>                ignoreCase="true"
>> >>>                words="stopwords_en.txt"
>> >>>                enablePositionIncrements="true"
>> >>>                />
>> >>>        <filter class="solr.LowerCaseFilterFactory"/>
>> >>>    <filter class="solr.EnglishPossessiveFilterFactory"/>
>> >>>        <filter class="solr.KeywordMarkerFilterFactory"
>> >>> protected="protwords.txt"/>
>> >>>        <filter class="solr.EnglishMinimalStemFilterFactory"/>
>> >>>      </analyzer>
>> >>>    </fieldType>
>> >>>
>> >>> Thanks.
>> >
>> >
>> >
>> >
>> >
>>



-- 
Lance Norskog
goks...@gmail.com
