Re: Question on Exact Matches - edismax

Sandeep Mestry Thu, 04 Apr 2013 02:52:36 -0700

Hi Jan,

Thanks for your reply. I have defined string_ci as follows:


<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true"
           omitNorms="true" compressThreshold="10">
    <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

When I analysed the query in Solr, I saw that a document containing
pg_series_title_ci:"funny" matches a search for
pg_series_title_ci:"funny games" and is ranked higher than the document
containing the exact match. I could use the default string type instead, but
then matching would be case-sensitive.
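
For context, here is a rough sketch of how a field of this type might be wired up in schema.xml. The field names come from the handler config quoted below; the text_general type, the indexed/stored values and the copyField source are assumptions for illustration, not the actual schema:

```xml
<!-- Hypothetical schema.xml fragment; attribute values and the copyField
     source are assumptions, only the field and type names are from this thread. -->

<!-- Tokenized field, used for partial (per-word) matches. -->
<field name="pg_series_title" type="text_general" indexed="true" stored="true"/>

<!-- Case-insensitive, untokenized copy, intended for exact matches. -->
<field name="pg_series_title_ci" type="string_ci" indexed="true" stored="false"/>

<!-- Populate the exact-match field from the tokenized one at index time. -->
<copyField source="pg_series_title" dest="pg_series_title_ci"/>
```

With KeywordTokenizerFactory followed by LowerCaseFilterFactory, a series title of "Funny Games" is indexed in pg_series_title_ci as the single term "funny games", and a title of "Funny" as the single term "funny".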

Thanks,
Sandeep


On 3 April 2013 22:20, Jan Høydahl <jan....@cominvent.com> wrote:

> Can you show us your *_ci field type? Solr does not really have a way to
> tell whether a match is "exact" or only partial, but you could hack around
> it with the fieldType. See https://github.com/cominvent/exactmatch for a
> possible solution.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 3 Apr 2013, at 15:55, Sandeep Mestry <sanmes...@gmail.com> wrote:
>
> > Hi All,
> >
> > I have a requirement wherein exact matches for 2 fields (Series Title,
> > Title) should be ranked higher than partial matches. The configuration
> > looks like below:
> >
> > <requestHandler name="assetdismax" class="solr.SearchHandler" >
> >        <lst name="defaults">
> >            <str name="defType">edismax</str>
> >            <str name="echoParams">explicit</str>
> >            <float name="tie">0.01</float>
> >            <str name="qf">pg_series_title_ci^500 title_ci^300
> > pg_series_title^200 title^25 classifications^15 classifications_texts^15
> > parent_classifications^10 synonym_classifications^5 pg_brand_title^5
> > pg_series_working_title^5 p_programme_title^5 p_item_title^5
> > p_interstitial_title^5 description^15 pg_series_description annotations^0.1
> > classification_notes^0.05 pv_program_version_number^2
> > pv_program_version_number_ci^2 pv_program_number^2 pv_program_number_ci^2
> > p_program_number^2 ma_version_number^2 ma_recording_location
> > ma_contributions^0.001 rel_pg_series_title rel_programme_title
> > rel_programme_number rel_programme_number_ci pg_uuid^0.5 p_uuid^0.5
> > pv_uuid^0.5 ma_uuid^0.5</str>
> >            <str name="pf">pg_series_title_ci^500 title_ci^500</str>
> >            <int name="ps">0</int>
> >            <str name="q.alt">*:*</str>
> >            <str name="mm">100%</str>
> >            <str name="q.op">AND</str>
> >            <str name="facet">true</str>
> >            <str name="facet.limit">-1</str>
> >            <str name="facet.mincount">1</str>
> >        </lst>
> >    </requestHandler>
> >
> > As you can see above, the search is against many fields. What I want is
> > for documents that have exact matches in the series title and title
> > fields to rank higher than the rest.
> >
> > I have added 2 case-insensitive fields (pg_series_title_ci, title_ci) for
> > series title and title, and have boosted them above the tokenized fields
> > and the rest. I have also implemented a similarity class to override idf;
> > however, I still get documents with partial matches in title and other
> > fields ranking higher than an exact match in pg_series_title_ci.
> >
> > Many Thanks,
> > Sandeep
>
>
