Thanks for your advice. I've indexed more content and it's working
better now. The entire index is no longer returned every time.

However, I found that longer documents tend to get a higher score than
shorter documents, even though the shorter documents are supposed to be
a better match (more similar) to the query than the longer documents.
Is it because words like "and", "the", etc. increase the score of the
longer documents?

Is there any way to configure this so that the shorter documents get a
higher score when they are a better match, or will simply indexing more
content solve this problem?
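
To check my understanding of the term selection described in the reply
quoted below, here is a rough Python sketch. It is not Solr's actual
code: the function name interesting_terms, the whitespace tokenization
and the idf formula (1 + ln(numDocs / (docFreq + 1)), which I am assuming
is close to Lucene's classic TF/IDF) are my own simplifications.

```python
import math
from collections import Counter


def interesting_terms(source_doc, corpus, max_terms=25, min_tf=1, min_df=1):
    """Rank the terms of source_doc by tf * idf and return the top max_terms.

    Hypothetical helper for illustration only; min_tf and min_df loosely
    mirror the roles of mlt.mintf and mlt.mindf.
    """
    n_docs = len(corpus)

    # Document frequency: number of corpus documents containing each term.
    df = Counter()
    for doc in corpus:
        df.update(set(doc.lower().split()))

    # Term frequency inside the source document.
    tf = Counter(source_doc.lower().split())

    scored = []
    for term, freq in tf.items():
        if freq < min_tf or df[term] < min_df:
            continue  # too rare to be considered at all
        # Assumed idf formula; common terms get a low but non-zero weight.
        idf = 1.0 + math.log(n_docs / (df[term] + 1))
        scored.append((freq * idf, term))

    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_terms]]


corpus = [
    "the cat and the dog",
    "the solr index and the query",
    "the solr query parser and the solr handler",
]
print(interesting_terms(corpus[2], corpus, max_terms=3))
```

With only three tiny documents, "the" still lands in the top three terms
because its frequency in the source document outweighs its low idf, which
seems to match the behaviour described for a small index.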


Regards,
Edwin



On 14 July 2015 at 15:40, Upayavira <u...@odoko.co.uk> wrote:

> Look at your "interesting terms". If your index is too small, it will
> consider words like "and", "the", etc to be "interesting" and form a
> part of the query, thus returning your entire index, which doesn't help.
>
> Effectively what MLT does is attempt to pick the 25 (configurable) best
> terms in the source document and form a Lucene query based upon them.
> It takes the frequency of the terms in your index and in the document
> into account when scoring the terms (much like TF/IDF). For this to
> really work, you need a reasonable amount of content.
>
> Upayavira
>
> On Tue, Jul 14, 2015, at 07:40 AM, Zheng Lin Edwin Yeo wrote:
> > Hi,
> >
> > I'm using Solr 5.2.1 and I'm trying to implement the MoreLikeThis
> > feature in Solr.
> >
> > But the results that I've been getting for MoreLikeThis have not been
> > accurate so far. All the documents in the collection are returned in
> > the "response" section even though the documents have no similar
> > match to my query.
> >
> > For example, if I have 10 records in the collection, 1 will be under the
> > "match" section, while the other 9 will be under the "response" section,
> > even though only 1 or 2 of them are related to the one under the
> > "match" section.
> >
> > Below is my configuration in solrconfig.xml:
> >
> > <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
> >   <lst name="defaults">
> >     <str name="echoParams">explicit</str>
> >     <str name="wt">json</str>
> >     <str name="indent">true</str>
> >     <str name="defType">edismax</str>
> >     <str name="fl">id, score</str>
> >     <str name="mlt.qf">
> >       Objective^20.0 Summary^10.0
> >     </str>
> >
> >     <str name="df">Summary</str>
> >     <str name="mlt.fl">Objective,Summary</str>
> >     <str name="mlt.mintf">2</str>
> >     <str name="mlt.mindf">5</str>
> >     <str name="mlt.maxqt">10</str>
> >     <str name="mlt.count">10</str>
> >     <str name="mlt.boost">true</str>
> >     <str name="mlt.interestingTerms">details</str>
> >   </lst>
> > </requestHandler>
> >
> >
> > Regards,
> > Edwin
>
