Re: Need more info on MLT (More Like This) feature

Dave Fri, 13 Sep 2019 13:09:22 -0700

As a side note, if you use shingles with the MLT handler, I believe you will get 
better scores and more relevant results. So “to be free” is indexed as “to_be”, 
“to_be_free” and “be_free”, but also as each individual word. It makes the index 
significantly larger but creates better “unique terms” in my opinion, and it 
improved the results for me at least.
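
To make the shingle idea concrete, here is a rough sketch in plain Python of which terms end up in the index for a phrase. It is only an illustration, not Solr's own shingle filter: the function name `shingles` and its `max_size` and `sep` parameters are made up for this example, and whitespace splitting stands in for a real tokenizer.

```python
def shingles(text, max_size=3, sep="_"):
    """Return each word plus every run of 2..max_size adjacent words joined by sep.

    Hypothetical helper for illustration only; a real index would produce
    these terms through the analyzer chain configured on the field.
    """
    words = text.split()  # naive whitespace tokenization (assumption)
    terms = []
    for start in range(len(words)):
        for size in range(1, max_size + 1):
            if start + size <= len(words):
                terms.append(sep.join(words[start:start + size]))
    return terms


print(shingles("to be free"))
# ['to', 'to_be', 'to_be_free', 'be', 'be_free', 'free']
```

Three words turn into six indexed terms here, which is why the index grows noticeably, but multi-word terms like "to_be_free" are much rarer than the single words and so make more distinctive matches.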


> On Sep 13, 2019, at 2:51 PM, Srisatya Pyla <srisp...@in.ibm.com> wrote:
> 
> Thank you very much for the quick response. This is very helpful to us.
> While analyzing the results for some jobs, we see a high score returned for a 
> document that is not very relevant to the base document. 
> Is there any way we can improve the results and scoring?  
> How exactly is the score for a matching document computed from the matching 
> field? Knowing this would help us understand why specific documents receive 
> the highest matching scores.
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
> IBM Talent Management Solutions
> Mobile :+91 9849315546
> E-mail :srisp...@in.ibm.com   
> 
> 
> IBM Visakha Hills
> Visakhapatnam, AP 530045
> India
> 
> From:        Chee Yee Lim <cheeyee....@gmail.com>
> To:        Srisatya Pyla <srisp...@in.ibm.com>
> Cc:        solr-user@lucene.apache.org, Rajeev Kasarabada1 
> <kasar...@in.ibm.com>, Archana Gavini1 <agavi...@in.ibm.com>
> Date:        13/09/2019 04:32 PM
> Subject:        [EXTERNAL] Re: Need more info on MLT (More Like This) feature
> 
> 
> 
> To use knnSearch, you need to submit a POST request to the Stream request 
> handler.
> 
> Using your example query, you will need to rewrite it from this:
> 
> http://[SOLRURL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258
> 
> to this (using curl as an example of sending the POST request):
> 
> curl --data-urlencode 'expr=knnSearch([collection_name],
> id="1414462-25600-5258",
> qf="jobdescription",
> k=100,
> fl="jobtitle,jobdescription,score",
> sort="score desc",
> fq="siteid:5258",
> mintf=1, 
> mindf=1)' http://[SOLRURL]/stream
> 
> Note that this assumes your document ID field is sjkey.
> 
> More detailed documentation on how the Stream handler works can be found here:
> https://lucene.apache.org/solr/guide/8_1/streaming-expressions.html
> 
> Best wishes,
> Chee Yee
> 
> On Fri, 13 Sep 2019 at 17:57, Srisatya Pyla <srisp...@in.ibm.com> wrote:
> Hi Chee Yee Lim,
> 
> 
> Thank you for your quick response.
> We could not find much documentation on how to use knnSearch.
> Could you please guide us with more information on how it can be used?
> 
> Can we use it the way we use Solr now, by querying a Solr URL like
> http://[SOLR URL]/mlt.... ? Or is there another way?
> Please also share any more detailed documentation you may have.
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
>  
> ----- Original message -----
> From: Chee Yee Lim <cheeyee....@gmail.com>
> To: solr-user@lucene.apache.org
> Cc: Archana Gavini1 <agavi...@in.ibm.com>, Rajeev Kasarabada1 
> <kasar...@in.ibm.com>
> Subject: [EXTERNAL] Re: Need more info on MLT (More Like This) feature
> Date: Thu, Sep 12, 2019 6:43 PM
>  
> I've been working with the MLT handler (Solr 8.1.1) by calling it the same way 
> you did, http://[SOLRURL]/mlt. But the response is very unreliable: 90% 
> of identical queries result in a Java null pointer exception, and only 10% 
> return the expected response. I do not know the cause of this.
>  
> I overcame this problem by using knnSearch via Stream handler 
> (https://lucene.apache.org/solr/guide/8_1/stream-source-reference.html#knnsearch).
>  It is just a wrapper on MLT, and it works brilliantly. It is worth checking 
> it out if you are running Solr in cloud mode.
>  
> If you pass fl="score" and sort="score desc" to knnSearch, you will be able 
> to get the results sorted by matching score.
>  
> Best wishes,
> Chee Yee
>   
> On Thu, 12 Sep 2019 at 19:44, Srisatya Pyla <srisp...@in.ibm.com> wrote:
> Hi Solr Search Team,
> 
> I am a developer from IBM Kenexa Brassring. We are using the Solr search engine 
> to search jobs in our applications.
> We are planning to use the MLT feature to get similar documents 
> (jobs) based on one document (job).
> 
> While exploring this option, we use the JobDescription field of the job as the 
> matching field, and we get some unexpected, unrelated documents in the 
> MLT results.
> 
> The query looks like this:
> 
> http://[SOLRURL]/mlt?q=sjkey:1414462-25600-5258&wt=json&indent=true&mlt=true&rows=100&mlt.fl=jobdescription&mlt.mindf=1&mlt.mintf=1&fl=jobtitle,jobdescription&fq=siteid:5258
> 
> 
> We have a few questions:
> 1) Is there any way to get the matching score for each document 
> in the MLT results, so that we can sort on the score and have 
> the best-matching document at the top of the results?
> 
> 2) Are there any best practices for using the MLT handler?
> 
> 
> Regards,
> SST  Narasimha Rao Pyla
>
