Hi all,

Maybe it is better to move the discussion into a JIRA ticket.
I created SOLR-8884 for this.

aHmet

On Tuesday, March 22, 2016 1:59 PM, Alessandro Benedetti 
<abenede...@apache.org> wrote:



I ran into this problem with re-ranking.
But in my short experience I was not able to reproduce or fix the bug.
Can I ask which query parser you used and which components are involved in the
query?

Cheers

On Mon, Mar 21, 2016 at 8:40 PM, Rick Sullivan <r...@ricksullivan.net>
wrote:

> I haven't checked this thread since Friday, but here are my responses to
> the questions that have come up.
>
> 1. How is ranking affected?
>
> Some documents have their scores divided by an integer value in the
> response documents.
>
> 2. Do you see the proper ranking in the explain section?
>
> Yes, the explain section always seems to have consistent values and proper
> rankings.
>
> 3. What about the results?
>
> No, these are ranked according to the sometimes incorrect score.
>
> 4. What version of Solr are you using?
>
> I've produced the problem on SolrCloud 5.5.0 (2 shards on 2 nodes on the
> same machine), Solr 5.5.0 (no sharding), and Solr 5.4.1 (no sharding).
> I've also had trouble reproducing the problem on test data.
>
> Thanks,
> -Rick
>
> ----------------------------------------
> > Date: Mon, 21 Mar 2016 14:14:44 +0000
> > From: iori...@yahoo.com.INVALID
> > To: solr-user@lucene.apache.org
> > Subject: Re: Explain score is different from score
> >
> >
> >
> > Hi Alessandro,
> >
> > The OP has a different ranking: sorting by fl=score and by explain's score
> > would retrieve the documents in different orders.
> > I wrote test cases using ClassicSimilarity, but they do not reproduce the problem.
> > This is really weird. I wonder what is triggering this.
> >
> > aHmet
> >
> >
> > On Monday, March 21, 2016 2:08 PM, Alessandro Benedetti <abenede...@apache.org> wrote:
> >
> >
> >
> > I would like to add a question: how is the ranking affected?
> > Do you see the proper ranking in the explain section?
> > And what about the results? Are they ranked according to the correct score,
> > or are they ranked by the wrong score?
> > I got a similar issue, which I am not able to reproduce yet, but it was
> > really weird (in my case the ranking was also messed up).
> >
> > Cheers
> >
> >
> > On Mon, Mar 21, 2016 at 7:30 AM, G, Rajesh <r...@cebglobal.com> wrote:
> >
> >> Hi Ahmet,
> >>
> >> I am using Solr 5.5.0. I am running a single instance with a single core,
> >> with no shards.
> >>
> >> I have added <similarity class="solr.BM25SimilarityFactory"/> to my schema
> >> as suggested by Rick Sullivan. Now the scores are the same between explain
> >> and the score field.
> >>
> >> But instead of the previous results "Lync - Microsoft Office 365" and
> >> "Microsoft Office 365" I am getting
> >>
> >> {
> >> "title":"Office 365",
> >> "score":7.471676
> >> },
> >> {
> >> "title":"Office 365",
> >> "score":7.471676
> >> },
> >>
> >> If I try NGram title:(Microsoft Ofice 365)
> >>
> >> The scores are the same for the top 10 results even though the titles differ
> >> by at least 3 characters. I have attached my schema.xml in case it helps.
> >>
> >> <doc>
> >> <str name="title">Lync - Microsoft Office 365</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 1.0</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 14.0</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 14.3</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 14.4</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 14.5(Mac)</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 15.0</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 16.0</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 4.0</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Office 365 E4</str>
> >> <float name="score">52.056263</float></doc>
> >> <doc>
> >> <str name="title">Microsoft Mail Protection Reports for Office 365 15.0</str>
> >> <float name="score">50.215454</float></doc>
> >>
> >> Thanks
> >> Rajesh
> >>
> >>
> >>
> >> Corporate Executive Board India Private Limited. Registration No:
> >> U741040HR2004PTC035324. Registered office: 6th Floor, Tower B, DLF
> Building
> >> No.10 DLF Cyber City, Gurgaon, Haryana-122002, India.
> >>
> >> This e-mail and/or its attachments are intended only for the use of the
> >> addressee(s) and may contain confidential and legally privileged
> >> information belonging to CEB and/or its subsidiaries, including CEB
> >> subsidiaries that offer SHL Talent Measurement products and services. If
> >> you have received this e-mail in error, please notify the sender and
> >> immediately, destroy all copies of this email and its attachments. The
> >> publication, copying, in whole or in part, or use or dissemination in
> any
> >> other way of this e-mail and attachments by anyone other than the
> intended
> >> person(s) is prohibited.
> >>
> >> -----Original Message-----
> >> From: Ahmet Arslan [mailto:iori...@yahoo.com]
> >> Sent: Sunday, March 20, 2016 2:10 AM
> >> To: solr-user@lucene.apache.org; G, Rajesh <r...@cebglobal.com>;
> >> r...@ricksullivan.net
> >> Subject: Re: Explain score is different from score
> >>
> >> Hi Rick and Rajesh,
> >>
> >> I wasn't able to reproduce this with either Lucene or Solr.
> >> What version of solr is this?
> >> Are you using a sharded request?
> >>
> >> @BeforeClass
> >> public static void beforeClass() throws Exception {
> >>   initCore("solrconfig.xml", "schema.xml");
> >>
> >>   assertU(adoc("id", "1722669", "title", "Lync - Microsoft Office 365"));
> >>   assertU(adoc("id", "2043876", "title", "Microsoft Office 365"));
> >>
> >>   assertU(commit());
> >> }
> >>
> >> /**
> >>  * Checks whether fl=score equals to Explain's score
> >>  */
> >> @Test
> >> public void testExplain() throws Exception {
> >>   SolrQueryRequest req = req(CommonParams.DEBUG_QUERY, "true", "indent", "true",
> >>       "q", "title:(Microsoft Ofice 365)", CommonParams.FL, "id,title,score");
> >>   String response = h.query(req);
> >>   System.out.println(response);
> >> }
> >>
> >> @Test
> >> public void testExplain() throws Exception {
> >>
> >>   Analyzer analyzer = new WhitespaceAnalyzer();
> >>   Directory directory = new RAMDirectory();
> >>
> >>   IndexWriterConfig config = new IndexWriterConfig(analyzer);
> >>   config.setSimilarity(new ClassicSimilarity());
> >>   IndexWriter iwriter = new IndexWriter(directory, config);
> >>
> >>   Document doc = new Document();
> >>   doc.add(new Field("id", "1722669", TextField.TYPE_STORED));
> >>   doc.add(new Field("title", "Lync - Microsoft Office 365", TextField.TYPE_STORED));
> >>   iwriter.addDocument(doc);
> >>
> >>   doc = new Document();
> >>   doc.add(new Field("id", "2043876", TextField.TYPE_STORED));
> >>   doc.add(new Field("title", "Microsoft Office 365", TextField.TYPE_STORED));
> >>   iwriter.addDocument(doc);
> >>
> >>   iwriter.close();
> >>
> >>   // Now search the index:
> >>   DirectoryReader reader = DirectoryReader.open(directory);
> >>   IndexSearcher searcher = new IndexSearcher(reader);
> >>   searcher.setSimilarity(new ClassicSimilarity());
> >>
> >>   QueryParser parser = new QueryParser("title", analyzer);
> >>   Query query = parser.parse("Microsoft Ofice 365");
> >>   ScoreDoc[] hits = searcher.search(query, 10).scoreDocs;
> >>
> >>   Assert.assertEquals(2, hits.length);
> >>
> >>   // Iterate through the results:
> >>   for (int i = 0; i < hits.length; i++) {
> >>     Document hitDoc = searcher.doc(hits[i].doc);
> >>     Explanation explanation = searcher.explain(query, hits[i].doc);
> >>
> >>     Assert.assertEquals("score from explain should equal to ScoreDoc.score!",
> >>         hits[i].score, explanation.getValue(), 0.0);
> >>   }
> >>
> >>   reader.close();
> >>   directory.close();
> >> }
> >>
> >>
> >>
> >>
> >>
> >> On Saturday, March 19, 2016 7:54 AM, "G, Rajesh" <r...@cebglobal.com> wrote:
> >> I don't use boosts at index time or at query time.
> >>
> >>
> >> -----Original Message-----
> >> From: Rick Sullivan [mailto:r...@ricksullivan.net]
> >> Sent: Friday, March 18, 2016 10:18 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: RE: Explain score is different from score
> >>
> >> I'm not. I only have query boosts.
> >>
> >> ----------------------------------------
> >>> Date: Fri, 18 Mar 2016 16:42:36 +0000
> >>> From: iori...@yahoo.com.INVALID
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Explain score is different from score
> >>>
> >>> Hi Rick,
> >>>
> >>> This could be a bug I think. Do you guys use index time boosts?
> >>>
> >>> Ahmet
> >>>
> >>>
> >>>
> >>> On Friday, March 18, 2016 6:15 PM, Rick Sullivan <r...@ricksullivan.net> wrote:
> >>> Yes it seems to be something similar, but the normalization isn't
> >>> applied to all retrieved documents, which messes with the document rankings.
> >>>
> >>> Some documents have the exact values from the 'explain' response, while
> >>> others are normalized.
> >>>
> >>> -Rick
> >>>
> >>>
> >>> ----------------------------------------
> >>>> Date: Fri, 18 Mar 2016 16:06:19 +0000
> >>>> From: iori...@yahoo.com.INVALID
> >>>> To: solr-user@lucene.apache.org
> >>>> Subject: Re: Explain score is different from score
> >>>>
> >>>> Hi Rajesh,
> >>>>
> >>>> I suspect it is due to the queryNorm(q). But it is weird that relative
> >>>> order is different in your example.
> >>>>
> >>>>
> >>>> "queryNorm(q) is a normalizing factor used to make scores between
> >>>> queries comparable. This factor does not affect document ranking
> >>>> (since all ranked documents are multiplied by the same factor), but
> >>>> rather just attempts to make scores from different queries (or even
> >>>> different indexes) comparable." [1]
> >>>>
> >>>> [1]
> >>>> https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> >>>>
> >>>> Ahmet
> >>>>
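The queryNorm point above can be illustrated with a small sketch in plain Java (no Lucene involved; the class and method names are made up for illustration). Multiplying every score by the same factor leaves the order unchanged, while dividing only one document's score, as in the response reported in this thread, flips it. The two scores are the explain values from the original post, and the factor 1/3 is only an example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Solr or Lucene. */
final class RankingUnderScaling {
    private RankingUnderScaling() {}

    /** Returns document ids ordered by descending score. */
    static List<String> rank(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return ids;
    }

    /** Multiplies every score by the same factor, as a query-wide norm would. */
    static Map<String, Double> scaleAll(Map<String, Double> scores, double factor) {
        Map<String, Double> out = new LinkedHashMap<>();
        scores.forEach((id, s) -> out.put(id, s * factor));
        return out;
    }

    public static void main(String[] args) {
        // Explain scores reported in the original post.
        Map<String, Double> explain = new LinkedHashMap<>();
        explain.put("1722669", 1.0952835); // Lync - Microsoft Office 365
        explain.put("2043876", 1.2517526); // Microsoft Office 365

        System.out.println("explain order:     " + rank(explain));
        System.out.println("uniformly scaled:  " + rank(scaleAll(explain, 1.0 / 3)));

        // Only one document's score is divided, as seen in the response.
        Map<String, Double> response = new LinkedHashMap<>(explain);
        response.put("2043876", 1.2517526 / 3);
        System.out.println("one score divided: " + rank(response));
    }
}
```

The first two orders are identical, and only the third differs. That is consistent with the suspicion in this thread that whatever divides the score is applied per document rather than per query.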
> >>>>
> >>>> On Friday, March 18, 2016 4:24 PM, Rick Sullivan <r...@ricksullivan.net> wrote:
> >>>> Hi Rajesh,
> >>>>
> >>>> I've been seeing the same problem you have. My debug scores seem to be
> >>>> what I expect, but the actual scores applied by Solr are sometimes divided
> >>>> by an integer.
> >>>>
> >>>> I raised the same question in this email distribution about a week ago,
> >>>> but haven't yet found a solution. There's also a StackOverflow question I
> >>>> created here:
> >>>> http://stackoverflow.com/questions/35921106/how-and-why-do-solr-explain-values-differ-from-the-solr-score
> >>>>
> >>>> Can you verify whether all of your affected scores are (1/N)*score? I
> >>>> think N seems to be the number of OR elements in the query. For
> >>>> example, your case below has
> >>>>
> >>>> debug_score/score
> >>>> = 1.2517526/0.41725087
> >>>> = 3
> >>>>
> >>>> Thanks,
> >>>> -Rick
> >>>>
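The (1/N) check Rick asks for can be done mechanically on any pair of explain and response scores. Below is a minimal sketch in plain Java; the helper name, its signature, and the relative tolerance are assumptions chosen for illustration, not anything provided by Solr or Lucene.

```java
import java.util.OptionalInt;

/** Hypothetical helper; not part of Solr or Lucene. */
final class ScoreRatioCheck {
    private ScoreRatioCheck() {}

    /**
     * Returns N when explainScore / responseScore lies within a relative
     * tolerance of an integer N >= 2, and empty otherwise (including when
     * the two scores simply match).
     */
    static OptionalInt integerDivisor(double explainScore, double responseScore, double tolerance) {
        if (explainScore <= 0.0 || responseScore <= 0.0) {
            return OptionalInt.empty();
        }
        double ratio = explainScore / responseScore;
        long nearest = Math.round(ratio);
        if (nearest < 2 || nearest > Integer.MAX_VALUE) {
            return OptionalInt.empty();
        }
        if (Math.abs(ratio - nearest) <= tolerance * nearest) {
            return OptionalInt.of((int) nearest);
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Score pairs taken from the original post (explain score, response score).
        System.out.println(integerDivisor(1.2517526, 0.41725087, 1e-4)); // affected document
        System.out.println(integerDivisor(1.0952835, 1.0952835, 1e-4)); // unaffected document
    }
}
```

Running this over every document in a response would show which ones are affected and whether N is the same for all of them.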
> >>>>
> >>>> ----------------------------------------
> >>>>> From: r...@cebglobal.com
> >>>>> To: solr-user@lucene.apache.org
> >>>>> Subject: RE: Explain score is different from score
> >>>>> Date: Fri, 18 Mar 2016 13:29:14 +0000
> >>>>>
> >>>>> Can someone help?
> >>>>>
> >>>>>
> >>>>> From: G, Rajesh
> >>>>> Sent: Friday, March 18, 2016 12:56 PM
> >>>>> To: solr-user@lucene.apache.org
> >>>>> Subject: Explain score is different from score
> >>>>>
> >>>>> There is a mismatch between the score displayed in debug and the score
> >>>>> field. Please refer to the attached XML.
> >>>>>
> >>>>> When I search for title_ws:(Microsoft Ofice 365), if the results were
> >>>>> ordered by the explain score we would get the expected result:
> >>>>> “Microsoft Office 365”, then “Lync - Microsoft Office 365”.
> >>>>>
> >>>>> <result name="response" numFound="13617" start="0" maxScore="1.0952835">
> >>>>> <doc>
> >>>>> <str name="title">Lync - Microsoft Office 365</str>
> >>>>> <str name="title_ws">Lync - Microsoft Office 365</str>
> >>>>> <int name="id">1722669</int>
> >>>>> <float name="score">1.0952835</float></doc>
> >>>>> Score from explain 1.0952835
> >>>>> <doc>
> >>>>> <str name="title">Microsoft Office 365</str>
> >>>>> <str name="title_ws">Microsoft Office 365</str>
> >>>>> <int name="id">2043876</int>
> >>>>> <float name="score">0.41725087</float></doc>
> >>>>> Score from explain 1.2517526
> >>>>> </result>
> >>>>>
> >>>>> Thanks
> >>>>> Rajesh
> >>
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England

>
>



