Hi,

    Following are the debugQuery results:

*Solr 3.5*

<lst name="profile_D48699">
  <bool name="match">true</bool>
  <float name="value">60.67038</float>
  <str name="description">sum of:</str>
  <arr name="details">
    <lst>
      <bool name="match">true</bool>
      <float name="value">60.67038</float>
      <str name="description">max plus 1.0 times others of:</str>
      <arr name="details">
        <lst>
          <bool name="match">true</bool>
          <float name="value">0.44362593</float>
          <str name="description">weight(content:cancer^0.5 in 21506339),
product of:</str>
          <arr name="details">
            <lst>
              <bool name="match">true</bool>
              <float name="value">0.009291923</float>
              <str name="description">queryWeight(content:cancer^0.5),
product of:</str>
              <arr name="details">
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">0.5</float>
                  <str name="description">boost</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">3.5684927</float>
                  <str name="description">idf(docFreq=1682287,
maxDocs=21947370)</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">0.005207758</float>
                  <str name="description">queryNorm</str>
                </lst>
              </arr>
            </lst>
            <lst>
              <bool name="match">true</bool>
              <float name="value">47.74318</float>
              <str name="description">fieldWeight(content:cancer in
21506339), product of:</str>
              <arr name="details">
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">13.379088</float>
                 * <str
name="description">tf(termFreq(content:cancer)=179)</str>*
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">3.5684927</float>
                  <str name="description">idf(docFreq=1682287,
maxDocs=21947370)</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">1.0</float>
                  <str name="description">fieldNorm(field=content,
doc=21506339)</str>
                </lst>
              </arr>
            </lst>
          </arr>
        </lst>


*Solr 4.4 debug query:*
<lst name="profile_D48699">
  <bool name="match">true</bool>
  <float name="value">67.04259</float>
  <str name="description">max plus 1.0 times others of:</str>
  <arr name="details">
    <lst>
      <bool name="match">true</bool>
      <float name="value">0.75314933</float>
      <str name="description">weight(content:cancer^0.5 in 20543947)
[DefaultSimilarity], result of:</str>
      <arr name="details">
        <lst>
          <bool name="match">true</bool>
          <float name="value">0.75314933</float>
          <str name="description">score(doc=20543947,freq=515.0 =
termFreq=515.0 ), product of:</str>
          <arr name="details">
            <lst>
              <bool name="match">true</bool>
              <float name="value">0.009295603</float>
              <str name="description">queryWeight, product of:</str>
              <arr name="details">
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">0.5</float>
                  <str name="description">boost</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">3.5702603</float>
                  <str name="description">idf(docFreq=1678887,
maxDocs=21941764)</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">0.005207241</float>
                  <str name="description">queryNorm</str>
                </lst>
              </arr>
            </lst>
            <lst>
              <bool name="match">true</bool>
              <float name="value">81.0221</float>
              <str name="description">fieldWeight in 20543947, product
of:</str>
              <arr name="details">
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">22.693611</float>
                  <str name="description">tf(freq=515.0), with freq
of:</str>
                  <arr name="details">
                    <lst>
                      <bool name="match">true</bool>
                      <float name="value">515.0</float>
                      <str name="description">termFreq=515.0</str>
                    </lst>
                  </arr>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">3.5702603</float>
                  <str name="description">idf(docFreq=1678887,
maxDocs=21941764)</str>
                </lst>
                <lst>
                  <bool name="match">true</bool>
                  <float name="value">1.0</float>
                  <str name="description">fieldNorm(doc=20543947)</str>
                </lst>
              </arr>
            </lst>
          </arr>
        </lst>
      </arr>
    </lst>

A search for the term 'cancer' in the field 'content' should give a count
of 515.
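
As a cross-check of the numbers in the two debug outputs, here is a minimal Python sketch. It assumes both versions use the classic Lucene TF-IDF formulas, tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are made up for illustration and are not Solr APIs.

```python
import math


def tf(freq):
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """queryWeight * fieldWeight, mirroring the nesting of the debug output."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf(freq) * term_idf * field_norm
    return query_weight * field_weight


# Inputs copied from the Solr 3.5 and Solr 4.4 debug outputs.
solr35 = term_score(179, 1682287, 21947370, 0.5, 0.005207758, 1.0)
solr44 = term_score(515, 1678887, 21941764, 0.5, 0.005207241, 1.0)

print(f"3.5: tf={tf(179):.6f} idf={idf(1682287, 21947370):.6f} score={solr35:.6f}")
print(f"4.4: tf={tf(515):.6f} idf={idf(1678887, 21941764):.6f} score={solr44:.6f}")
```

If that assumption holds, the tf and idf formulas are the same in both versions, and the score difference for this clause comes from the raw termFreq that each index reports (179 vs. 515).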

Please let me know if you have any questions or concerns.

Thanks.
Kuchekar, Nilesh


On Fri, Sep 13, 2013 at 12:36 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> There may be some token filters that are emitting a different number of
> terms. There are so many changes between 3.5 and 4.4, that it simply isn't
> worth the trouble to track down all of them. In some cases, there may be
> bugs in 3.5 that have gotten fixed in any of the intervening releases.
>
> Do you have a specific example - the input text and the field and field
> type and analyzer where the tf differs? That should suggest where the
> differences come from.
>
> Do you have any specific reason to believe that one of the counts is more
> right than the other?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Kuchekar
> Sent: Thursday, September 12, 2013 4:50 PM
>
> To: solr-user@lucene.apache.org
> Cc: Stefan Matheis
> Subject: Re: Different Responses for 4.4 and 3.5 solr index
>
> Hi,
>
>     After triaging more for this, we find that the termFrequency (tf) for
> the same field in the same doc in solr 3.5 and 4.4 is different.
>
> example :
>
> If the word "fruits" appears in some field 20 times,
>
> in 3.5 tf is reported as 8, whereas in 4.4 it is reported as 20.
> That is changing the score.
>
> Also we see that the function 'idf' which depends upon the max doc is
> changed.
>
> Are there any changes in the 'termFrequency' and 'idf' functions in Solr 4.4
> compared to Solr 3.5?
>
> Looking forward to your reply.
>
> Thanks.
> Kuchekar, Nilesh
>
>
> On Thu, Sep 12, 2013 at 11:30 AM, Kuchekar <kuchekar.nil...@gmail.com>
> wrote:
>
>  Hi,
>>
>>     Any updates on this? Is the ranking computation dependent on the 'maxDoc'
>> value in Solr? Is this happening because the 'maxDoc' value changes
>> after each optimization? In Solr 4.4, every time an optimization
>> is run, the 'maxDoc' value is reset, whereas this is not the case in Solr
>> 3.5.
>>
>> Looking forward to your reply.
>>
>> Thanks.
>> Kuchekar, Nilesh
>>
>>
>> On Wed, Aug 28, 2013 at 3:32 PM, Michael Sokolov <
>> msoko...@safaribooksonline.com> wrote:
>>
>>  We've been seeing changes in our rankings as well.  I don't have a
>>> definite answer yet, since we're waiting on an index rebuild, but our
>>> current working theory is that the change to default omitNorms="true" for
>>> primitive types may have had an effect, possibly due to follow on
>>> confusion: our developers may have omitted norms from some other fields
>>> they shouldn't have?
>>>
>>> -Mike
>>>
>>>
>>> On 08/26/2013 09:46 AM, Stefan Matheis wrote:
>>>
>>>  Did you check the scoring? (use fl=*,score to retrieve it) ..
>>>> additionally debugQuery=true might provide more information about how
>>>> the
>>>> score was calculated.
>>>>
>>>> - Stefan
>>>>
>>>>
>>>> On Monday, August 26, 2013 at 12:46 AM, Kuchekar wrote:
>>>>
>>>>  Hi,
>>>>
>>>>> The responses from 4.4 and 3.5 in the current scenario differ in the
>>>>> order in which results are returned to us.
>>>>>
>>>>> For example:
>>>>>
>>>>> Response from Solr 3.5 is: id:A, id:B, id:C, id:D ...
>>>>> Response from Solr 4.4 is: id:C, id:A, id:D, id:B ...
>>>>>
>>>>> Looking forward to your reply.
>>>>>
>>>>> Thanks.
>>>>> Kuchekar, Nilesh
>>>>>
>>>>>
>>>>> On Sun, Aug 25, 2013 at 11:32 AM, Stefan Matheis
>>>>> <matheis.ste...@gmail.com> wrote:
>>>>>
>>>>>  Kuchekar (hope that's your first name?)
>>>>>
>>>>>>
>>>>>> you didn't tell us .. how they differ? do you get an actual error? or
>>>>>> does
>>>>>> the result contain documents you didn't expect? or the other way
>>>>>> round,
>>>>>> that some are missing you'd expect to be there?
>>>>>>
>>>>>> - Stefan
>>>>>>
>>>>>>
>>>>>> On Sunday, August 25, 2013 at 4:43 PM, Kuchekar wrote:
>>>>>>
>>>>>>  Hi,
>>>>>>
>>>>>>>
>>>>>>> We get different responses when we query Solr 4.4 and 3.5 using the same
>>>>>>> query params.
>>>>>>>
>>>>>>> My query params are as follows:
>>>>>>>
>>>>>>> facet=true
>>>>>>> &facet.mincount=1
>>>>>>> &facet.limit=25
>>>>>>>
>>>>>>> &qf=content^0.0+p_last_name^500.0+p_first_name^50.0+strong_topic^0.0+first_author_topic^0.0+last_author_topic^0.0+title_topic^0.0
>>>>>>> &wt=javabin
>>>>>>> &version=2
>>>>>>> &rows=10
>>>>>>> &f.affiliation_org.facet.limit=150
>>>>>>> &fl=p_id,p_first_name,p_last_name
>>>>>>> &start=0
>>>>>>> &q=Apple
>>>>>>> &facet.field=affiliation_org
>>>>>>> &fq=table:profile
>>>>>>> &fq=num_content:[*+TO+1500]
>>>>>>> &fq=name:"Apple"
>>>>>>>
>>>>>>> The content in both (Solr 4.4 and Solr 3.5) is the same.
>>>>>>>
>>>>>>> The solrconfig.xml files from 3.5 and 4.4 are similarly constructed.
>>>>>>>
>>>>>>> Is there something I am missing that might have changed in 4.4 and
>>>>>>> might be causing this issue? The "qf" params look the same.
>>>>>>>
>>>>>>> Looking forward to your reply.
>>>>>>>
>>>>>>> Thanks.
>>>>>>> Kuchekar, Nilesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
