Some token filters may be emitting a different number of
terms. There are so many changes between 3.5 and 4.4 that it simply isn't
worth the trouble to track down all of them. In some cases, there may be
bugs in 3.5 that were fixed in one of the intervening releases.
Do you have a specific example - the input text and the field and field type
and analyzer where the tf differs? That should suggest where the differences
come from.
Do you have any specific reason to believe that one of the counts is more
right than the other?
-- Jack Krupansky
-----Original Message-----
From: Kuchekar
Sent: Thursday, September 12, 2013 4:50 PM
To: solr-user@lucene.apache.org
Cc: Stefan Matheis
Subject: Re: Different Responses for 4.4 and 3.5 solr index
Hi,
After further triage, we find that the term frequency (tf) for
the same field in the same doc differs between Solr 3.5 and 4.4.
Example:
If the word "fruits" appears 20 times in some field,
3.5 reports a tf of 8, whereas 4.4 reports 20.
That changes the score.
We also see that the value of the 'idf' function, which depends on maxDoc,
has changed.
Are there any changes to the 'termFrequency' and 'idf' functions in Solr 4.4
compared to Solr 3.5?
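To put the tf numbers above in perspective, here is a minimal sketch. It assumes both versions score with Lucene's classic TF-IDF similarity, where the tf factor is the square root of the raw term frequency; that assumption is not confirmed anywhere in this thread, and `classic_tf` is a hypothetical helper name, not a Solr API.

```python
import math


def classic_tf(freq):
    """tf factor under the assumed classic TF-IDF similarity: sqrt(freq)."""
    return math.sqrt(freq)


# Raw frequencies reported for the same field in the same doc.
tf_35 = classic_tf(8)    # frequency reported by Solr 3.5
tf_44 = classic_tf(20)   # frequency reported by Solr 4.4

# How much larger the tf factor is in 4.4, all else being equal.
ratio = tf_44 / tf_35
print(f"tf factor 3.5: {tf_35:.3f}, 4.4: {tf_44:.3f}, ratio: {ratio:.3f}")
```

Under that assumption the tf factor alone grows by a factor of about 1.58, which could be enough to reorder documents with otherwise similar scores.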
Looking forward to your reply.
Thanks.
Kuchekar, Nilesh
On Thu, Sep 12, 2013 at 11:30 AM, Kuchekar <kuchekar.nil...@gmail.com> wrote:
Hi,
Any updates on this? Is the ranking computation dependent on the 'maxDoc'
value in Solr? Could this be happening because the 'maxDoc' value changes
after each optimization? That is, in Solr 4.4 the 'maxDoc' value is reset
every time an optimization is run, whereas this is not the case in Solr
3.5.
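As a rough illustration of why maxDoc could matter here: assuming the classic TF-IDF similarity, idf is computed as 1 + ln(maxDoc / (docFreq + 1)), so any change in maxDoc shifts the idf of every term. The document counts below are made up for illustration, and `classic_idf` is a hypothetical helper, not a Solr function.

```python
import math


def classic_idf(doc_freq, max_doc):
    """idf under the assumed classic TF-IDF similarity."""
    return 1.0 + math.log(max_doc / (doc_freq + 1))


# Hypothetical numbers: the same term, before and after maxDoc shrinks
# (for example, if deleted documents are purged from the index).
doc_freq = 99
idf_before = classic_idf(doc_freq, max_doc=10000)
idf_after = classic_idf(doc_freq, max_doc=8000)

print(f"idf before: {idf_before:.3f}, after: {idf_after:.3f}")
```

If the two installations report different maxDoc values for the same content, the idf values (and therefore the scores) will differ even when the term frequencies agree.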
Looking forward to the reply.
Thanks.
Kuchekar, Nilesh
On Wed, Aug 28, 2013 at 3:32 PM, Michael Sokolov <msoko...@safaribooksonline.com> wrote:
We've been seeing changes in our rankings as well. I don't have a
definite answer yet, since we're waiting on an index rebuild, but our
current working theory is that the change to default omitNorms="true" for
primitive types may have had an effect, possibly due to follow-on
confusion: our developers may have omitted norms from some other fields
they shouldn't have?
-Mike
On 08/26/2013 09:46 AM, Stefan Matheis wrote:
Did you check the scoring? (Use fl=*,score to retrieve it.)
Additionally, debugQuery=true might provide more information about how
the score was calculated.
- Stefan
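A minimal sketch of building such a request so the same debug query can be sent to both installations and the score explanations compared. The host, port, and `/solr/select` path are assumptions (the handler path may differ per core or version), and `debug_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def debug_url(base_url, query, extra=None):
    """Build a select URL that returns scores and the scoring explanation."""
    params = [
        ("q", query),
        ("fl", "*,score"),        # return all stored fields plus the score
        ("debugQuery", "true"),   # include the per-document score explanation
    ]
    params.extend(extra or [])    # e.g. the fq filters from the original query
    return base_url + "?" + urlencode(params)


# Hypothetical base URL; point it at each Solr instance in turn.
url = debug_url(
    "http://localhost:8983/solr/select",
    "Apple",
    extra=[("fq", "table:profile")],
)
print(url)
```

Comparing the explain output for the same document ID on both servers should show which factor (tf, idf, norms, or boosts) accounts for the difference.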
On Monday, August 26, 2013 at 12:46 AM, Kuchekar wrote:
Hi,
The responses from 4.4 and 3.5 in the current scenario differ in the
order in which results are returned.
For example:
Response from Solr 3.5: id:A, id:B, id:C, id:D ...
Response from Solr 4.4: id:C, id:A, id:D, id:B ...
Looking forward to your reply.
Thanks.
Kuchekar, Nilesh
On Sun, Aug 25, 2013 at 11:32 AM, Stefan Matheis <matheis.ste...@gmail.com> wrote:
Kuchekar (hope that's your first name?)
You didn't tell us how they differ. Do you get an actual error? Or does
the result contain documents you didn't expect? Or the other way round,
are some missing that you'd expect to be there?
- Stefan
On Sunday, August 25, 2013 at 4:43 PM, Kuchekar wrote:
Hi,
We get different responses when we query Solr 4.4 and 3.5 using the same
query params.
My query params are as follows:
facet=true
&facet.mincount=1
&facet.limit=25
&qf=content^0.0+p_last_name^500.0+p_first_name^50.0+strong_topic^0.0+first_author_topic^0.0+last_author_topic^0.0+title_topic^0.0
&wt=javabin
&version=2
&rows=10
&f.affiliation_org.facet.limit=150
&fl=p_id,p_first_name,p_last_name
&start=0
&q=Apple
&facet.field=affiliation_org
&fq=table:profile
&fq=num_content:[*+TO+1500]
&fq=name:"Apple"
The content in both (Solr 4.4 and Solr 3.5) is the same.
The solrconfig.xml files from 3.5 and 4.4 are similarly constructed.
Is there something I am missing that might have changed in 4.4 and
might be causing this issue? The "qf" params look the same.
Looking forward to your reply.
Thanks.
Kuchekar, Nilesh