Hi Varun,

Yes, after going over the code I think you are right. If you change the
following if block in SolrIndexSearcher.getDocSet(Query query, DocSet
filter, DocSetAwareCollector collector):

    if (first==null) {
      first = getDocSetNC(absQ, null);
      filterCache.put(absQ,first);
    }

with:

    if (first==null) {
      first = getDocSetNC(absQ, null, collector);
      filterCache.put(absQ,first);
    }

It should work then. Let me know if this solves your problem.
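
To make the effect of that one-argument change easier to see, here is a minimal stand-alone sketch. It does not use any Solr classes: ScoreCollector, computeDocSet and the map-based cache are made-up stand-ins for DocSetAwareCollector, getDocSetNC and filterCache, and the scores are invented. It only models the cache-miss branch, so treat it as an illustration of the idea rather than the actual patch code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Solr classes.
class FilterCacheCollectorSketch {

    /** Hypothetical stand-in for DocSetAwareCollector: records a score per doc id. */
    static class ScoreCollector {
        final Map<Integer, Float> scores = new HashMap<>();

        void collect(int docId, float score) {
            scores.put(docId, score);
        }
    }

    /** Hypothetical stand-in for filterCache: query string -> matching doc ids. */
    static final Map<String, List<Integer>> cache = new HashMap<>();

    /** Hypothetical stand-in for getDocSetNC: feeds the collector only if one is given. */
    static List<Integer> computeDocSet(String query, ScoreCollector collector) {
        List<Integer> docs = new ArrayList<>();
        for (int docId = 0; docId < 3; docId++) {
            docs.add(docId);
            if (collector != null) {
                collector.collect(docId, 1.0f / (docId + 1)); // invented score
            }
        }
        return docs;
    }

    /**
     * Cache-miss branch. passCollector=false mimics the original if block,
     * passCollector=true mimics the proposed change.
     */
    static List<Integer> getDocSet(String query, ScoreCollector collector, boolean passCollector) {
        List<Integer> first = cache.get(query);
        if (first == null) {
            first = computeDocSet(query, passCollector ? collector : null);
            cache.put(query, first);
        }
        return first;
    }

    public static void main(String[] args) {
        ScoreCollector before = new ScoreCollector();
        getDocSet("weight loss", before, false);

        cache.clear(); // force a second cache miss
        ScoreCollector after = new ScoreCollector();
        getDocSet("weight loss", after, true);

        System.out.println("scores collected, collector dropped: " + before.scores.size()); // 0
        System.out.println("scores collected, collector passed:  " + after.scores.size());  // 3
    }
}
```

In this toy model a cache hit never invokes the collector at all, so the sketch says nothing about what should happen when the DocSet is already cached.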
Martijn

2009/12/18 Varun Gupta <varun.vgu...@gmail.com>:
> After a lot of debugging, I finally found why the order of the collapsed
> results does not match the uncollapsed results. I can't say whether it is
> a bug in the implementation of fieldcollapse or not.
>
> *Explanation:*
> I am querying with field collapsing plus some filters, to restrict the
> collapsing to some particular categories only, by appending the parameter:
> fq=ctype:(1+2+8+6+3).
>
> In: NonAdjacentDocumentCollapser.doQuery()
> Line: DocSet filter = searcher.getDocSet(filterQueries);
>
> Here, the filter DocSet is fetched without any scores (since I have a
> filter in my query, this line actually gets executed) and is also stored
> in the filter cache. In the next line of the code, the actual uncollapsed
> DocSet is fetched, passing the DocSetScoreCollector.
>
> Now, in: SolrIndexSearcher.getDocSet(Query query, DocSet filter,
> DocSetAwareCollector collector)
> Line: if (filterCache != null)
> Because the filter cache is not null, and there is no result for the
> query in the cache, the line: first = getDocSetNC(absQ,null); gets
> executed. Notice that the DocSetScoreCollector is not passed here. Hence,
> results are collected without any scores.
>
> This leaves the uncollapsedDocSet without any scores, and hence the
> sorting is not done based on score.
>
> @Martijn: Am I right, or should I be using field collapsing in some other
> way? Otherwise, what is the ideal fix for this problem? (I am not an
> active developer, so I can't say whether a fix I make will break
> anything.)
>
> --
> Thanks,
> Varun Gupta
>
>
> On Mon, Dec 14, 2009 at 10:35 AM, Varun Gupta <varun.vgu...@gmail.com> wrote:
>
>> When I used collapse.threshold=1, out of the 5 categories 4 had the same
>> top result, but 1 category had a different result (it was the 3rd result
>> coming for that category when I used threshold as 3).
>>
>> --
>> Thanks,
>> Varun Gupta
>>
>>
>> On Mon, Dec 14, 2009 at 2:56 AM, Martijn v Groningen <
>> martijn.is.h...@gmail.com> wrote:
>>
>>> I would not expect the Solr 1.4 build to be the cause of the problem.
>>> Just out of curiosity, does the same happen when collapse.threshold=1?
>>>
>>> 2009/12/11 Varun Gupta <varun.vgu...@gmail.com>:
>>> > Here is the field type configuration of ctype:
>>> > <field name="ctype" type="integer" indexed="true" stored="true"
>>> > omitNorms="true" />
>>> >
>>> > In solrconfig.xml, this is how I am enabling field collapsing:
>>> > <searchComponent name="query"
>>> > class="org.apache.solr.handler.component.CollapseComponent"/>
>>> >
>>> > Apart from this, I made no changes in solrconfig.xml for field
>>> > collapse. I am currently not using the field collapse cache.
>>> >
>>> > I have applied the patch on the Solr 1.4 build. I am not using the
>>> > latest solr nightly build. Can that cause any problem?
>>> >
>>> > --
>>> > Thanks
>>> > Varun Gupta
>>> >
>>> >
>>> > On Fri, Dec 11, 2009 at 3:44 AM, Martijn v Groningen <
>>> > martijn.is.h...@gmail.com> wrote:
>>> >
>>> >> I tried to reproduce a similar situation here, but I got the expected
>>> >> and correct results. Those three documents that you saw in your first
>>> >> search result should be the first in your second search result (unless
>>> >> the index changes or the sort changes) when you fq on that specific
>>> >> category. I'm not sure what is causing this problem. Can you give me
>>> >> some more information, like the field type configuration for the ctype
>>> >> field and how you have configured field collapsing?
>>> >>
>>> >> I did find another problem to do with field collapse caching. The
>>> >> collapse.threshold or collapse.maxdocs parameters are not taken into
>>> >> account when caching, which is of course wrong because they do matter
>>> >> when collapsing.
>>> >> Based on the information you have given me, this caching problem is
>>> >> not the cause of the situation you have. I will shortly update the
>>> >> patch to fix this problem.
>>> >>
>>> >> Martijn
>>> >>
>>> >> 2009/12/10 Varun Gupta <varun.vgu...@gmail.com>:
>>> >> > Hi Martijn,
>>> >> >
>>> >> > I am not sending the collapse parameters for the second query. Here
>>> >> > are the queries I am using:
>>> >> >
>>> >> > *When using field collapsing (searching over all categories):*
>>> >> >
>>> >> > spellcheck=true&collapse.info.doc=true&facet=true&collapse.threshold=3&facet.mincount=1&spellcheck.q=weight+loss&collapse.facet=before&wt=xml&f.content.hl.snippets=2&hl=true&version=2.2&rows=20&collapse.field=ctype&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&collapse.info.count=false&facet.field=ctype&qt=contentsearch
>>> >> >
>>> >> > Categories are represented by the field "ctype" above.
>>> >> >
>>> >> > *Without using field collapsing:*
>>> >> >
>>> >> > spellcheck=true&facet=true&facet.mincount=1&spellcheck.q=weight+loss&wt=xml&hl=true&rows=10&version=2.2&fl=id,sid,title,image,ctype,score&start=0&q=weight+loss&facet.field=ctype&qt=contentsearch
>>> >> >
>>> >> > I append "&fq=ctype:1" to the above queries when trying to get
>>> >> > results for a particular category.
>>> >> >
>>> >> > --
>>> >> > Thanks
>>> >> > Varun Gupta
>>> >> >
>>> >> >
>>> >> > On Thu, Dec 10, 2009 at 5:58 PM, Martijn v Groningen <
>>> >> > martijn.is.h...@gmail.com> wrote:
>>> >> >
>>> >> >> Hi Varun,
>>> >> >>
>>> >> >> Can you send the whole requests (with params) that you send to Solr
>>> >> >> for both queries?
>>> >> >> In your situation the collapse parameters only have to be used for
>>> >> >> the first query and not the second query.
>>> >> >>
>>> >> >> Martijn
>>> >> >>
>>> >> >> 2009/12/10 Varun Gupta <varun.vgu...@gmail.com>:
>>> >> >> > Hi,
>>> >> >> >
>>> >> >> > I have documents under 6 different categories.
>>> >> >> > While searching, I want to show 3 documents from each category,
>>> >> >> > along with a link to see all the documents under a single
>>> >> >> > category. I decided to use field collapsing so that I don't have
>>> >> >> > to make 6 queries (one for each category). Currently I am using
>>> >> >> > the field collapsing patch uploaded on 29th Nov.
>>> >> >> >
>>> >> >> > Now, the results that come back when using field collapsing do
>>> >> >> > not match the results for a single category. For example, for
>>> >> >> > category C1, I get results R1, R2 and R3 using field collapsing,
>>> >> >> > but when I view results only from category C1 (without using
>>> >> >> > field collapsing), these results are nowhere in the first 10
>>> >> >> > results.
>>> >> >> >
>>> >> >> > Am I doing something wrong, or using field collapsing for the
>>> >> >> > wrong feature?
>>> >> >> >
>>> >> >> > I am using the following field collapsing parameters while
>>> >> >> > querying:
>>> >> >> > collapse.field=category
>>> >> >> > collapse.facet=before
>>> >> >> > collapse.threshold=3
>>> >> >> >
>>> >> >> > --
>>> >> >> > Thanks
>>> >> >> > Varun Gupta
>>> >> >> >
>>> >> >>
>>> >> >> --
>>> >> >> Kind regards,
>>> >> >>
>>> >> >> Martijn van Groningen
>>> >> >>
>>> >> >
>>> >>
>>> >> --
>>> >> Kind regards,
>>> >>
>>> >> Martijn van Groningen
>>> >>
>>> >
>>>
>>> --
>>> Kind regards,
>>>
>>> Martijn van Groningen
>>>
>>
>

--
Kind regards,

Martijn van Groningen