Is there a Solr wiki page that discusses these issues, such as "Groups can't
cross shard boundaries"? It seems like this should be highlighted prominently,
maybe here:
http://wiki.apache.org/solr/FieldCollapsing
Seems like it should be mentioned on the distributed/SolrCloud wiki(s) as
well.
Is this a "distributed IDF" type of issue or something else? Is this an
outright bug or an (insurmountable?) limitation?
I did notice SOLR-2066, but didn't see mention of the limitation. Are there
any other limitations for distributed grouping?
-- Jack Krupansky
-----Original Message-----
From: Martijn v Groningen
Sent: Monday, June 11, 2012 8:53 AM
To: solr-user@lucene.apache.org
Subject: Re: Issue with field collapsing in solr 4 while performing distributed search
ngroups returns the number of groups that matched the query. However, if
you want ngroups to be correct in a distributed environment, you need to
put all documents belonging to the same group into the same shard.
Groups can't cross shard boundaries, so I guess you need to do
some manual document partitioning.
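A minimal Python sketch of the effect, under stated assumptions: it assumes the distributed ngroups is obtained by adding up each shard's own distinct-group count (an interpretation of the behaviour described above, not a statement about Solr's internals). The `shard_for` router, the CRC32 hash, the documents, and the group values are all hypothetical and only illustrate one way to partition manually.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 2  # assumption: two shards, as in the setup described in this thread


def shard_for(group_value, num_shards=NUM_SHARDS):
    """Hypothetical router: send every document of a group to the same shard."""
    return zlib.crc32(group_value.encode("utf-8")) % num_shards


def summed_ngroups(placement):
    """Sum the distinct-group count of each shard without merging group values.

    `placement` is a list of (shard number, group value) pairs.
    """
    per_shard = defaultdict(set)
    for shard, group_value in placement:
        per_shard[shard].add(group_value)
    return sum(len(groups) for groups in per_shard.values())


# Made-up documents: (document number, group value)
docs = [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (5, "c")]

# Round-robin placement ignores the group value, so groups straddle shards.
round_robin = [(doc_no % NUM_SHARDS, group) for doc_no, group in docs]

# Group-aware placement keeps each group on exactly one shard.
by_group = [(shard_for(group), group) for _, group in docs]

true_ngroups = len({group for _, group in docs})

print("true number of groups:", true_ngroups)
print("summed, round-robin:  ", summed_ngroups(round_robin))
print("summed, group-aware:  ", summed_ngroups(by_group))
```

With round-robin placement the groups "a" and "b" each appear on both shards and are counted twice, so the summed value exceeds the true count. With group-aware placement the two numbers agree, whatever the hash happens to return.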
Martijn
On 11 June 2012 14:29, Nitesh Nandy <niteshna...@gmail.com> wrote:
Version: Solr 4.0 (svn build, 30th May 2012) with SolrCloud (2 slices and
2 shards)
The setup was done as per the wiki: http://wiki.apache.org/solr/SolrCloud
We are doing distributed search. While querying, we use field collapsing
with "ngroups" set to true, as we need the number of search results.
However, the number of groups in the returned result list differs from
the "ngroups" value returned.
Ex:
http://localhost:8983/solr/select?q=message:blah%20AND%20userid:3&group=true&group.field=id&group.ngroups=true
The response XML looks like:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">46</int>
<lst name="params">
<str name="group.field">id</str>
<str name="group.ngroups">true</str>
<str name="group">true</str>
<str name="q">messagebody:monit AND usergroupid:3</str>
</lst>
</lst>
<lst name="grouped">
<lst name="id">
<int name="matches">10</int>
<int name="ngroups">9</int>
<arr name="groups">
<lst>
<str name="groupValue">320043</str>
<result name="doclist" numFound="1" start="0">
<doc>...</doc>
</result>
</lst>
<lst>
<str name="groupValue">398807</str>
<result name="doclist" numFound="5" start="0" maxScore="2.4154348">...
</result>
</lst>
<lst>
<str name="groupValue">346878</str>
<result name="doclist" numFound="2" start="0">...</result>
</lst>
<lst>
<str name="groupValue">346880</str>
<result name="doclist" numFound="2" start="0">...</result>
</lst>
</arr>
</lst>
</lst>
</response>
So you can see that the ngroups value returned is 9, while the actual number
of groups returned is 4.
Why is there this discrepancy between ngroups, matches, and the actual number
of groups? Is this an open issue?
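The comparison can also be made programmatically. The following is a minimal Python sketch that parses a trimmed copy of the response above with the standard library and reports the three numbers side by side. The `group_stats` helper is hypothetical, and the `<doc>` contents are elided in the original, so empty `<result>` elements stand in for them.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response above (responseHeader and documents omitted).
RESPONSE = """<response>
<lst name="grouped">
<lst name="id">
<int name="matches">10</int>
<int name="ngroups">9</int>
<arr name="groups">
<lst><str name="groupValue">320043</str>
<result name="doclist" numFound="1" start="0"/></lst>
<lst><str name="groupValue">398807</str>
<result name="doclist" numFound="5" start="0" maxScore="2.4154348"/></lst>
<lst><str name="groupValue">346878</str>
<result name="doclist" numFound="2" start="0"/></lst>
<lst><str name="groupValue">346880</str>
<result name="doclist" numFound="2" start="0"/></lst>
</arr>
</lst>
</lst>
</response>"""


def group_stats(xml_text, field):
    """Return the reported and the actually observed group numbers."""
    root = ET.fromstring(xml_text)
    grouped = root.find(f"./lst[@name='grouped']/lst[@name='{field}']")
    groups = grouped.findall("arr[@name='groups']/lst")
    return {
        "matches": int(grouped.find("int[@name='matches']").text),
        "ngroups": int(grouped.find("int[@name='ngroups']").text),
        # Number of <lst> group entries actually present in the response.
        "groups_returned": len(groups),
        # Sum of numFound over the returned groups.
        "docs_in_groups": sum(
            int(g.find("result").get("numFound")) for g in groups
        ),
    }


stats = group_stats(RESPONSE, "id")
print(stats)
```

Note that the number of groups returned is also capped by the rows parameter, so this comparison is only meaningful when rows is at least as large as ngroups.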
Any kind of help is appreciated.
--
Regards,
Nitesh Nandy
--
Kind regards,
Martijn van Groningen