Continuing the Jira discussion on the mailing list.

An Example


*id      group   f1      f2*
1        g1      5       10
2        g1      5       1000
3        g1      5       1000
4        g1      10      100
5        g2      5       10
6        g2      5       1000
7        g2      5       1000
8        g2      10      100

sort=f1 asc, f2 desc, id desc


*Without collapsing, the result is:*
(7,g2), (6,g2),  (3,g1), (2,g1), (5,g2), (1,g1), (8,g2), (4,g1)


*On collapsing by group_s, the expected output is:*  (7,g2), (3,g1)

Solr's standard result grouping does give this output with
group=on, group.field=group_s, group.main=true

*Collapsing with CollapsingQParserPlugin*, fq={!collapse field=group_s},
gives:  (5,g2), (1,g1)
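
The two outputs above can be reproduced with a small Python sketch. This is
not Solr code. The first collapse keeps, per group, the document that ranks
first under the full three-field sort. The second models a single-comparator
collapse on score, under the assumption that every document has the same
score and that a tie keeps the first document seen in id order. Both
assumptions are mine, as are the function names.

```python
# Documents from the example table: (id, group, f1, f2).
docs = [
    (1, "g1", 5, 10), (2, "g1", 5, 1000), (3, "g1", 5, 1000), (4, "g1", 10, 100),
    (5, "g2", 5, 10), (6, "g2", 5, 1000), (7, "g2", 5, 1000), (8, "g2", 10, 100),
]


def sort_key(doc):
    """sort=f1 asc, f2 desc, id desc expressed as an ascending tuple key."""
    doc_id, _group, f1, f2 = doc
    return (f1, -f2, -doc_id)


def collapse_by_full_sort(all_docs):
    """Keep, per group, the document that ranks first under the full sort."""
    heads = {}
    for doc in sorted(all_docs, key=sort_key):
        heads.setdefault(doc[1], doc)  # first doc seen per group wins
    return sorted(heads.values(), key=sort_key)


def collapse_by_score(all_docs, score=lambda doc: 1.0):
    """Hypothetical model: pick heads by max score only, then sort the heads.

    Assumes constant scores and that ties keep the earlier document.
    """
    heads = {}
    for doc in all_docs:
        current = heads.get(doc[1])
        if current is None or score(doc) > score(current):
            heads[doc[1]] = doc
    return sorted(heads.values(), key=sort_key)


def ids(result):
    return [(doc[0], doc[1]) for doc in result]


uncollapsed = ids(sorted(docs, key=sort_key))
expected = ids(collapse_by_full_sort(docs))
observed = ids(collapse_by_score(docs))
print(uncollapsed)
print(expected)
print(observed)
```

Under those assumptions the heads are fixed before the sort fields are ever
consulted, so the later sort can only reorder the two surviving documents.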



*Summarizing the Jira discussion:*
1. CollapsingQParserPlugin picks the group heads from the matching results
and passes only those on. In essence it filters out the other matching
documents, so that subsequent collectors never see them. It can also pass
the score on to subsequent collectors using a dummy scorer.

2. TopDocsCollector comes later in the collector chain and sorts the
collapsed set. That works fine.

The issue is with step 1. Collapsing is done by a single comparator, which
can take its value from a field or a function. It defaults to score.
Function queries do allow us to combine multiple fields / value sources,
but it would be difficult to construct a function that reproduces the given
sort fields, primarily because:
    a) The range of values for a given sort field is not known in advance.
It is possible for one sort field to be unbounded while another is bounded
within a small range.
    b) The sort field can itself hold custom logic.

Because of (a), the group head selected by CollapsingQParserPlugin will be
incorrect, and the subsequent sorting will break.
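
To illustrate point (a), here is a small Python sketch (again not Solr
code) that folds "f1 asc, f2 desc" into one number as f1 * weight - f2.
The weight and the helper names are hypothetical. The (f1, f2) pairs are
the distinct values from the example table. The scalar only agrees with the
real sort when the weight exceeds the spread of f2, which is exactly the
bound that is not known in advance.

```python
# Distinct (f1, f2) value pairs from the example table.
pairs = [(5, 10), (5, 1000), (10, 100)]


def true_key(pair):
    """f1 asc, f2 desc as a tuple: what a multi-field sort compares."""
    f1, f2 = pair
    return (f1, -f2)


def scalar_key(weight):
    """Hypothetical single-value function: smaller means ranks earlier."""
    def key(pair):
        f1, f2 = pair
        return f1 * weight - f2
    return key


true_order = sorted(pairs, key=true_key)
# Weight guessed too small for an f2 that reaches 1000.
guessed_order = sorted(pairs, key=scalar_key(10))
# Works here only because f1 is integral and f2 spans less than 1000.
safe_order = sorted(pairs, key=scalar_key(1000))

# Group head for a group holding just these two value pairs.
group = [(5, 10), (10, 100)]
true_head = min(group, key=true_key)
guessed_head = min(group, key=scalar_key(10))

print(true_order, guessed_order, safe_order)
print(true_head, guessed_head)
```

With the too-small weight, the pair (10, 100) outranks (5, 10) even though
f1 should dominate, so a collapse based on that function would keep the
wrong head for such a group.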



On 14 June 2014 12:38, Umesh Prasad <umesh.i...@gmail.com> wrote:

> Thanks Joel for the quick response. I have opened a new jira ticket.
>
> https://issues.apache.org/jira/browse/SOLR-6168
>
>
>
>
> On 13 June 2014 17:45, Joel Bernstein <joels...@gmail.com> wrote:
>
>> Let's open a new ticket.
>>
>> Joel Bernstein
>> Search Engineer at Heliosearch
>>
>>
>> On Fri, Jun 13, 2014 at 8:08 AM, Umesh Prasad <umesh.i...@gmail.com>
>> wrote:
>>
>> > The patch in SOLR-5408 fixes the issue with sorting only for two sort
>> > fields. Sorting still breaks when 3 or more sort fields are used.
>> >
>> > I have attached a test case, which demonstrates the broken behavior
>> when 3
>> > sort fields are used.
>> >
>> > The failing test case patch is against Lucene/Solr 4.7 revision  number
>> > 1602388
>> >
>> > Can someone apply and verify the bug ?
>> >
>> > Also, should I re-open SOLR-5408  or open a new ticket ?
>> >
>> >
>> > ---
>> > Thanks & Regards
>> > Umesh Prasad
>> >
>>
>
>
>
> --
> ---
> Thanks & Regards
> Umesh Prasad
>



-- 
---
Thanks & Regards
Umesh Prasad
