Work is under way on the patch that will let you request specific field values of the collapsed documents through a dedicated request parameter. This work has not yet been committed to the latest patch, but it will be very soon. There is of course a drawback: the set of collapsed documents can be very large (depending on your data), in which case the returned result, which includes the field values, can also be rather large and will impact performance. This is why the feature will be enabled only if you specify the extra parameter; by default no field values will be returned.

AFAIK, the latest patch should work fine with the latest build. Martijn (who is the main maintainer of this patch) tries to keep it up to date with the latest builds. But I guess the safest way is to work with the nightly build of the same date as the latest patch (though I would first give it a try with the latest build).

BTW, it's not an official suggestion from the Solr development team, but if you ask me, if you have to choose now whether to use 1.3 or 1.4-dev, I would go for the latter. 1.4 is supposed to be released in the upcoming week or two and it brings loads of bug fixes, enhancements and extra functionality. But again, this is my personal suggestion.

cheers,
Uri

R. Tan wrote:
Okay. Thanks for giving an insight on how it works in general. Without
trying it myself, are the field values for the collapsed ones also part of
the results data?
What is the latest build that is safe to use on a production environment?
I'd probably go for that and use field collapsing.

Thank you very much.


On Fri, Sep 4, 2009 at 4:49 AM, Uri Boness <ubon...@gmail.com> wrote:

The collapsed documents are represented by one "master" document which can
be part of the normal search result (the doc list), so pagination just works
as expected, meaning it takes only the returned documents into account
(ignoring the collapsed ones). As for the scoring, the "master" document is
actually the document with the highest score in the collapsed group.
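
To make the "master" document idea concrete, here is a minimal Python sketch. It is only an illustration of the behaviour described above, not the patch's code: the function names, the `business` field and the sample scores are all made up for the example.

```python
from typing import Dict, List


def collapse_with_master(docs: List[dict], field: str) -> List[dict]:
    """Group docs by `field`; the highest-scoring doc of each group is its master.

    Hypothetical helper for illustration only (not part of Solr or the patch).
    """
    groups: Dict[object, List[dict]] = {}
    # Visit docs from highest to lowest score, so the first doc seen for
    # each field value is the one with the highest score in its group.
    for doc in sorted(docs, key=lambda d: d["score"], reverse=True):
        groups.setdefault(doc[field], []).append(doc)
    return [
        {"master": group[0], "collapsed_count": len(group) - 1}
        for group in groups.values()
    ]


def page(results: List[dict], start: int, rows: int) -> List[dict]:
    """Paginate over master documents only; collapsed docs are not counted."""
    return results[start:start + rows]


# Made-up sample data.
docs = [
    {"id": 1, "business": "7-eleven", "score": 0.9},
    {"id": 2, "business": "7-eleven", "score": 0.7},
    {"id": 3, "business": "acme", "score": 0.8},
    {"id": 4, "business": "7-eleven", "score": 0.5},
    {"id": 5, "business": "bobs", "score": 0.3},
]

results = collapse_with_master(docs, "business")
first_page = page(results, start=0, rows=2)
second_page = page(results, start=2, rows=2)
```

Here five documents collapse into three masters, so a page size of two yields two pages, regardless of how many documents were collapsed under each master.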

As for Solr 1.3 compatibility... well... it's very hard to tell. All the
latest patches are certainly *not* 1.3 compatible (I think they also depend
on some changes in Lucene which are not available for Solr 1.3). I guess
you'll have to try some of the old patches, but I'm not sure about their
stability.

cheers,
Uri


R. Tan wrote:

Thanks Uri. How do paging and scoring work when using field collapsing?
What patch works with 1.3? Is it production ready?

R


On Thu, Sep 3, 2009 at 3:54 PM, Uri Boness <ubon...@gmail.com> wrote:



The development on this patch is quite active. It works well for a single
Solr instance, but distributed search (i.e. shards) is not yet supported.
Using this patch you can group search results based on a specific field.
There are two flavors of field collapsing, adjacent and non-adjacent. The
former collapses only documents that happen to be located next to each
other in the otherwise-non-collapsed result set. The latter (non-adjacent)
collapses all documents with the same field value, regardless of their
position in the otherwise-non-collapsed result set. Note that non-adjacent
performs better than adjacent. There is currently a discussion about
extending this support so that, in addition to collapsing the documents,
extra information will be returned for the collapsed documents (see the
discussion on the issue page).
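
The difference between the two flavors can be sketched in a few lines of Python. This is a toy illustration of the semantics only, assuming a ranked list of documents held in memory; it is not how the patch is implemented, and the function names and the `business` field are invented for the example.

```python
from typing import Dict, List


def collapse_adjacent(docs: List[dict], field: str) -> List[List[dict]]:
    """Collapse only runs of neighbouring docs that share the same field value."""
    groups: List[List[dict]] = []
    for doc in docs:
        if groups and groups[-1][0][field] == doc[field]:
            groups[-1].append(doc)   # same value as the previous doc: extend the run
        else:
            groups.append([doc])     # value changed: start a new group
    return groups


def collapse_non_adjacent(docs: List[dict], field: str) -> List[List[dict]]:
    """Collapse all docs with the same field value, wherever they appear."""
    groups: Dict[object, List[dict]] = {}
    for doc in docs:
        groups.setdefault(doc[field], []).append(doc)
    # dicts keep insertion order, so groups stay ordered by first occurrence
    return list(groups.values())


# Made-up ranked result list: the last 7-eleven hit is separated from the others.
ranked = [
    {"id": 1, "business": "7-eleven"},
    {"id": 2, "business": "7-eleven"},
    {"id": 3, "business": "acme"},
    {"id": 4, "business": "7-eleven"},
]

adjacent = collapse_adjacent(ranked, "business")
non_adjacent = collapse_non_adjacent(ranked, "business")
```

With this input, adjacent collapsing leaves 7-eleven in the results twice (documents 1-2 and document 4 form separate groups), while non-adjacent collapsing shows it once with all three documents in one group.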

Uri


R. Tan wrote:



I think this is what I'm looking for. What is the status of this patch?

On Thu, Sep 3, 2009 at 12:00 PM, R. Tan <tanrihae...@gmail.com> wrote:





Hi Solrers,
I would like to get your opinion on how to best approach a search
requirement that I have. The scenario is that I have a set of business
listings that may be grouped into one parent business (such as 7-eleven
having several locations). On the results page, I only want 7-eleven to
show up once, but also show how many locations matched the query (facet
filtered by state, for example) and maybe a preview of some of the
locations.

Searching for the business name is straightforward, but the locations
within a result are quite tricky. I can do the opposite, searching for the
locations and faceting on business names, but it will still basically be
the same thing and repeat results with the same business name.

Any advice?

Thanks,
R






