On Feb 26, 2009, at 11:16 PM, CIF Search wrote:
I believe the query component will generate the query in such a way
that I get the results that I want, but will not process the returned
results. Is that correct? Is there a way in which I can group the
returned results, rank each group separately, and return the results
together? In other words, which component do I need to write to
reorder the returned results as per my requirements?
I'd have a look at what I did for the Clustering patch, i.e.
SOLR-769. It may even be the case that you can simply plug in your own
SolrClusterer or whatever it's called. Or, if it doesn't quite fit
your needs, give me feedback or a patch and we can update it. I'm
definitely open to ideas on it.
Also, the deduplication patch seems interesting, but it doesn't
appear to be expected to work across multiple shards.
Yeah, that does seem a bit tricky. Since Solr doesn't support
distributed indexing, it would be tricky to support just yet.
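In the meantime, the duplicates can be dropped on the client side once
the shard responses are back. What follows is only a minimal sketch in
plain Python, not Solr code: it assumes each shard returns a list of
documents as dicts, and the field names "id" and "score" are
hypothetical stand-ins for whatever key identifies a duplicate and
whatever relevance value the shards return.

```python
def merge_and_dedupe(shard_results, key="id"):
    """Merge per-shard result lists, keeping the best-scoring copy per key.

    shard_results: list of lists of dicts (one inner list per shard).
    key: hypothetical field that identifies a duplicate document.
    """
    best = {}
    for docs in shard_results:
        for doc in docs:
            k = doc[key]
            # Keep the first copy seen, or replace it with a higher-scoring one.
            if k not in best or doc["score"] > best[k]["score"]:
                best[k] = doc
    # Return the surviving documents in descending score order.
    return sorted(best.values(), key=lambda d: d["score"], reverse=True)


shard_a = [{"id": "1", "score": 2.0}, {"id": "2", "score": 1.0}]
shard_b = [{"id": "2", "score": 1.5}, {"id": "3", "score": 0.5}]
merged = merge_and_dedupe([shard_a, shard_b])
```

Because duplicates are removed after the merge, the merged list can be
shorter than the number of rows requested.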
Regards,
CI
On Thu, Feb 26, 2009 at 8:03 PM, Grant Ingersoll
<gsing...@apache.org> wrote:
On Feb 26, 2009, at 6:04 AM, CIF Search wrote:
We have a distributed index consisting of several shards. Some
documents could be repeated across shards. We want to remove the
duplicate records from the documents returned from the shards, and
re-order the results by grouping them on the basis of a clustering
algorithm and reranking the documents within a cluster on the basis
of the log of a particular returned field value.
I think you would have to implement your own QueryComponent.
However, you may be able to get away with implementing/using Solr's
FunctionQuery capabilities.
FieldCollapsing is also a likely source of inspiration/help (
http://www.lucidimagination.com/search/?q=Field+Collapsing#/s:email,issues)
As a side note, have you looked at
http://issues.apache.org/jira/browse/SOLR-769 ?
You might also have a look at the de-duplication patch that is
working its way through dev: http://wiki.apache.org/solr/Deduplication
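To make the grouping and reranking step concrete, here is a rough
sketch in plain Python. It is not a QueryComponent and uses no Solr
API. The cluster_of callback, the "topic" label and the "popularity"
field are all hypothetical, and log(1 + value) is an assumption made
here so that a zero field value does not fail.

```python
import math
from collections import defaultdict


def group_and_rerank(docs, cluster_of, field="popularity"):
    """Group docs by cluster label, then sort each group by log(1 + field).

    cluster_of: hypothetical callable mapping a doc to its cluster label.
    field: hypothetical numeric field (assumed non-negative).
    """
    groups = defaultdict(list)
    for doc in docs:
        groups[cluster_of(doc)].append(doc)
    ranked = {}
    for label, members in groups.items():
        # log1p(x) computes log(1 + x), so a field value of 0 is allowed.
        ranked[label] = sorted(
            members, key=lambda d: math.log1p(d[field]), reverse=True
        )
    return ranked


docs = [
    {"id": "a", "topic": "x", "popularity": 10},
    {"id": "b", "topic": "x", "popularity": 100},
    {"id": "c", "topic": "y", "popularity": 0},
]
ranked = group_and_rerank(docs, lambda d: d["topic"])
```

Since the log is monotonic, sorting by it alone gives the same order as
sorting by the raw value. It only changes the ranking once it is
combined with another quantity such as the relevance score.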
How do we go about achieving this? Should we write this logic by
implementing QueryResponseWriter? Also, if we remove duplicate
records, the total number of records actually returned is less than
what was asked for in the query.
Regards,
CI
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search