In my case, I want to filter out "duplicate" docs so that the returned
docs are unique w/ respect to a certain field (not the schema's unique
key field, of course): a "duplicate" doc here is one that has the same
value for a checksum field as one of the docs already in the results.
It would be great if I could somehow express that w/ a query, but I
don't think that's possible, since whether a doc is a duplicate depends
on the other docs in the result set rather than on the doc itself.
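
To make the idea concrete, here is a rough sketch of the filtering logic
I have in mind. It is plain Java and does not touch any Solr API: the
DocFilter interface, the ChecksumDedupFilter and DocFilters classes, and
the representation of a doc as a Map are all hypothetical names I made
up for illustration, and keeping docs that have no checksum value is an
assumption. How to hook something like this into a SearchComponent is
exactly the part I don't understand yet.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical filter interface, modeled on java.io.FileFilter; not a Solr API. */
interface DocFilter {
    boolean accept(Map<String, Object> doc);
}

/** Accepts a doc only if its value for the given field has not been seen before. */
final class ChecksumDedupFilter implements DocFilter {
    private final String field;
    private final Set<Object> seen = new HashSet<>();

    ChecksumDedupFilter(String field) {
        this.field = field;
    }

    @Override
    public boolean accept(Map<String, Object> doc) {
        Object value = doc.get(field);
        // Assumption: docs without the field are never treated as duplicates.
        // Set.add returns false when the value was already present.
        return value == null || seen.add(value);
    }
}

/** Hypothetical helper that applies a filter to docs in result order. */
final class DocFilters {
    private DocFilters() {}

    static List<Map<String, Object>> apply(List<Map<String, Object>> docs, DocFilter filter) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            if (filter.accept(doc)) {
                kept.add(doc);
            }
        }
        return kept;
    }
}
```

The filter is stateful, so a fresh instance would be needed per request.
Note that the first doc with a given checksum wins, which means the
outcome depends on the sort order of the results.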

On Thu, Feb 24, 2011 at 5:11 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> Hmm, depending on what you are actually needing to do, can you do it with a 
> simple fq param to filter out what you want filtered out, instead of needing 
> to write custom Java as you are suggesting? It would be a lot easier to just 
> use an fq.
>
> How would you describe the documents you want to filter from the query 
> results page?  Can that description be represented by a Solr query you can 
> already represent using the lucene, dismax, or any other existing query? If 
> so, why not just use a negated fq describing what to omit from the results?
> ________________________________________
> From: Babak Farhang [farh...@gmail.com]
> Sent: Thursday, February 24, 2011 6:58 PM
> To: solr-user
> Subject: query results filter
>
> Hi everyone,
>
> I have some existing solr cores that for one reason or another have
> documents that I need to filter from the query results page.
>
> I would like to do this inside Solr instead of doing it on the
> receiving end, in the client.  After searching the mailing list
> archives and Solr wiki, it appears you do this by registering a custom
> SearchHandler / SearchComponent with Solr.  Still, I don't quite
> understand how this machinery fits together.  Any suggestions / ideas
> / pointers much appreciated!
>
> Cheers,
> -Babak
>
> ~~
>
> Ideally, I'd like to find / code a solution that does the following:
>
> 1. A request handler that works like the StandardRequestHandler but
> which allows an optional DocFilter (say, modeled like the
> java.io.FileFilter interface)
> 2. Allows current pagination to work transparently.
> 3. Works transparently with distributed/sharded queries.
>
