We'd all need example data and a sample query to help you.
You can use "group" to group by a field and remove dupes.
If you want to remove dupes you can do something like:
q=field1:DOG AND NOT field2:DOG AND NOT field3:DOG
That will exclude documents where DOG also appears in field2 or field3.
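The effect of those NOT clauses can be sketched client-side in plain Python (the field names are hypothetical and this is just an illustration of the boolean logic, not Solr itself):

```python
# Keep docs matching field1:DOG, but drop any doc where DOG also
# appears in field2 or field3 -- the same logic as the NOT clauses.
def matches(doc, term="DOG"):
    return (doc.get("field1") == term
            and doc.get("field2") != term
            and doc.get("field3") != term)

docs = [
    {"id": 1, "field1": "DOG", "field2": "CAT", "field3": "BIRD"},
    {"id": 2, "field1": "DOG", "field2": "DOG", "field3": "BIRD"},
    {"id": 3, "field1": "CAT", "field2": "DOG", "field3": "DOG"},
]

hits = [d["id"] for d in docs if matches(d)]
```

Only doc 1 survives: doc 2 repeats DOG in field2 and doc 3 doesn't match field1 at all.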
If you don't care if it is i
Thank you so much Alex and Joel for your ideas. I am poring over the
documentation and code now to try to understand it all. A post filter sounds
promising. As 99% of my doc fields are character based, I should try to
complement the collapsing Q parser with an option that compares string
fields.
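For reference, the CollapsingQParserPlugin is applied as a filter query that collapses on a single field (the field name below is hypothetical):

```
q=field1:DOG&fq={!collapse field=dedupKey}
```

Out of the box it collapses on one field, which is why the multi-field string comparison would need custom work.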
As insane as it sounds, I need to process all the results. No one document is
more or less important than another. Only a few hundred "unique" docs will
be sent to the client at any one time, but the users expect to page through
them all.
I don't expect sub-second performance for this task. I'm ju
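If the users really do page through everything, Solr's cursor-based deep paging (available from Solr 4.7) is usually much cheaper than large start offsets; it requires a sort that ends on the uniqueKey field (here assumed to be id):

```
q=*:*&sort=id asc&rows=500&cursorMark=*
```

Each response includes a nextCursorMark value to pass as cursorMark on the following request.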
Do you have a sense of what your typical queries would look like? I mean,
maybe you wouldn't actually need to fetch more than a tiny fraction of
those million documents. Do you only need to determine the top 10 or 20 or
50 unique field value row sets, or do you need to determine ALL unique row
sets?
You may also want to take a look at how AnalyticsQueries can be plugged in.
This won't show you how to do the implementation but it will show you how
you can plugin a custom collector.
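The underlying idea is a delegating collector: your custom collector sees every matching document first and decides what to forward downstream. The pattern itself can be sketched in plain Python (class and method names here are illustrative, not Solr's actual API):

```python
# Sketch of the delegating-collector pattern used by Solr post filters:
# a custom collector wraps a delegate and forwards only the docs it
# chooses. Names are illustrative, not Solr's API.
class ListCollector:
    """Terminal collector that just records doc ids."""
    def __init__(self):
        self.docs = []

    def collect(self, doc_id):
        self.docs.append(doc_id)

class DedupCollector:
    """Forwards only the first doc seen for each key to its delegate."""
    def __init__(self, delegate, key_of):
        self.delegate = delegate
        self.key_of = key_of
        self.seen = set()

    def collect(self, doc_id):
        key = self.key_of(doc_id)
        if key not in self.seen:
            self.seen.add(key)
            self.delegate.collect(doc_id)

index = {1: "a", 2: "b", 3: "a"}      # doc id -> collapse field value
leaf = ListCollector()
collector = DedupCollector(leaf, key_of=index.get)
for doc_id in index:
    collector.collect(doc_id)
```

After the loop, the delegate has seen docs 1 and 2; doc 3 was suppressed because its key duplicates doc 1's.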
http://heliosearch.org/solrs-new-analyticsquery-api/
http://heliosearch.org/solrs-mergestrategy/
Joel Bernstein
Sounds like:
https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results
http://heliosearch.org/the-collapsingqparserplugin-solrs-new-high-performance-field-collapsing-postfilter/
The main issue is your multi-field criteria, so you may need to
extend/override the comparison method.
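Conceptually, extending the comparison to multiple fields means treating the whole tuple of field values as the duplicate key instead of a single value. In Python terms (field names hypothetical, purely an illustration of the comparison logic):

```python
# Multi-field duplicate detection: two docs are duplicates only when
# their whole tuple of compared field values matches. Field names
# are hypothetical.
FIELDS = ("field1", "field2", "field3")

def row_key(doc):
    return tuple(doc.get(f) for f in FIELDS)

def unique_rows(docs):
    seen, out = set(), []
    for doc in docs:
        k = row_key(doc)
        if k not in seen:
            seen.add(k)
            out.append(doc)
    return out

docs = [
    {"field1": "DOG", "field2": "CAT", "field3": "BIRD"},
    {"field1": "DOG", "field2": "CAT", "field3": "BIRD"},
    {"field1": "DOG", "field2": "CAT", "field3": "FISH"},
]
```

Here the first two docs collapse into one row and the third survives, since only its field3 differs.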