If I index a bunch of email documents, is there a way to say"show me all
email documents, but only one per To: email address"
so that if there are a total of 10 distinct To: fields in the corpus, I get
back 10 email documents?

I'm aware of http://wiki.apache.org/solr/Deduplication but I want to retain
the ability to search across all of my email documents most of the time, and
only occasionally search for the distinct ones.

Essentially I want to do a
SELECT DISTINCT to_field FROM documents
where a normal search is a
SELECT * FROM documents

Thanks for any pointers.

Reply via email to