Hi,

This "search across multiple collections" question has come up a few
times recently:

http://search-lucene.com/m/2Q1BE0IT4Y/&subj=Search+across+multiple+collections
http://search-lucene.com/m/5JQrXIyhQQ1/&subj=Querying+multiple+collections+in+SolrCloud

One important variation of this question is: can one search across
MULTIPLE IDENTICAL collections?

The use case is that you need to index/archive a lot of data, but
because your searches have a time range filter, instead of having 1
massive Collection to search, you really want to have N smaller
Collections, say weekly, so you can search only the smaller
Collection(s) you need.

For example:
A query that limits matches to docs from only the last 48 hours can be
routed only to the Collection for the latest/current week.
If the time range filter needs data from multiple Collections (e.g.
it's for the last 10 days and we have weekly collections), then
IDEALLY, you want to be able to send ONE request to Solr and specify 2
Collections to search and have Solr handle calling each Collection and
merging.
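
To make the routing above concrete, here is a minimal client-side sketch
of picking the weekly Collections that a time range touches.  The
"logs_" prefix and the ISO-week naming scheme are only assumptions for
illustration, not anything Solr prescribes.

```python
from datetime import datetime, timedelta, timezone


def weekly_collections(start, end, prefix="logs_"):
    """Return the names of the weekly collections overlapping [start, end].

    The naming scheme (<prefix><ISO year>_w<ISO week>) is hypothetical.
    """
    if end < start:
        raise ValueError("end precedes start")
    # Snap back to the Monday of the week that contains `start`.
    day = start.date() - timedelta(days=start.weekday())
    names = []
    while day <= end.date():
        year, week, _ = day.isocalendar()
        names.append(f"{prefix}{year}_w{week:02d}")
        day += timedelta(days=7)
    return names


now = datetime(2013, 3, 15, 12, 0, tzinfo=timezone.utc)
last_48_hours = weekly_collections(now - timedelta(hours=48), now)
last_10_days = weekly_collections(now - timedelta(days=10), now)
```

With these dates the 48-hour range maps to the current week's Collection
only, while the 10-day range maps to two weekly Collections, which is
the case where one request naming both would be handy.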

Yes, in the case of full-text search global IDF would ideally be used,
but Solr is increasingly used for analytical queries and not just
full-text queries, and one doesn't need global IDF for those.

So: Can one query *multiple identical* Collections with one request
from the client?
If not: should I open a new JIRA issue?

I see https://issues.apache.org/jira/browse/SOLR-4497 allows aliasing
multiple Collections, which covers the use case where you know which
Collections might be queried.  But in some cases you don't know that
ahead of time, so you can't prepare all the aliases.  In that case you
would want to be able to list all Collections to search in the request
and that's it.

Maybe this is already doable?

Thanks,
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm
