scanning all documents in the collection

Matteo Grolla Mon, 02 Feb 2015 05:16:36 -0800

Hi,
I'm thinking about having an instance of Solr (SolrA) with all fields stored
and only the id indexed, in addition to the normal production instance of
Solr (SolrB) that is used for searches.
This would allow me to read only what changed since the previous crawl, update
SolrA, and send the full document to SolrB, without forcing SolrB to store all
fields.
In addition, I have some batch jobs that work on the whole collection; running
them against SolrA would allow me to detect the documents that changed and
submit only those to SolrB.
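
A minimal sketch of how that change detection might look, assuming each document read from SolrA is a plain dict with an `id` key and that a fingerprint of every document from the previous run is kept somewhere. The names `fingerprint`, `changed_docs` and `previous` are hypothetical, not Solr APIs, and hashing the stored fields is only one possible approach:

```python
import hashlib
import json


def fingerprint(doc, ignore=("_version_",)):
    """Return a stable hash of a document's fields.

    Fields listed in `ignore` are skipped; Solr's `_version_` changes on
    every update, so it would make unchanged content look different.
    """
    kept = {k: v for k, v in doc.items() if k not in ignore}
    canonical = json.dumps(kept, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def changed_docs(docs, previous):
    """Yield documents whose fingerprint differs from the previous run.

    `previous` maps document id -> fingerprint from the last run
    (an assumption: where it is persisted is up to the batch job).
    New documents have no previous fingerprint and are yielded too.
    """
    for doc in docs:
        if previous.get(doc["id"]) != fingerprint(doc):
            yield doc
```

Only the documents yielded by `changed_docs` would then need to be submitted to SolrB.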
The point is that to run these jobs I'll need to scan through all the documents
in SolrA: I'll query on *:* and then go through all the pages, which is not the
typical usage of Solr.
SolrA will contain a few tens of GB of data coming from hundreds of thousands
of docs.
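
For reference, a full scan like this could be sketched with Solr's cursorMark paging (available since Solr 4.7) instead of ever-increasing `start` offsets. This is only a sketch under assumptions: the uniqueKey field is taken to be `id` (the sort must include it), the page size is arbitrary, and the helper names `http_fetch` and `scan_all` are made up for illustration:

```python
import json
import urllib.parse
import urllib.request


def http_fetch(base_url, params):
    """Send one /select request to a Solr core and return the parsed JSON."""
    url = base_url + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def scan_all(fetch, rows=500, unique_key="id"):
    """Yield every document in the index, page by page, via cursorMark.

    `fetch` is any callable that takes the request parameters as a dict
    and returns the parsed Solr response.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        data = fetch({
            "q": "*:*",
            "sort": unique_key + " asc",  # must include the uniqueKey field
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        })
        for doc in data["response"]["docs"]:
            yield doc
        next_cursor = data["nextCursorMark"]
        if next_cursor == cursor:
            # Solr returns the same cursor once there is nothing left.
            break
        cursor = next_cursor
```

Something like `for doc in scan_all(lambda p: http_fetch("http://localhost:8983/solr/SolrA", p)): ...` would then walk the whole index; the URL is just a placeholder.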
Do you think I'm going to run into trouble using Solr this way?
I'd like to use Solr (for SolrA) for ease of maintenance, because our sysadmins
are already trained on Solr.


thanks
