It's possible to get near real-time adds and updates (every two minutes in our case) with a multi-shard setup, if you have a shard dedicated to new content and have the right combination of unique identifiers on your data. I'll respond off-list with a full description of my setup.
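
For readers of the archive, here is a generic sketch of what querying across a batch shard and a new-content shard can look like. It is not the setup mentioned above. It only shows Solr's standard `shards` request parameter, and the host names are hypothetical. Distributed search expects the unique key to be unique across all shards, which is why the identifiers matter.

```python
from urllib.parse import urlencode


def build_sharded_query_url(front_host, shard_hosts, query, rows=10):
    """Return a Solr select URL that fans the query out to every listed shard."""
    params = {
        "q": query,
        "rows": rows,
        # Comma-separated shard list; each entry is host:port/path with no scheme.
        "shards": ",".join(shard_hosts),
    }
    return "http://%s/select?%s" % (front_host, urlencode(params))


# Hypothetical hosts: one shard built by batch indexing, one dedicated to new content.
url = build_sharded_query_url(
    "search1:8983/solr",
    ["batch1:8983/solr", "incremental1:8983/solr"],
    "title:solr",
)
print(url)
```

The instance that receives the request merges the results from both shards, so a small, frequently committed shard can sit next to a large one that is rebuilt on a slower schedule.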

On 7/9/2010 4:41 PM, bbarani wrote:
I have scheduled batch indexing running on the master every 2 days for 3
sources (e.g. s1, s2, s3). Once the batch indexing completes, I replicate
the index to the slave instance, which serves user queries.

A second app posts XML (for s3) directly to the Solr slave instance for
real-time indexing. The posted XML can add or update documents in the slave
index that batch indexing created. Every XML post also produces a
corresponding DB entry, so the same data is available to batch indexing:
when I next batch index s3 (after 2 days) and replicate to the slave, users
should still see all the data. I post to the slave first only so that users
can see the results immediately.

Now I am afraid I might run into an issue when someone kicks off real-time
indexing from the app while batch indexing is in progress, because the batch
run might not pick up the changes made to the slave during that time.

Has anyone faced this kind of scenario?

Ideally I would be able to do real-time (XML post) and batch indexing at the
same time. I don't think I can use shards, because the real-time data may
need to update existing documents in the index, not just add new ones. My
assumption is that shards would work if we maintained separate indexes for
real-time and batch indexing, but not if an XML post needs to update an
existing document.

I also thought of always writing both the XML posts and the batch indexing
to the master and replicating to the slave every 15 seconds. Even in this
case, I suppose Solr will lock the index files while batch indexing is
running, so I won't be able to push XML to the same index at that time.
Please correct me if I am wrong.
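
For reference, a minimal sketch of the kind of XML add message discussed in this post. The field names `id` and `title` and their values are hypothetical. In Solr, adding a document whose unique key matches an existing document replaces that document, so the same message shape covers both adds and updates.

```python
import xml.etree.ElementTree as ET


def build_add_message(fields):
    """Build a Solr XML <add> message containing one document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        # Each field becomes <field name="...">value</field>.
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document from source s3; "id" is assumed to be the unique key.
message = build_add_message({"id": "s3-1001", "title": "Updated title"})
print(message)
```

The message would be sent in an HTTP POST to the index's update handler and followed by a `<commit/>` before the change becomes visible to searches.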
