Hi Sangeetha,

I don't think you need to commit after every document add.
You can rely on Solr's transaction log feature. If you are using SolrCloud,
a transaction log is mandatory, so every document gets written to the tlog.
If a node crashes before the documents were committed, they are still
present in the tlog and Solr will replay them on startup.

Also, if you are using SolrCloud and have multiple replicas, you should use
the min_rf feature. Solr then reports the achieved replication factor in the
update response, so your client can check that N replicas acknowledged the
write before treating it as a success -
https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance

On Wed, Mar 2, 2016 at 3:41 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Sangeetha,
> What is sure is that it is not going to work - with 200-300K doc/hour,
> there will be >50 commits/second, meaning there is <20 ms for
> doc+commit.
> What you can do is let Solr handle commits and maybe use real-time get to
> verify the doc is in Solr, or do some periodic sanity checks.
> Are you doing document updates, so that keeping Solr updates in order is
> the reason why you commit each doc before moving to the next doc?
>
> Regards,
> Emir
>
>
> On 02.03.2016 09:06, sangeetha.subraman...@gtnexus.com wrote:
>
>> Hi All,
>>
>> I am trying to understand how we can have commits issued to Solr while
>> indexing documents. Around 200K to 300K documents per hour, with an
>> average size of 10 KB each, will be getting into Solr. Java code fetches
>> each document from MQ and streams it to Solr. The problem is that the
>> client code issues a hard commit after each document which is sent to
>> Solr for indexing, and it waits for the response from Solr to get
>> assurance that the document got indexed successfully. Only if it gets an
>> OK status from Solr is the document cleared out from MQ.
>>
>> As far as I understand, doing a commit after each document is an
>> expensive operation. But we need to make sure that all the documents
>> which are put into MQ get indexed in Solr. Is there any other way of
>> getting this done? Please let me know.
>> If we do batch indexing, is there any chance we can identify whether some
>> documents were missed from indexing?
>>
>> Thanks
>> Sangeetha
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>

--
Regards,
Varun Thacker
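
To make the arithmetic in Emir's reply concrete, here is a minimal Java sketch of the time budget per document when every add is followed by its own commit. It assumes documents arrive evenly over the hour and are processed one at a time; the class and method names (CommitBudget, perSecond, millisPerDoc) are made up for illustration and are not part of Solr or SolrJ.

```java
public final class CommitBudget {
    private CommitBudget() {}

    /** Documents per second for a given hourly rate (equals commits per second with one commit per doc). */
    public static double perSecond(long docsPerHour) {
        return docsPerHour / 3600.0;
    }

    /** Milliseconds available for one add plus one commit, assuming strictly serial processing. */
    public static double millisPerDoc(long docsPerHour) {
        return 3_600_000.0 / docsPerHour;
    }

    public static void main(String[] args) {
        // The two ends of the range quoted in the thread.
        for (long rate : new long[] {200_000L, 300_000L}) {
            System.out.printf("%d docs/hour: %.1f commits/s, %.1f ms per doc+commit%n",
                    rate, perSecond(rate), millisPerDoc(rate));
        }
    }
}
```

For the 200K to 300K docs/hour range this works out to roughly 56 to 83 commits per second, or about 18 down to 12 ms per document, which is where the ">50 commits/second" and "<20 ms" figures come from.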