Hi Sangeetha,

I don't think you need to commit after every document add.

You can rely on Solr's transaction log feature. If you are using SolrCloud,
a transaction log is mandatory, so every document gets written
to the tlog. If a node crashes before the documents were committed,
Solr will replay them from the tlog on startup.
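
As a rough sketch of what that means for the client (plain Java, no SolrJ):
send adds in batches without an explicit commit, and clear the messages from
MQ only once Solr has answered the add with success. DocSink and
BatchingIndexer are hypothetical names used for illustration, and DocSink
stands in for whatever call actually sends the add to Solr.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the call that sends an add (no commit) to Solr.
// Returns true only when Solr answered the add with a success status.
interface DocSink {
    boolean add(List<String> docs);
}

// Buffers documents from the queue and releases them only after the whole
// batch has been accepted, so a failed batch is never acknowledged.
class BatchingIndexer {
    private final DocSink sink;
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private final List<String> acked = new ArrayList<>();

    BatchingIndexer(DocSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    // Called for each message taken from the queue.
    void onMessage(String doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Sends the pending batch. On failure the documents stay pending and are
    // resent together with the next flush.
    boolean flush() {
        if (pending.isEmpty()) {
            return true;
        }
        boolean ok = sink.add(new ArrayList<>(pending));
        if (ok) {
            acked.addAll(pending); // real code would acknowledge the MQ messages here
            pending.clear();
        }
        return ok;
    }

    List<String> acked() {
        return acked;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Because nothing is acknowledged until the batch succeeds, a failed batch can
simply be resent; visibility for searches is then left to autoCommit and
autoSoftCommit on the Solr side.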

Also, if you are using SolrCloud and have multiple replicas, you should use
the min_rf feature to check that at least N replicas acknowledged the write
before you treat it as a success:
https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
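
A minimal sketch of the client-side check, assuming the update response
reports the achieved replication factor as described on that page.
ReplicatedSink and MinRfWriter are hypothetical names, not SolrJ classes; the
interface stands in for sending the add with min_rf set and reading the
achieved value back from the response.

```java
import java.util.List;

// Hypothetical stand-in: sends the add with min_rf set and returns the
// replication factor the response reports as achieved.
interface ReplicatedSink {
    int add(List<String> docs, int minRf);
}

class MinRfWriter {
    // Resends the batch until enough replicas acknowledged it or the attempts
    // run out. Returns true only if the minimum was reached.
    static boolean addWithMinRf(ReplicatedSink sink, List<String> docs,
                                int minRf, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int achieved = sink.add(docs, minRf);
            if (achieved >= minRf) {
                return true;
            }
        }
        return false; // caller keeps the messages in MQ and tries again later
    }
}
```

Resending should be harmless as long as the documents carry a uniqueKey,
since a repeated add then overwrites the earlier copy instead of duplicating
it.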

On Wed, Mar 2, 2016 at 3:41 PM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:

> Hi Sangeetha,
> What is certain is that this is not going to work: at 200-300K docs/hour
> there will be >50 commits/second, leaving <20 ms for each doc+commit.
> What you can do is let Solr handle commits, and maybe use real-time get to
> verify a doc is in Solr or run periodic sanity checks.
> Are you doing document updates, so that needing them applied in order is
> the reason you commit each doc before moving on to the next?
>
> Regards,
> Emir
>
>
> On 02.03.2016 09:06, sangeetha.subraman...@gtnexus.com wrote:
>
>> Hi All,
>>
>> I am trying to understand how we should issue commits to SOLR while
>> indexing documents. Around 200K to 300K documents per hour, with an average
>> size of 10 KB each, will be going into SOLR. Java code fetches each
>> document from MQ and streams it to SOLR. The problem is that the client code
>> issues a hard commit after each document sent to SOLR for indexing
>> and waits for the response from SOLR to confirm whether the
>> document got indexed successfully. Only if it gets an OK status from SOLR
>> is the document cleared out from MQ.
>>
>> As far as I understand, doing a commit after each document is an expensive
>> operation. But we need to make sure that all the documents which are put
>> into MQ get indexed in SOLR. Is there any other way of getting this done?
>> Please let me know.
>> If we do batch indexing, is there any chance we can identify whether some
>> documents were missed from indexing?
>>
>> Thanks
>> Sangeetha
>>
>>
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>


-- 


Regards,
Varun Thacker
