Hi Sangeetha,
What is certain is that this is not going to work: with 200-300K docs/hour,
there will be >50 commits/second, which leaves <20 ms for each
doc+commit.
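To make the arithmetic explicit, here is a small sketch that just divides the rates from your mail (the class and method names are made up for illustration). It comes out at roughly 56 to 83 commits/second, i.e. about 18 ms down to 12 ms per doc+commit.

```java
public class CommitBudget {
    // Commits per second if every document gets its own commit.
    public static double commitsPerSecond(long docsPerHour) {
        return docsPerHour / 3600.0;
    }

    // Milliseconds available for one index request plus its commit.
    public static double millisPerDoc(long docsPerHour) {
        return 3_600_000.0 / docsPerHour;
    }

    public static void main(String[] args) {
        // The two ends of the range quoted in the original question.
        for (long rate : new long[] {200_000, 300_000}) {
            System.out.printf("%d docs/hour -> %.1f commits/s, %.1f ms per doc+commit%n",
                    rate, commitsPerSecond(rate), millisPerDoc(rate));
        }
    }
}
```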
What you can do is let Solr handle commits and maybe use real-time get to
verify that a doc is in Solr, or do some periodic sanity checks.
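A rough sketch of the verification idea, not a definitive implementation: send a batch without an explicit commit, then look each id up and only clear from MQ what could be verified. The `Index` interface and `FakeIndex` class are hypothetical stand-ins so the example runs on its own; with SolrJ the corresponding calls would presumably be `SolrClient.add(...)` and `SolrClient.getById(id)`. Real-time get relies on the update log being enabled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchVerify {
    /** Hypothetical stand-in for the Solr client; not a real SolrJ type. */
    public interface Index {
        void addAll(Map<String, String> docsById); // with SolrJ: SolrClient.add(docs)
        boolean exists(String id);                 // with SolrJ: SolrClient.getById(id) != null
    }

    /** Sends one batch with no explicit commit and returns the ids that could not be verified. */
    public static List<String> indexAndFindMissing(Index index, Map<String, String> batch) {
        index.addAll(batch);
        List<String> missing = new ArrayList<>();
        for (String id : batch.keySet()) {
            if (!index.exists(id)) {
                missing.add(id); // keep these in MQ and retry later
            }
        }
        return missing;
    }

    /** In-memory fake, only here to make the sketch runnable; silently drops one id. */
    public static class FakeIndex implements Index {
        private final Map<String, String> stored = new HashMap<>();
        private final String droppedId;

        public FakeIndex(String droppedId) {
            this.droppedId = droppedId;
        }

        @Override
        public void addAll(Map<String, String> docsById) {
            for (Map.Entry<String, String> e : docsById.entrySet()) {
                if (!e.getKey().equals(droppedId)) {
                    stored.put(e.getKey(), e.getValue());
                }
            }
        }

        @Override
        public boolean exists(String id) {
            return stored.containsKey(id);
        }
    }

    public static void main(String[] args) {
        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("doc-1", "first");
        batch.put("doc-2", "second");
        batch.put("doc-3", "third");
        // Simulate losing doc-2 and report which ids must stay in MQ.
        System.out.println("missing: " + indexAndFindMissing(new FakeIndex("doc-2"), batch));
    }
}
```

The point of the design is that the MQ acknowledgement is tied to a lookup by id rather than to a hard commit, so commits can be left to Solr.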
Are you doing document updates, so that the ordering of Solr updates is the
reason why you commit each doc before moving to the next one?
Regards,
Emir
On 02.03.2016 09:06, sangeetha.subraman...@gtnexus.com wrote:
Hi All,
I am trying to understand how we can have commits issued to Solr while
indexing documents. Around 200K to 300K documents per hour, with an average size of
10 KB each, will be getting into Solr. Java code fetches each document from
MQ and streams it to Solr. The problem is that the client code issues a
hard commit after each document that is sent to Solr for indexing, and it waits
for the response from Solr to get assurance that the document was indexed
successfully. Only if it gets an OK status from Solr is the document cleared out
from MQ.
As far as I understand, doing a commit after each document is an expensive
operation. But we need to make sure that all the documents which are put into
MQ get indexed in Solr. Is there any other way of getting this done? Please
let me know.
If we do batch indexing, is there any way we can identify whether some
documents were missed during indexing?
Thanks
Sangeetha
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/