Good morning,

I have the following situation: I have to index the OCR text of about 550,000 newspaper pages. With an average of 3,500 words per page and one document per word, the number of records is very large.
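To give an idea of the scale, here is a rough back-of-the-envelope calculation in Python. It only uses the figures from this message (550,000 pages, an average of 3,500 words per page, a 30-page issue, 8 writing servers). The variable names are just illustrative, and the results are estimates because the words-per-page figure is an average.

```python
# Figures taken from this message; WORDS_PER_PAGE is an average, so results are estimates.
PAGES = 550_000
WORDS_PER_PAGE = 3_500
PAGES_PER_ISSUE = 30   # example issue size
WRITERS = 8            # servers writing to the single Solr instance

# One Solr document per word.
total_docs = PAGES * WORDS_PER_PAGE
docs_per_issue = PAGES_PER_ISSUE * WORDS_PER_PAGE

# Worst case: every writer has sent a full issue and none has committed yet.
pending_docs_worst_case = WRITERS * docs_per_issue

print(f"total documents: {total_docs:,}")
print(f"documents per 30-page issue: {docs_per_issue:,}")
print(f"uncommitted documents, worst case: {pending_docs_worst_case:,}")
```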
At the moment I have 1 Solr instance and 8 servers that all read from and write to that same instance at the same time. At the beginning everything is fine, but after a while, when I add, delete or commit, I get a timeout error from the Solr server.

I suspect the problem is that I commit a very large number of documents at once: for a 30-page newspaper I send 105,000 adds and only commit at the end. If all 8 servers do this within a short time of each other, I think it creates problems for Solr.

What can I do to solve the problem?

Should I commit after every add?

Is it possible to configure the Solr server so that I only send the add and delete commands and the server performs the commits autonomously, according to the available resources, as it seems to do for the optimize command?

Reading the documentation I found the following configuration, but I don't know whether it solves my problem:

  <deletionPolicy class="solr.SolrDeletionPolicy">
    <str name="maxCommitsToKeep">1</str>
    <str name="maxOptimizedCommitsToKeep">0</str>
    <str name="maxCommitAge">1DAY</str>
  </deletionPolicy>
  <infoStream>false</infoStream>

Thanks for your consideration,
Massimiliano Randazzo