Hi all,

I have a few questions about SolrCloud and how it behaves in an environment where several concurrent clients update the same collection.

We have a SolrCloud 4.8.1 collection that stores a catalog of millions of products (index size about 20 GB). Currently there is only one SolrJ client committing all the modifications; it takes care of updating all product descriptions, attributes, prices, availabilities, etc. Every few minutes this client submits a batch of documents (thousands or more) that have to be updated.

This is the current updateHandler configuration:

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxTime>300000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
  </updateHandler>

Things work well, even when a large number of users are searching.

However, in order to have price updates applied as soon as possible, we're planning to add a second client that, even while the first client is running, submits many atomic price updates.

Now I'm worried about having two clients writing to the same collection. Even if those clients can be orchestrated using a kind of semaphore, I'm afraid that the commits for those atomic updates could come too quickly or, in the worst case, overlap with those of the first client.

As far as I have read, continuous commits can seriously degrade the performance of the search engine. If commits from the two clients overlap, this could even compromise the collection's integrity, given that there is no transaction isolation in Solr.

To be clear: what happens if the second client does an atomic update while the first client is doing a full delete and re-index of the entire collection?

My idea is that it is better to always have only one client updating the collection, maybe using near-real-time indexing, but always only one client.

Could anyone please confirm my concerns, or is there a workaround I have not considered that would allow more clients?

Best regards,
Vincenzo

--
Vincenzo D'Amore