On 1/26/2018 4:23 AM, Vincenzo D'Amore wrote:
> The first client does the following:
>
> 1. rollbacks all add/deletes made to the index since the last commit (in
> case previous client execution was completed unsuccessfully).
> 2. reads data from sql server
> 3. updates solr documents
> 4. manually commits
>
> And *important*, once a day, the first client deletes all the existing
> documents and reindex the entire collection from scratch.
>
> The second client is simpler, it manually commits after every atomic update.

The fact that one client is deleting everything and reindexing changes the landscape dramatically.

Since I do not know anything about your setup, I'll make up a similar scenario and describe what I see as the potential problems.

Let's say that this theoretical index contains one million documents. A full reindex of this index takes 2 hours and starts at midnight. While the reindex is happening, the first client doesn't do "normal" updates. The second client runs every ten minutes (x:00, x:10, etc), and is completely unaware of what the first client is doing.

At 12:01 AM, the full delete has happened to the "under construction" version of the index, and the reindex has been running for one minute. Everything is fine, anyone searching will have the full index available.

At 12:10 AM, let's imagine that the second client is going to update one document with the atomic update feature. If the full reindex has indexed that document, this will work, but if it hasn't, the atomic update is going to fail. For the purposes of this scenario, let's assume that the atomic update succeeds, and the second client does its commit.

When the second client's commit finishes, the index will have a little over 80000 documents in it, instead of one million, because all the documents were deleted and the reindex is only about eight percent complete. The same thing would also happen when autoSoftCommit gets triggered after an update, if autoSoftCommit is configured.

If the second client can be paused while the first client is reindexing, and you don't configure autoSoftCommit, then everything will be fine. But if the second client does its work while the reindex is underway, there will be problems.

Separate side issue: The fact that your first client does rollbacks could potentially roll back changes made by the second client, unless you can guarantee that the second client will wait until the first client is idle.

Thanks,
Shawn
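
P.S. The "little over 80000" figure in the scenario above is just proportional arithmetic. Here is a minimal sketch in Python, assuming the reindex adds documents at a constant rate (a simplification on my part). The function name is made up for illustration, and nothing here talks to Solr:

```python
def visible_docs_after_commit(total_docs, reindex_minutes, minutes_elapsed):
    """Estimate how many documents a commit would expose partway through
    a delete-everything-then-reindex run.

    Assumption (not from any real setup): documents are re-added at a
    constant rate over the whole reindex window.
    """
    # Clamp elapsed time to the reindex window [0, reindex_minutes].
    elapsed = max(0, min(minutes_elapsed, reindex_minutes))
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_docs * elapsed // reindex_minutes


# Scenario numbers: one million documents, 2-hour reindex,
# second client commits ten minutes in.
print(visible_docs_after_commit(1_000_000, 120, 10))
```

With the scenario's numbers this prints 83333, about eight percent of the full index, which is what searchers would see once the second client's commit opens a new searcher.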