Hi Everyone, I am wondering if there is any best practice regarding re-indexing documents in SolrCloud 6.0.0 without making the data (or the underlying collection) temporarily unavailable. Wiping all documents in a collection and performing a full re-indexing is not a viable alternative for us.
Say we have a large SolrCloud cluster whose nodes host *several hundred* collections, with document counts ranging from a couple of thousand to as many as 20 million, each document having 200-300 fields, and a background batch loader job that fetches data from a variety of source systems.

We have to keep the cluster and ALL collections online all the time (24 x 365): we cannot allow queries to be blocked while data in a collection is being updated, and we cannot load everything in a single jumbo commit (the replication could overload the cluster).

One solution I could imagine is storing an additional "load time-stamp" field in every document and having the client (interactive query) application extend every query with an additional restriction that requires the document's "load time-stamp" to equal the latest known completed "load time-stamp". This concept would work as follows:

1.) The batch job simply starts loading new documents with the new "load time-stamp". Existing documents are not touched.
2.) The client (interactive query) application keeps using the old data from the previous load, since all queries are still restricted to the old "load time-stamp".
3.) Once the load is complete, the batch job stores the new "load time-stamp" as the one to be used (e.g. in a separate collection). From this point on, all queries return the most up-to-date documents.
4.) The batch job purges all documents from the collection whose "load time-stamp" differs from the latest one.

This approach seems implementable. However, I definitely want to avoid reinventing the wheel, and I am wondering whether there is a better solution or a built-in SolrCloud feature that achieves the same or something similar.

Thanks,
Peter
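
P.S. To make the four steps concrete, here is a minimal Python sketch of what I have in mind. It does not talk to Solr at all: the `InMemoryCollection` class is only a stand-in for a collection, and the field name `load_ts` and all function names are made up for illustration. The two helper functions show roughly what the filter query and the delete-by-query body could look like. The sketch also assumes that the unique key of a document includes the load time-stamp, since otherwise re-adding a document with the same id would overwrite the old version right away.

```python
# Hypothetical field name, chosen only for this sketch.
LOAD_TS_FIELD = "load_ts"


def query_params(user_query, active_load_ts):
    """Parameters for a search request restricted to the active load."""
    return {"q": user_query, "fq": f"{LOAD_TS_FIELD}:{active_load_ts}"}


def purge_command(active_load_ts):
    """JSON body for a delete-by-query that removes every other load."""
    return {"delete": {"query": f"*:* -{LOAD_TS_FIELD}:{active_load_ts}"}}


class InMemoryCollection:
    """Stand-in for a collection, used only to walk through the four steps."""

    def __init__(self):
        self.docs = []
        # The "latest completed load" pointer; in the real setup this
        # would live somewhere shared, e.g. in a separate collection.
        self.active_load_ts = None

    def load(self, docs, load_ts):
        # Step 1: add new documents, leave existing ones untouched.
        self.docs.extend({**d, LOAD_TS_FIELD: load_ts} for d in docs)

    def search(self):
        # Step 2: queries only see documents of the active load.
        return [d for d in self.docs if d[LOAD_TS_FIELD] == self.active_load_ts]

    def activate(self, load_ts):
        # Step 3: switch all queries to the new load in one move.
        self.active_load_ts = load_ts

    def purge(self):
        # Step 4: drop everything that is not part of the active load.
        self.docs = [d for d in self.docs if d[LOAD_TS_FIELD] == self.active_load_ts]
```

The point of the design is that step 3 is the only moment at which clients see a change, and it is a single small write, so neither the long-running load in step 1 nor the purge in step 4 affects what queries return.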