so are "core" and "corebak" pointing to the same datadir or do you have the indexing solr instance keep writing to a new directory?
On Fri, May 26, 2017 at 1:53 PM, Robert Haschart <rh...@virginia.edu> wrote:

> The process we use to signal the read-only servers is to submit a CREATE
> request pointing to the newly created index, with a name like corebak, then
> do a SWAP request between core and corebak, then submit an UNLOAD request
> for corebak, which is now pointing at the previous version.
>
> The individual servers cannot do a merge on their own, since they mount
> the NAS read-only.  Nothing they can do will affect the index.  I believe
> this allows each machine to cache much of the index in memory, with no fear
> that their cache will be made invalid by one of the others.
>
> -Bob Haschart
> University of Virginia Library
>
>
> On 5/26/2017 12:52 PM, David Hastings wrote:
>
>> I'm curious about this.  When you say "and signal the three Solr servers
>> when the updated index is available," how does it send the signal?  I.e.
>> what command, just a reload?  Also, what prevents them from doing a merge
>> on their own?  Thanks
>>
>> On Fri, May 26, 2017 at 12:09 PM, Robert Haschart <rh...@virginia.edu>
>> wrote:
>>
>>> We have run using this exact scenario for several years.  We have three
>>> Solr servers sitting behind a load balancer, with all three accessing the
>>> same Solr index stored on read-only network addressable storage.  A
>>> fourth machine is used to update the index (typically daily) and signal
>>> the three Solr servers when the updated index is available.  Our index is
>>> primarily bibliographic information; it contains about 8 million
>>> documents and is about 30GB in size.  We've used this configuration since
>>> before ZooKeeper and cloud-based Solr, or even Java-based master/slave
>>> replication, were available.  I cannot say whether this configuration has
>>> any benefits over the currently accepted way of load balancing, but it
>>> has worked well for us for several years and we've never had a corrupted
>>> index problem.
>>>
>>> -Bob Haschart
>>> University of Virginia Library
>>>
>>>
>>> On 5/23/2017 10:05 PM, Shawn Heisey wrote:
>>>
>>>> On 5/19/2017 8:33 AM, Ravi Kumar Taminidi wrote:
>>>>
>>>>> Hello.  Scenario: Currently we have 2 Solr servers running on 2
>>>>> different machines (Linux).  Is there any way we can locate the core
>>>>> on a NAS or network shared drive so that both Solr instances use the
>>>>> same index?
>>>>>
>>>>> Let me know if there are any performance issues; the size of our index
>>>>> is approx 1GB.
>>>>>
>>>> I think it's a very bad idea to try to share indexes between multiple
>>>> Solr instances.  You can override the locking and get it to work, and
>>>> you may be able to find advice on the Internet about how to do it.  I
>>>> can tell you that it's outside the design intent for both Lucene and
>>>> Solr.  Lucene works aggressively to *prevent* multiple processes from
>>>> sharing an index.
>>>>
>>>> In general, network storage is not a good idea for Solr.  There's added
>>>> latency for accessing any data, and frequently the filesystem won't
>>>> support the kind of locking that Lucene wants to use, but the biggest
>>>> potential problem is disk caching.  Solr/Lucene is absolutely reliant on
>>>> disk caching in the Solr server's local memory for good performance.  If
>>>> the network filesystem cannot be cached by the client that has mounted
>>>> the storage, which I believe is the case for most network filesystem
>>>> types, then you're reliant on disk caching in the network server(s).
>>>> For VERY large indexes, which is really the only viable use case I can
>>>> imagine for network storage, it is highly unlikely that the network
>>>> server(s) will have enough memory to effectively cache the data.
>>>>
>>>> Solr has explicit support for HDFS storage, but as I understand it, HDFS
>>>> includes the ability for a client to allocate memory that gets used
>>>> exclusively for caching on the client side, which allows HDFS to
>>>> function like a local filesystem in ways that I don't think NFS can.
>>>> Getting back to my advice about not sharing indexes -- even with
>>>> SolrCloud on HDFS, multiple replicas generally do NOT share an index.
>>>>
>>>> A 1GB index is very small, so there's no good reason I can think of to
>>>> involve network storage.  I would strongly recommend local storage, and
>>>> you should abandon any attempt to share the same index data between more
>>>> than one Solr instance.
>>>>
>>>> Thanks,
>>>> Shawn