On Tue, Oct 11, 2011 at 6:55 PM, Brandon Ramirez < brandon_rami...@elementk.com> wrote:
> Using a shared volume crossed my mind too, but I discarded the idea because
> of literature I have read about Lucene performing poorly against remote file
> systems. But then I suppose a SAN wouldn't be a remote file system in the
> same sense as an NFS-mounted NAS or similar.
>

I have had one major customer who tested Solr on MapR via an NFS export, and
they reported very good results.

I think that the common wisdom of "don't use Lucene on NFS" comes from two
sources, both now somewhat dated:

- the old problem with Lucene assuming that an open file would not be
  deleted. This has been fixed approximately forever.

- crummy NFS servers give results that are, well, crummy.

> Should I be concerned about two solr instances on two machines having the
> same SAN-based index open, as long as only one of them is receiving
> requests?
>

I think that you would want to reopen the index whenever there is a master
switch.

If you aren't doing updates, then both can open the same index. If one is
doing updates, then you can use the second for searches, as long as you make
sure that the indexer is configured to keep old files around long enough for
the second to reopen the new version of the index.

If you are using a system like MapR (or even a NetApp), then you can use
snapshots to avoid deletions. With a setup like that, you would have the
reader switch to a new snapshot each time one appears, and set the writer
directory to snapshot every minute or every few minutes. Whether that fits
your need is a different question, of course.

I would strongly recommend coordinating who is the writer and who is not by
using something like a leader election in ZooKeeper.
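
To make the "do not delete old files for a while" rule concrete, here is a minimal sketch in plain Python. It is not Lucene's or Solr's API; the `Commit` type, the `commits_to_delete` function, and the grace period are all hypothetical names chosen for illustration. The assumption is that a commit point becomes obsolete when the next one is written, and that it should then survive for a grace period so that a reader on the other machine has time to reopen the newer version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    generation: int   # increasing commit number (hypothetical field)
    timestamp: float  # when this commit was written, in seconds


def commits_to_delete(commits, now, grace_seconds):
    """Return the commits whose files may be removed.

    A commit is superseded at the moment the next commit is written.
    It is only eligible for deletion once it has been superseded for
    longer than grace_seconds. The newest commit is never deleted.
    """
    ordered = sorted(commits, key=lambda c: c.generation)
    doomed = []
    # Pair each commit with its successor; the newest has no successor
    # and therefore never appears as `older` in this loop.
    for older, newer in zip(ordered, ordered[1:]):
        superseded_at = newer.timestamp
        if now - superseded_at > grace_seconds:
            doomed.append(older)
    return doomed


if __name__ == "__main__":
    history = [Commit(1, 0.0), Commit(2, 100.0), Commit(3, 200.0)]
    # With a 120 second grace period at t=250, only generation 1 has
    # been superseded for long enough to be removed.
    for commit in commits_to_delete(history, now=250.0, grace_seconds=120.0):
        print("safe to delete generation", commit.generation)
```

The grace period should be longer than the longest interval the reader can go without reopening the index; otherwise the reader can still end up holding files that the writer has removed.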