If it makes you feel better, I also considered this approach when I was in the same situation, with a separate indexer and searcher on one physical Linux machine.
My main concern was "re-using" the FS cache between both instances - If I replicated to myself there would be two independent copies of the index, FS-cached separately.

I like the suggestion of using autoCommit to reload the index. If I'm reading that right, you'd set an autoCommit on 'zero docs changing', or just 'every N seconds'? Did that work?

Best of luck!

Tim

On 5 June 2013 10:19, Roman Chyla <roman.ch...@gmail.com> wrote:

> So here it is for a record how I am solving it right now:
>
> Write-master is started with: -Dmontysolr.warming.enabled=false
> -Dmontysolr.write.master=true -Dmontysolr.read.master=http://localhost:5005
> Read-master is started with: -Dmontysolr.warming.enabled=true
> -Dmontysolr.write.master=false
>
>
> solrconfig.xml changes:
>
> 1. all index changing components have this bit,
> enable="${montysolr.master:true}" - ie.
>
> <updateHandler class="solr.DirectUpdateHandler2"
>     enable="${montysolr.master:true}">
>
> 2. for cache warming de/activation
>
> <listener event="newSearcher"
>     class="solr.QuerySenderListener"
>     enable="${montysolr.enable.warming:true}">...
>
> 3. to trigger refresh of the read-only-master (from write-master):
>
> <listener event="postCommit"
>     class="solr.RunExecutableListener"
>     enable="${montysolr.master:true}">
>   <str name="exe">curl</str>
>   <str name="dir">.</str>
>   <bool name="wait">false</bool>
>   <arr name="args"> <str>${montysolr.read.master:http://localhost}/solr/admin/cores?wt=json&action=RELOAD&core=collection1</str></arr>
> </listener>
>
> This works, I still don't like the reload of the whole core, but it seems
> like the easiest thing to do now.
>
> -- roman
>
>
> On Wed, Jun 5, 2013 at 12:07 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
>
> > Hi Peter,
> >
> > Thank you, I am glad to read that this usecase is not alien.
> >
> > I'd like to make the second instance (searcher) completely read-only, so I
> > have disabled all the components that can write.
> >
> > (being lazy ;)) I'll probably use
> > http://wiki.apache.org/solr/CollectionDistribution to call the curl after
> > commit, or write some IndexReaderFactory that checks for changes
> >
> > The problem with calling the 'core reload' - is that it seems lots of work
> > for just opening a new searcher, eeekkk...somewhere I read that it is cheap
> > to reload a core, but re-opening the index searches must be definitely
> > cheaper...
> >
> > roman
> >
> >
> > On Wed, Jun 5, 2013 at 4:03 AM, Peter Sturge <peter.stu...@gmail.com> wrote:
> >
> >> Hi,
> >> We use this very same scenario to great effect - 2 instances using the same
> >> dataDir with many cores - 1 is a writer (no caching), the other is a
> >> searcher (lots of caching).
> >> To get the searcher to see the index changes from the writer, you need the
> >> searcher to do an empty commit - i.e. you invoke a commit with 0 documents.
> >> This will refresh the caches (including autowarming), [re]build the
> >> relevant searchers etc. and make any index changes visible to the RO
> >> instance.
> >> Also, make sure to use <lockType>native</lockType> in solrconfig.xml to
> >> ensure the two instances don't try to commit at the same time.
> >> There are several ways to trigger a commit:
> >> Call commit() periodically within your own code.
> >> Use autoCommit in solrconfig.xml.
> >> Use an RPC/IPC mechanism between the 2 instance processes to tell the
> >> searcher the index has changed, then call commit when called (more complex
> >> coding, but good if the index changes on an ad-hoc basis).
> >> Note, doing things this way isn't really suitable for an NRT environment.
> >>
> >> HTH,
> >> Peter
> >>
> >>
> >>
> >> On Tue, Jun 4, 2013 at 11:23 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> >>
> >> > Replication is fine, I am going to use it, but I wanted it for instances
> >> > *distributed* across several (physical) machines - but here I have one
> >> > physical machine, it has many cores. I want to run 2 instances of solr
> >> > because I think it has these benefits:
> >> >
> >> > 1) I can give less RAM to the writer (4GB), and use more RAM for the
> >> > searcher (28GB)
> >> > 2) I can deactivate warming for the writer and keep it for the searcher
> >> > (this considerably speeds up indexing - each time we commit, the server is
> >> > rebuilding a citation network of 80M edges)
> >> > 3) saving disk space and better OS caching (OS should be able to use more
> >> > RAM for the caching, which should result in faster operations - the two
> >> > processes are accessing the same index)
> >> >
> >> > Maybe I should just forget it and go with the replication, but it doesn't
> >> > 'feel right' IFF it is on the same physical machine. And Lucene
> >> > specifically has a method for discovering changes and re-opening the index
> >> > (DirectoryReader.openIfChanged)
> >> >
> >> > Am I not seeing something?
> >> >
> >> > roman
> >> >
> >> >
> >> >
> >> > On Tue, Jun 4, 2013 at 5:30 PM, Jason Hellman <
> >> > jhell...@innoventsolutions.com> wrote:
> >> >
> >> > > Roman,
> >> > >
> >> > > Could you be more specific as to why replication doesn't meet your
> >> > > requirements? It was geared explicitly for this purpose, including the
> >> > > automatic discovery of changes to the data on the index master.
> >> > >
> >> > > Jason
> >> > >
> >> > > On Jun 4, 2013, at 1:50 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> >> > >
> >> > > > OK, so I have verified the two instances can run alongside, sharing the
> >> > > > same datadir
> >> > > >
> >> > > > All update handlers are unaccessible in the read-only master
> >> > > >
> >> > > > <updateHandler class="solr.DirectUpdateHandler2"
> >> > > >     enable="${solr.can.write:true}">
> >> > > >
> >> > > > java -Dsolr.can.write=false .....
> >> > > >
> >> > > > And I can reload the index manually:
> >> > > >
> >> > > > curl "http://localhost:5005/solr/admin/cores?wt=json&action=RELOAD&core=collection1"
> >> > > >
> >> > > > But this is not an ideal solution; I'd like for the read-only server to
> >> > > > discover index changes on its own. Any pointers?
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > roman
> >> > > >
> >> > > >
> >> > > > On Tue, Jun 4, 2013 at 2:01 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> >> > > >
> >> > > >> Hello,
> >> > > >>
> >> > > >> I need your expert advice. I am thinking about running two instances of
> >> > > >> solr that share the same datadirectory. The *reason* being: indexing
> >> > > >> instance is constantly building cache after every commit (we have a big
> >> > > >> cache) and this slows it down. But indexing doesn't need much RAM, only the
> >> > > >> search does (and server has lots of CPUs)
> >> > > >>
> >> > > >> So, it is like having two solr instances
> >> > > >>
> >> > > >> 1. solr-indexing-master
> >> > > >> 2. solr-read-only-master
> >> > > >>
> >> > > >> In the solrconfig.xml I can disable update components, It should be fine.
> >> > > >> However, I don't know how to 'trigger' index re-opening on (2) after the
> >> > > >> commit happens on (1).
> >> > > >>
> >> > > >> Ideally, the second instance could monitor the disk and re-open disk after
> >> > > >> new files appear there. Do I have to implement custom IndexReaderFactory?
> >> > > >> Or something else?
> >> > > >>
> >> > > >> Please note: I know about the replication, this usecase is IMHO slightly
> >> > > >> different - in fact, write-only-master (1) is also a replication master
> >> > > >>
> >> > > >> Googling turned out only this
> >> > > >> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/71912 - no
> >> > > >> pointers there.
> >> > > >>
> >> > > >> But If I am approaching the problem wrongly, please don't hesitate to
> >> > > >> 're-educate' me :)
> >> > > >>
> >> > > >> Thanks!
> >> > > >>
> >> > > >> roman
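
For anyone landing on this thread later: below is a rough sketch of the "monitor the disk" idea from the original question, written in Python with only the standard library. It is not something anyone in the thread posted or tested. It assumes that each commit by the writer leaves a new segments_N file in the shared index directory (N in base 36), and it reuses the core RELOAD URL from Roman's curl command. The function names, the polling interval and the max_polls parameter are made up for illustration.

```python
import os
import re
import time
import urllib.parse
import urllib.request

# Assumption: Lucene commit points are files named segments_<generation>,
# with the generation written in base 36. segments.gen is ignored.
SEGMENTS_RE = re.compile(r"^segments_([0-9a-z]+)$")


def latest_generation(index_dir):
    """Return the highest commit generation found in index_dir, or -1 if none."""
    best = -1
    for name in os.listdir(index_dir):
        match = SEGMENTS_RE.match(name)
        if match:
            best = max(best, int(match.group(1), 36))
    return best


def reload_url(base_url, core):
    """Build the core RELOAD URL used in the thread's curl command."""
    query = urllib.parse.urlencode({"wt": "json", "action": "RELOAD", "core": core})
    return f"{base_url.rstrip('/')}/solr/admin/cores?{query}"


def reload_core(base_url="http://localhost:5005", core="collection1"):
    """Ask the read-only instance to reload its core; returns the HTTP status."""
    with urllib.request.urlopen(reload_url(base_url, core), timeout=10) as resp:
        return resp.status


def watch(index_dir, on_change, interval=5.0, max_polls=None):
    """Poll index_dir and call on_change(generation) whenever the generation changes.

    max_polls=None polls forever; a number stops after that many polls.
    """
    seen = latest_generation(index_dir)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = latest_generation(index_dir)
        if current != seen:
            seen = current
            on_change(current)
        polls += 1
```

Something like watch("/path/to/data/index", lambda gen: reload_core()) would then run next to the searcher (the path is a placeholder). Polling still triggers a full core reload, so it does not address Roman's concern about the cost of RELOAD; it only removes the need for the writer to know about the searcher.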