You can do a reload, yes, but a commit() is considerably faster.
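
For reference, an empty commit can be sent to the read-only instance over plain HTTP. Below is a minimal Java sketch, assuming the reader listens on http://localhost:5005/solr with a core named collection1 (both taken from the RELOAD URL quoted further down) and that its /update handler is still reachable. The class and method names are made up for illustration; this is not code from anyone's actual setup.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EmptyCommit {

    // Builds the URL for a commit that carries no documents.
    // baseUrl and core are assumptions; adjust them to your own instance.
    public static URI commitUri(String baseUrl, String core) {
        return URI.create(baseUrl + "/" + core + "/update?commit=true");
    }

    // Sends the request and returns the HTTP status code reported by Solr.
    public static int sendCommit(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) throws Exception {
        URI uri = commitUri("http://localhost:5005/solr", "collection1");
        System.out.println("GET " + uri + " -> HTTP " + sendCommit(uri));
    }
}
```

The same URL could be called from the writer's postCommit listener in place of the RELOAD call, if the reader accepts it.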

On Tue, Jul 2, 2013 at 10:35 PM, Michael Della Bitta <
michael.della.bi...@appinions.com> wrote:

> Wouldn't it be better to do a RELOAD?
>
> http://wiki.apache.org/solr/CoreAdmin#RELOAD
>
> Michael Della Bitta
>
> On Tue, Jul 2, 2013 at 5:05 PM, Peter Sturge <peter.stu...@gmail.com>
> wrote:
>
> > The RO instance commit isn't (or shouldn't be) doing any real writing, just
> > an empty commit to force new searchers, autowarm/refresh caches etc.
> > Admittedly, we do all this on 3.6, so 4.0 could have different behaviour in
> > this area.
> > As long as you don't have autocommit in solrconfig.xml, there wouldn't be
> > any commits 'behind the scenes' (we do all our commits via a local solrj
> > client so it can be fully managed).
> > The only caveat might be NRT/soft commits, but I'm not too familiar with
> > this in 4.0.
> > In any case, your RO instance must be getting updated somehow, otherwise
> > how would it know your write instance made any changes?
> > Perhaps your write instance notifies the RO instance externally from Solr?
> > (a perfectly valid approach, and one that would allow a 'single' lock to
> > work without contention)
> >
> >
> >
> > On Tue, Jul 2, 2013 at 7:59 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> >
> > > Interesting, we are running 4.0 - and Solr will refuse to start (or
> > > reload) the core. But from looking at the code, I don't see it doing
> > > any writing - but I should dig more...
> > >
> > > Are you sure it needs to do writing? Because I am not calling commits; in
> > > fact, I have deactivated *all* components that write into the index, so
> > > unless there is something deep inside that automatically calls the commit,
> > > it should never happen.
> > >
> > > roman
> > >
> > >
> > > On Tue, Jul 2, 2013 at 2:54 PM, Peter Sturge <peter.stu...@gmail.com>
> > > wrote:
> > >
> > > > Hmmm, single lock sounds dangerous. It probably works ok because you've
> > > > been [un]lucky.
> > > > For example, even with a RO instance, you still need to do a commit in
> > > > order to reload caches/changes from the other instance.
> > > > What happens if this commit gets called in the middle of the other
> > > > instance's commit? I've not tested this scenario, but it's very possible
> > > > that with a 'single' lock the results are indeterminate.
> > > > If the 'single' lock mechanism is making assumptions, e.g. that no other
> > > > process will interfere, and then one does, the Lucene index could very
> > > > well get corrupted.
> > > >
> > > > For the error you're seeing using 'native', we use native lockType for
> > > > both write and RO instances, and it works fine - no contention.
> > > > Which version of Solr are you using? Perhaps there's been a change in
> > > > behaviour?
> > > >
> > > > Peter
> > > >
> > > >
> > > > On Tue, Jul 2, 2013 at 7:30 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > >
> > > > > As I discovered, it is not good to use 'native' locktype in this
> > > > > scenario; actually, there is a note in the solrconfig.xml which says
> > > > > the same.
> > > > >
> > > > > When a core is reloaded and Solr tries to grab the lock, it will fail -
> > > > > even if the instance is configured to be read-only, so I am using
> > > > > 'single' lock for the readers and 'native' for the writer, which seems
> > > > > to work OK.
> > > > >
> > > > > roman
> > > > >
> > > > >
> > > > > On Fri, Jun 7, 2013 at 9:05 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > >
> > > > > > I have auto commit after 40k RECs/1800secs. I have only tested with
> > > > > > manual commit, but I don't see why it should work differently.
> > > > > > Roman
> > > > > > On 7 Jun 2013 20:52, "Tim Vaillancourt" <t...@elementspace.com> wrote:
> > > > > >
> > > > > >> If it makes you feel better, I also considered this approach when I
> > > > > >> was in the same situation with a separate indexer and searcher on one
> > > > > >> physical Linux machine.
> > > > > >>
> > > > > >> My main concern was "re-using" the FS cache between both instances - if
> > > > > >> I replicated to myself there would be two independent copies of the
> > > > > >> index, FS-cached separately.
> > > > > >>
> > > > > >> I like the suggestion of using autoCommit to reload the index. If I'm
> > > > > >> reading that right, you'd set an autoCommit on 'zero docs changing', or
> > > > > >> just 'every N seconds'? Did that work?
> > > > > >>
> > > > > >> Best of luck!
> > > > > >>
> > > > > >> Tim
> > > > > >>
> > > > > >>
> > > > > >> On 5 June 2013 10:19, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > > >>
> > > > > >> > So here it is, for the record, how I am solving it right now:
> > > > > >> >
> > > > > >> > Write-master is started with: -Dmontysolr.warming.enabled=false
> > > > > >> > -Dmontysolr.write.master=true -Dmontysolr.read.master=http://localhost:5005
> > > > > >> > Read-master is started with: -Dmontysolr.warming.enabled=true
> > > > > >> > -Dmontysolr.write.master=false
> > > > > >> >
> > > > > >> >
> > > > > >> > solrconfig.xml changes:
> > > > > >> >
> > > > > >> > 1. all index changing components have this bit,
> > > > > >> > enable="${montysolr.master:true}" - ie.
> > > > > >> >
> > > > > >> > <updateHandler class="solr.DirectUpdateHandler2"
> > > > > >> >                  enable="${montysolr.master:true}">
> > > > > >> >
> > > > > >> > 2. for cache warming de/activation
> > > > > >> >
> > > > > >> > <listener event="newSearcher"
> > > > > >> >       class="solr.QuerySenderListener"
> > > > > >> >       enable="${montysolr.enable.warming:true}">...
> > > > > >> >
> > > > > >> > 3. to trigger refresh of the read-only-master (from write-master):
> > > > > >> >
> > > > > >> >     <listener event="postCommit"
> > > > > >> >       class="solr.RunExecutableListener"
> > > > > >> >       enable="${montysolr.master:true}">
> > > > > >> >       <str name="exe">curl</str>
> > > > > >> >       <str name="dir">.</str>
> > > > > >> >       <bool name="wait">false</bool>
> > > > > >> >       <arr name="args"> <str>${montysolr.read.master:http://localhost}/solr/admin/cores?wt=json&amp;action=RELOAD&amp;core=collection1</str></arr>
> > > > > >> >     </listener>
> > > > > >> >
> > > > > >> > This works. I still don't like the reload of the whole core, but it
> > > > > >> > seems like the easiest thing to do now.
> > > > > >> >
> > > > > >> > -- roman
> > > > > >> >
> > > > > >> >
> > > > > >> > On Wed, Jun 5, 2013 at 12:07 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > > >> >
> > > > > >> > > Hi Peter,
> > > > > >> > >
> > > > > >> > > Thank you, I am glad to read that this use case is not alien.
> > > > > >> > >
> > > > > >> > > I'd like to make the second instance (searcher) completely read-only,
> > > > > >> > > so I have disabled all the components that can write.
> > > > > >> > >
> > > > > >> > > (being lazy ;)) I'll probably use
> > > > > >> > > http://wiki.apache.org/solr/CollectionDistribution to call the curl
> > > > > >> > > after commit, or write some IndexReaderFactory that checks for changes.
> > > > > >> > >
> > > > > >> > > The problem with calling the 'core reload' is that it seems like a lot
> > > > > >> > > of work for just opening a new searcher, eeekkk... Somewhere I read
> > > > > >> > > that it is cheap to reload a core, but re-opening the index searcher
> > > > > >> > > must definitely be cheaper...
> > > > > >> > >
> > > > > >> > > roman
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Wed, Jun 5, 2013 at 4:03 AM, Peter Sturge <peter.stu...@gmail.com> wrote:
> > > > > >> > >
> > > > > >> > >> Hi,
> > > > > >> > >> We use this very same scenario to great effect - 2 instances using
> > > > > >> > >> the same dataDir with many cores - 1 is a writer (no caching), the
> > > > > >> > >> other is a searcher (lots of caching).
> > > > > >> > >> To get the searcher to see the index changes from the writer, you
> > > > > >> > >> need the searcher to do an empty commit - i.e. you invoke a commit
> > > > > >> > >> with 0 documents.
> > > > > >> > >> This will refresh the caches (including autowarming), [re]build the
> > > > > >> > >> relevant searchers etc. and make any index changes visible to the RO
> > > > > >> > >> instance.
> > > > > >> > >> Also, make sure to use <lockType>native</lockType> in solrconfig.xml
> > > > > >> > >> to ensure the two instances don't try to commit at the same time.
> > > > > >> > >> There are several ways to trigger a commit:
> > > > > >> > >> Call commit() periodically within your own code.
> > > > > >> > >> Use autoCommit in solrconfig.xml.
> > > > > >> > >> Use an RPC/IPC mechanism between the 2 instance processes to tell the
> > > > > >> > >> searcher the index has changed, then call commit when called (more
> > > > > >> > >> complex coding, but good if the index changes on an ad-hoc basis).
> > > > > >> > >> Note, doing things this way isn't really suitable for an NRT
> > > > > >> > >> environment.
> > > > > >> > >>
> > > > > >> > >> HTH,
> > > > > >> > >> Peter
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> > >> On Tue, Jun 4, 2013 at 11:23 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > > >> > >>
> > > > > >> > >> > Replication is fine, I am going to use it, but I wanted it for
> > > > > >> > >> > instances *distributed* across several (physical) machines - but
> > > > > >> > >> > here I have one physical machine, and it has many cores. I want to
> > > > > >> > >> > run 2 instances of Solr because I think it has these benefits:
> > > > > >> > >> >
> > > > > >> > >> > 1) I can give less RAM to the writer (4GB), and use more RAM for the
> > > > > >> > >> > searcher (28GB)
> > > > > >> > >> > 2) I can deactivate warming for the writer and keep it for the
> > > > > >> > >> > searcher (this considerably speeds up indexing - each time we
> > > > > >> > >> > commit, the server is rebuilding a citation network of 80M edges)
> > > > > >> > >> > 3) saving disk space and better OS caching (OS should be able to use
> > > > > >> > >> > more RAM for the caching, which should result in faster operations -
> > > > > >> > >> > the two processes are accessing the same index)
> > > > > >> > >> >
> > > > > >> > >> > Maybe I should just forget it and go with the replication, but it
> > > > > >> > >> > doesn't 'feel right' IFF it is on the same physical machine. And
> > > > > >> > >> > Lucene specifically has a method for discovering changes and
> > > > > >> > >> > re-opening the index (DirectoryReader.openIfChanged)
> > > > > >> > >> >
> > > > > >> > >> > Am I not seeing something?
> > > > > >> > >> >
> > > > > >> > >> > roman
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> > On Tue, Jun 4, 2013 at 5:30 PM, Jason Hellman <
> > > > > >> > >> > jhell...@innoventsolutions.com> wrote:
> > > > > >> > >> >
> > > > > >> > >> > > Roman,
> > > > > >> > >> > >
> > > > > >> > >> > > Could you be more specific as to why replication doesn't meet your
> > > > > >> > >> > > requirements?  It was geared explicitly for this purpose, including
> > > > > >> > >> > > the automatic discovery of changes to the data on the index master.
> > > > > >> > >> > >
> > > > > >> > >> > > Jason
> > > > > >> > >> > >
> > > > > >> > >> > > On Jun 4, 2013, at 1:50 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > > >> > >> > >
> > > > > >> > >> > > > OK, so I have verified the two instances can run alongside each
> > > > > >> > >> > > > other, sharing the same datadir.
> > > > > >> > >> > > >
> > > > > >> > >> > > > All update handlers are inaccessible in the read-only master:
> > > > > >> > >> > > >
> > > > > >> > >> > > > <updateHandler class="solr.DirectUpdateHandler2"
> > > > > >> > >> > > >                 enable="${solr.can.write:true}">
> > > > > >> > >> > > >
> > > > > >> > >> > > > java -Dsolr.can.write=false .....
> > > > > >> > >> > > >
> > > > > >> > >> > > > And I can reload the index manually:
> > > > > >> > >> > > >
> > > > > >> > >> > > > curl "http://localhost:5005/solr/admin/cores?wt=json&action=RELOAD&core=collection1"
> > > > > >> > >> > > >
> > > > > >> > >> > > > But this is not an ideal solution; I'd like for the read-only
> > > > > >> > >> > > > server to discover index changes on its own. Any pointers?
> > > > > >> > >> > > >
> > > > > >> > >> > > > Thanks,
> > > > > >> > >> > > >
> > > > > >> > >> > > >  roman
> > > > > >> > >> > > >
> > > > > >> > >> > > >
> > > > > >> > >> > > > On Tue, Jun 4, 2013 at 2:01 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> > > > > >> > >> > > >
> > > > > >> > >> > > >> Hello,
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> I need your expert advice. I am thinking about running two
> > > > > >> > >> > > >> instances of Solr that share the same data directory. The *reason*
> > > > > >> > >> > > >> being: the indexing instance is constantly rebuilding its cache
> > > > > >> > >> > > >> after every commit (we have a big cache) and this slows it down.
> > > > > >> > >> > > >> But indexing doesn't need much RAM, only the search does (and the
> > > > > >> > >> > > >> server has lots of CPUs).
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> So, it is like having two solr instances
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> 1. solr-indexing-master
> > > > > >> > >> > > >> 2. solr-read-only-master
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> In the solrconfig.xml I can disable update components; it should
> > > > > >> > >> > > >> be fine. However, I don't know how to 'trigger' index re-opening
> > > > > >> > >> > > >> on (2) after the commit happens on (1).
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Ideally, the second instance could monitor the disk and re-open
> > > > > >> > >> > > >> the index after new files appear there. Do I have to implement a
> > > > > >> > >> > > >> custom IndexReaderFactory? Or something else?
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Please note: I know about the replication, this use case is IMHO
> > > > > >> > >> > > >> slightly different - in fact, write-only-master (1) is also a
> > > > > >> > >> > > >> replication master.
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Googling turned up only this:
> > > > > >> > >> > > >> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/71912 -
> > > > > >> > >> > > >> no pointers there.
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> But if I am approaching the problem wrongly, please don't hesitate
> > > > > >> > >> > > >> to 're-educate' me :)
> > > > > >> > >> > > >>
> > > > > >> > >> > > >> Thanks!
> > > > > >> > >> > > >>
> > > > > >> > >> > > >>  roman
> > > > > >> > >> > > >>
> > > > > >> > >> > >
> > > > > >> > >> > >
> > > > > >> > >> >
> > > > > >> > >>
> > > > > >> > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
