On 2/4/08, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Feb 4, 2008 2:20 PM, Rachel McConnell <[EMAIL PROTECTED]> wrote:
> > > If you are running snapshooter asynchronously, this would be the cause.
> > > It's designed to be run from solr (via a postCommit or postOptimize
> > > hook) at specific points where a consistent view of the index is
> > > available.
> >
> > So our cron job might be running DURING an update, for example, and
> > get duplicate values that way?
>
> Right.  Duplicates are removed on a commit(), so if a snapshot is
> being taken at any other time than right after a commit, those deletes
> will not have been performed.

I've reviewed the wiki pages about snappuller
(http://wiki.apache.org/solr/SolrCollectionDistributionScripts) and
solrconfig.xml (http://wiki.apache.org/solr/SolrConfigXml) and it
seems that the snappuller is intended to be used on the slave server.
In our case, the slave servers do no updating and never commit; the
master is the only one that commits.  Is there a standard way for the
just-committed, consistent index to be pushed from the master server
out to the slaves?

In fact I don't see how this is supposed to work in any environment
where the master and slave Solr servers are on different physical
machines.  The postCommit handler should run after a commit, which
only happens on the master server; yet it runs snappuller, which should
run on a slave.  I am probably missing something here.  Is there any
more documentation you can point me to?

Rachel
