Why not just use the replication API and fetchindex? See:
https://cwiki.apache.org/confluence/display/solr/Index+Replication#IndexReplication-HTTPAPICommandsfortheReplicationHandler

It's not entirely obvious from the writeup, but you can specify
masterUrl as part of the command, e.g. &masterUrl=some_other_solr_core.

So you have your "live" core, and your just-indexed core, call it "new".

You issue the fetchindex command to the core on the live server, with
masterUrl pointing to "new". It'll look something like
http://liveserver:8983/solr/live_core/replication?command=fetchindex&masterUrl=http://newserver:8983/solr/new_core

Solr will
1> copy the index from new to live
2> once that's done, open a new searcher, so you're searching the new
documents (after any autowarming you've configured)
3> delete the old index

Note that until <2>, incoming searches are served by the old index. So
the user sees no service interruptions at all.

If for any reason the fetch fails, the old index is left intact.

This does require enough disk space to hold both the old and new index
on the live server temporarily, and some extra memory may be consumed
while the old searcher is still open and the new one is autowarming.
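
For reference, here is the whole sequence in curl form (hosts and core
names as in the example above; the details command is one way to watch
the fetch progress):

  curl "http://liveserver:8983/solr/live_core/replication?command=fetchindex&masterUrl=http://newserver:8983/solr/new_core"
  curl "http://liveserver:8983/solr/live_core/replication?command=details"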

Best,
Erick

On Wed, Jun 14, 2017 at 4:10 PM, Mike Lissner
<mliss...@michaeljaylissner.com> wrote:
> I figured Solr would have a native system built in, but since we don't use
> it already, I didn't want to learn all of its ins and outs just for this
> disk situation.
>
> The same, essentially, applies to the swapping strategy. We don't have a Solr
> expert, just me, a generalist, and sorting out these kinds of things can
> take a while. The hope was to avoid that kind of complication with some
> clever use of symlinks and minor downtime. Our front end has a retry
> mechanism, so if Solr is down for less than a minute, users will just see
> delayed responses, which is fine.
>
> The new strategy is to rsync the files while Solr is live, stop Solr, do an
> rsync diff, then start Solr again. That'll give a bit-for-bit copy with
> very little downtime; it's the strategy Postgres recommends for disk-based
> backups, so it seems like a safer bet. We needed a reindex anyway due to
> schema changes, which my first attempt included, but I guess that'll have
> to wait.
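>
> (A sketch of that sequence, assuming Solr runs under systemd and using the
> paths from the original plan; adjust to your setup:)
>
>   rsync -a /old/solr/ /new/solr/   # bulk copy while Solr is live
>   sudo systemctl stop solr         # assumption: systemd-managed Solr
>   rsync -a /old/solr/ /new/solr/   # second pass copies only the diff
>   sudo systemctl start solr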
>
> Thanks for the replies. If anybody can explain why the first strategy
> failed, I'd still be interested in learning.
>
> Mike
>
> On Wed, Jun 14, 2017 at 12:09 PM Chris Ulicny <culicny@iq.media> wrote:
>
>> Are you physically swapping the disks to introduce the new index? Or having
>> both disks mounted at the same time?
>>
>> If the disks are simultaneously available, can you just swap the cores and
>> then delete the core on the old disk?
>>
>> https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-SWAP
>>
>> We periodically move cores to different drives using Solr's replication
>> functionality and core swapping (after stopping replication). However, I've
>> never encountered Solr deleting an index like that.
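>>
>> (In curl form, that workflow is roughly the following; the core names are
>> hypothetical, and disablepoll stops the slave core from polling its master
>> before the swap:)
>>
>>   curl "http://localhost:8983/solr/new_core/replication?command=disablepoll"
>>   curl "http://localhost:8983/solr/admin/cores?action=SWAP&core=live_core&other=new_core"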
>>
>>
>>
>> On Wed, Jun 14, 2017 at 2:48 PM David Hastings <
>> hastings.recurs...@gmail.com>
>> wrote:
>>
>> > I don't have an answer to why the folder got cleared; however, I am
>> > wondering why you aren't using basic replication to do this exact same
>> > thing, since Solr will natively take care of all this for you, with no
>> > interruption to the user and no stop/start routines, etc.
>> >
>> > On Wed, Jun 14, 2017 at 2:26 PM, Mike Lissner <
>> > mliss...@michaeljaylissner.com> wrote:
>> >
>> > > We are replacing a drive mounted at /old with one mounted at /new. Our
>> > > index currently lives on /old, and our plan was to:
>> > >
>> > > 1. Create a new index on /new
>> > > 2. Reindex from our database so that the new index on /new is properly
>> > > populated.
>> > > 3. Stop Solr.
>> > > 4. Symlink /old to /new, as sketched below (Solr now looks for the index
>> > > at /old/solr, which resolves to /new/solr)
>> > > 5. Start Solr.
>> > > 6. (Later) Stop Solr, swap the drives (old for new), and start Solr.
>> > > (Solr now looks for the index at /old/solr again, and finds it there.)
>> > > 7. Delete the index pointing to /new created in step 1.
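>> > >
>> > > (Step 4 as a sketch; this assumes /old is a mount point that has to be
>> > > unmounted before the symlink can take its place:)
>> > >
>> > >   umount /old          # with Solr stopped
>> > >   ln -s /new /old      # /old/solr now resolves to /new/solr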
>> > >
>> > > The idea was that this would create a new index for Solr, would populate
>> > > it with the right content, and would avoid having to touch our existing
>> > > Solr configurations aside from creating one new index, which we could
>> > > soon delete.
>> > >
>> > > I just did steps 1-5, but I got null pointer exceptions when starting
>> > > Solr, and it appears that the index on /new has been almost completely
>> > > deleted by Solr (this is a bummer, since it takes days to populate).
>> > >
>> > > Is this expected? Am I terribly crazy to try to swap indexes on disk?
>> > > As far as I know, the only difference between the indexes is their name.
>> > >
>> > > We're using Solr version 4.10.4.
>> > >
>> > > Thank you,
>> > >
>> > > Mike
>> > >
>> >
>>
